The present application is related to the U.S. applications, “METHOD OF DETERMINING A MAXIMAL MESH”, Natarajan et al., Ser. No. 10/354,991, “METHOD OF DETERMINING A MESH IN A COMPUTER NETWORK”, Walker et al., Ser. No. 10/355,062, and “METHOD OF STORING DATA CONCERNING A COMPUTER NETWORK”, Ho et al., Ser. No. 10/355,118. Each of these applications was filed on the same day as the present application and is incorporated herein by reference.
Computer network nodes are often arranged in redundant arrangements to provide reliability. These redundant arrangements can make the analysis of the computer networks more difficult.
When failures occur in the computer network hardware, parts of the network can become inaccessible to other parts of the network, causing “down time”. To address the “down time” network management software applications can be deployed to help anticipate failures and find a root cause of any failures which occur. An example of a network management software application is described in Walker et al., U.S. Pat. No. 6,061,723 “Network Management Event Correlation in Environments Containing Inoperative Networking Elements”, incorporated herein by reference.
In accordance with exemplary embodiments, a method of indicating a path in a computer network is provided wherein a background process is executed which dynamically determines additional nodes in a path from a first node to a second node. The additional nodes include a routing node. The background process dynamically examines a routing table in the routing node to determine changes in the path. Information is provided concerning the path, the information including an indication of the additional nodes.
A method of identifying a point of primary failure among nodes of a computer network includes storing information regarding plural paths between a first node and second node of the network. The information indicates a third node of the network on one of the plural paths. The second and third nodes are polled to determine their accessibility by the first node. The stored information and the accessibility of the second and third nodes are used to determine a point of primary failure.
An exemplary management computer can comprise a processor configured to execute a background process which dynamically determines additional nodes in a path from a first node to a second node. The additional nodes include a routing node. The background process dynamically examines a routing table in the routing node to determine changes in the path. The management computer includes a memory configured to store information concerning the path, the information including indications of the additional nodes.
The accompanying drawings provide visual representations which will be used to more fully describe the representative embodiments disclosed herein and can be used by those skilled in the art to better understand them and their inherent advantages. In these drawings, like reference numerals identify corresponding elements and:
In step 102, a background process is executed which dynamically determines additional nodes in a path from a first node to a second node. As referenced herein, the term “dynamically” means that nodes are determined at any time during execution of the background process (e.g., periodically or at random intervals during the execution of the background process). The background process can be a daemon process such as those used with UNIX systems. The nodes can include end nodes; routing nodes, such as routers that operate using Internet Protocol (IP) addresses; and non-routing nodes, such as switches that operate using link level addresses. As used herein, a “path” is a possible communication link between the first node and the second node. The paths can include active path segments allowed by the current spanning tree, and/or path segments allowed by the network topology but not allowed by the current spanning tree. The background process can be a path server that provides path information to client processes.
The additional nodes can include a routing node. The background process dynamically examines a routing table in the routing node to determine changes in the path. As used herein, “routing table” refers to a table used by a routing node to route packets in a network and/or stored routing information maintained for a management protocol or for another purpose.
In step 104, information is provided concerning the path. The information includes an indication of the additional nodes.
In an exemplary embodiment, the background process is configured to use a network socket to provide information to another process. A network socket is one way of transferring information between the background process and another process.
In the example of
The routing table can be accessed using a network management protocol. For example, the Simple Network Management Protocol (SNMP) can be used to access Management Information Base (MIB) data that indicates the contents of the routing table at the routing node. In the example of
In an exemplary embodiment, the information provided concerning the path identifies interfaces of the additional nodes. In
The path information can include indications of any non-routing nodes in the path. In the above example, the indicated non-routing nodes of the path are nodes SC, SF, SG, and SI.
The non-routing nodes can be determined from a stored topology data. In the example of
The path information can indicate plural paths from the first node to the second node. For example, in
In an exemplary embodiment, the system indicates whether there is any mesh included in the path. In the example of
The path information can be provided to another process that polls the nodes to determine accessibility. In the example of
In one embodiment, the process uses the path information to identify a point of primary failure. The information concerning the path helps to determine the point of primary failure.
An exemplary Background Process Flow is as follows:
Background Process Flow
In this example, the process initialization step involves integrating the process with the system. For example, when the background process is to be used with the Hewlett-Packard OpenView environment, integration can be performed at the initialization of the background process, for starting and stopping the background process and for obtaining status information from the background process during operation.
In a next step, the process waits for the network topology to be determined by the topology unit.
In the outer loop of the background process flow, the background process 206 creates a listen socket. A listen port socket can be created at port 3209. The path engine background process waits for the client to connect with this listen port. In the example of
In the inner loop of the path engine background process, if there is a new node discovery, a list of the routing nodes and end nodes in the computer network is obtained from the topology unit. For each routing node or end node in the list obtained from the topology unit, the system loops, computes the composite path, and sends this information to the client. If the socket is bad, the system breaks to the outer loop and creates a new listen socket. In one example, the system can sleep for any desired time interval(s) (e.g., five seconds for the first time through the loop and twenty seconds the next time through the loop). If there is a new discovery, the node cache is cleared. The node cache stores path segment information of non-routing nodes.
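The outer and inner loops described above can be sketched as follows. Only the listen port (3209) comes from the source; the function names and structure are assumptions, not the actual path-engine implementation.

```python
# Sketch of the background-process loops. Only port 3209 comes from the
# source; everything else here is an illustrative assumption.
import socket

LISTEN_PORT = 3209

def make_listen_socket(port=LISTEN_PORT):
    """Outer loop: create the listen socket the client connects to."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("", port))
    s.listen(1)
    return s

def serve_paths(topology_nodes, compute_composite_path, send):
    """Inner loop: for each routing or end node obtained from the
    topology unit, compute the composite path and send it to the
    connected client. Returns the number of paths sent."""
    sent = 0
    for node in topology_nodes:
        send(compute_composite_path(node))
        sent += 1
    return sent
```

In a real deployment `send` would write to the client socket; here it is parameterized so the dispatch logic can be exercised without a live connection.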
An example of a Compute Composite Path procedure for determining a composite path is as follows:
Compute Composite Path
In this example, the Compute Composite Path procedure first finds an active route indicating the routing and end nodes from the first node to the second node. In an exemplary embodiment, the active route can be a route from the first node to the second node via a particular routing node or nodes (e.g., a route from first node MA to second node EJ via routing nodes RB and RH). For each pair of successive routing and/or end nodes in the route, non-routing nodes in a path between the nodes can be obtained from the node cache.
In an exemplary embodiment, the Compute Composite Path procedure is not called to identify paths to switching nodes. However, such a feature can, if desired, be performed using a management IP address of a switching node. If the node cache is empty or cleared, the non-routing nodes can be determined from stored topology data. Non-routing nodes which are found can be loaded into the node cache.
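A minimal sketch of the Compute Composite Path procedure follows: find the active route of routing/end nodes, then fill in the non-routing nodes between each successive pair, taking them from the node cache when it is populated and from the topology data otherwise. The function signatures and the cache representation are assumptions.

```python
# Hedged sketch of Compute Composite Path. The helpers passed in stand
# for the Find Active Route and Determine Non-Routing Nodes procedures;
# node_cache maps a (node, node) pair to the non-routing nodes between
# them. All names here are illustrative assumptions.

def compute_composite_path(first, second, find_active_route,
                           determine_non_routing, node_cache):
    route = find_active_route(first, second)   # e.g. [MA, RB, RH, EJ]
    composite = [route[0]]
    for a, b in zip(route, route[1:]):
        segment = node_cache.get((a, b))
        if segment is None:                    # cache empty or cleared
            segment = determine_non_routing(a, b)
            node_cache[(a, b)] = segment       # load found nodes into cache
        composite.extend(segment)              # non-routing nodes between a, b
        composite.append(b)
    return composite
```

With the example nodes from this description (route MA, RB, RH, EJ and non-routing nodes SC, SF, SG, SI), the composite path interleaves the switches between the routing hops.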
An example of a Find Active Route procedure to identify the routing nodes in a given route to an end node is as follows:
Find Active Route
This exemplary Find Active Route procedure operates by setting the current IP address to the first node (management computer) IP address. Until the next node is the second node, a routing table at the current IP address is checked using a network management protocol, such as the Simple Network Management Protocol.
In one embodiment, the routing table is checked by examining a number of different possible subnets stored in the routing table. In one example, a shifting subnet mask can be used to determine the most specific IP address match in the routing table. An exemplary examination of the routing table using subnets is also discussed in the U.S. patent application Natarajan et al., “Method of Computing a Path Between Two Nodes in a Network”, Ser. No. 10/154,912, filed May 28, 2002, incorporated herein by reference. In one example, a 32 bit subNetMask (e.g., 32 “1's”) is first used and it is checked to see if a next hop entry exists in the routing table. If not, the mask is left shifted and comparisons are made until a next hop entry is found. When a specific NextHop is found (e.g., the most specific NextHop entry in the table), the procedure checks the routing table at the NextHop.
The IP address of the next routing node or end node (next hop) in the path to the second node is determined from the routing table. The next hop is added to the active route list, and the current IP address is set to the next hop IP address. The loop is continued until the next hop is the second node. When the next hop is the second node, the active route list is returned to the Compute Composite Path.
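The Find Active Route loop, including the shifting-subnet-mask lookup described above, can be sketched as follows. `get_routing_table` stands in for an SNMP query of the node's routing MIB and, along with the table representation (keyed by destination subnet and mask), is an assumption rather than an API from the source.

```python
# Sketch of Find Active Route with the shifting 32-bit subnet mask.
# Table representation and function names are illustrative assumptions.

def ip_to_int(ip):
    a, b, c, d = (int(x) for x in ip.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def most_specific_next_hop(table, dest_ip):
    """Start with a 32-bit mask of all 1's; left-shift until a
    next-hop entry matches, yielding the most specific match."""
    dest = ip_to_int(dest_ip)
    mask = 0xFFFFFFFF
    while mask:
        hop = table.get((dest & mask, mask))
        if hop is not None:               # most specific NextHop found
            return hop
        mask = (mask << 1) & 0xFFFFFFFF   # left-shift the mask and retry
    return table.get((0, 0))              # default route, if any

def find_active_route(first_ip, second_ip, get_routing_table):
    """Follow next hops from the first node until the second node."""
    current, route = first_ip, []
    while current != second_ip:
        table = get_routing_table(current)        # e.g. via SNMP
        current = most_specific_next_hop(table, second_ip)
        route.append(current)                     # add next hop to route
    return route
```

Each iteration checks the routing table at the current address and moves to the returned next hop, mirroring the loop in the description.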
An example of a Determine Non-Routing Nodes procedure (e.g., to find switching nodes between each pair of successive routing nodes and/or end nodes) is as follows:
Determine Non-Routing Nodes
In this example, the Determine Non-Routing Nodes procedure receives a pair of routing nodes or a routing node and an end node. The first node of the pair is set as the newly-found node. Until the second node of the pair is found, the process loops. The newly found nodes are set as the last-found nodes. The topology database is checked to find nodes connected to the last-found nodes. The connected nodes that are not on a visited node list are set as the newly-found nodes, and the partial path segment for each newly-found node is determined. The partial path segment is a path segment from the first node of the pair to the newly-found node. When the second node of the pair is a newly-found node, a partial path segment history for the second node of the pair is set as the path segment history.
The path segment history is then checked to determine alternate paths. In one embodiment, this alternate path information is mesh information that describes meshes of nodes within the computer network. Alternate path indications can be inserted into the path segment history. In one embodiment, indications of a mesh are also inserted into the path segment history. The path segment history is then returned to the Compute Composite Path procedure.
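The search for non-routing nodes between a pair of routing/end nodes can be sketched as a breadth-first traversal of the topology, tracking a visited-node list and a partial path segment for each newly-found node. The topology representation (node to neighbor list) and the function name are assumptions based on the description.

```python
# Sketch of Determine Non-Routing Nodes: breadth-first search from the
# first node of the pair until the second node is found. `topology`
# maps each node to its connected nodes; names are assumptions.
from collections import deque

def determine_non_routing(pair_first, pair_second, topology):
    visited = {pair_first}
    # partial path segment: nodes from pair_first to each newly-found node
    segments = {pair_first: []}
    frontier = deque([pair_first])
    while frontier:
        last_found = frontier.popleft()
        for neighbor in topology.get(last_found, ()):
            if neighbor in visited:            # skip visited-list nodes
                continue
            visited.add(neighbor)
            segments[neighbor] = segments[last_found] + [last_found]
            if neighbor == pair_second:        # second node of the pair found
                return segments[neighbor][1:]  # drop pair_first itself
            frontier.append(neighbor)
    return None                                # no path in the topology
```

The returned segment contains only the intermediate non-routing nodes, which is what the composite path interleaves between routing hops; alternate-path and mesh annotation of the segment history is omitted from this sketch.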
The non-routing nodes can be determined by checking a topology database that indicates the interconnections between the non-routing nodes. That is, the interconnections between the different switching, but non-routing, nodes in the computer network can be stored in a database. As shown in
The determination of the non-routing nodes in one embodiment uses a pair of routing nodes. Starting from the first node of the pair, nodes connected to that first node are found. The procedure can then go to find the nodes that are connected to nodes that are connected to the start node. The process then continues until a first path between the first and second nodes of the pair is found. In one embodiment, when the path can be found, this path is used to determine stored alternate paths or meshes.
In the
The system computes a composite route to the routing nodes. In one example, when the composite path to EJ is produced, the background process 206, using the IP address of the node EJ, checks the routing table at the management node MA. This indicates the routing node RB. The routing table of RB indicates the second routing node RH. The routing table at the routing node RH indicates the node EJ. The determination of the next routing node in the path can be achieved using management queries from the background process 206 to the routing tables. Dynamic changes to the routing tables can be found using the background process 206. For example, if the system were to change to route through routing nodes RM and RN to get to the node EJ, these changes would be found by the queries produced by the background process 206.
Along the route between the pair of routing nodes RB and RH, the non-routing nodes are determined. These non-routing nodes can be determined by checking the topology unit 204. In the example of
The network monitor 208 can be located at the management computer MA. The background process 206 can be located at the management computer MA or at another node in the computer network or a computer that can access the computer network. Topology unit 204 can be located at the management computer or at another location.
In one example, the background process 206 is the path engine used in the Hewlett-Packard OpenView Network Node Manager (NNM) available from Hewlett-Packard of Palo Alto, CA. The background process 206 can be a path engine that interacts with the network monitor 208. The network monitor 208 can include its own topology store, an event correlator system and an event configurator system. The topology unit 204 can include a host NNM unit, a discovery process (DISCO), a combination element (RENDEVOUS), a topology database (OVET_MODEL), as well as the mesh discovery unit. The mesh discovery is described in the U.S. patent applications “METHOD OF DETERMINING A MAXIMAL MESH” (Natarajan et al., Ser. No. 10/354,991) and “METHOD OF DETERMINING A MESH IN A COMPUTER NETWORK” (Walker et al., Ser. No. 10/355,062), which are incorporated herein by reference. Interconnections between the network monitor 208 and the topology unit 204 can be performed using a translator (e.g., a bridge) for consistent labeling of the nodes and interfaces in the system.
Since the background process 206 can do active routing table queries to the routing nodes, the system will correctly operate with VLAN systems which use a special VLAN switch card, such as a Cisco RSM card, placed within a switch. The VLAN switch card includes a router which has a single port connection for all of the VLANs. Since the background process 206 determines each of the routing nodes, the VLAN switch card can be found as a routing node. The remainder of the switch can be found as a non-routing node.
In one embodiment, the information transferred between the background process 206 and the network monitor 208 can be extensible mark-up language blocks (XML blocks). In one embodiment, the indications of a plural path can be indicated by XML tags. For example, XML tags can be used to indicate a mesh. In one embodiment, all of the interface connections of the mesh are produced in between the XML tags indicating the mesh. Thus, once the path is determined, segments which are part of a mesh can be replaced by the mesh indication.
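As an illustration of the XML transfer described above, the following sketch builds a path block in which a mesh's interface connections are produced between mesh tags. The tag names (`path`, `mesh`, `connection`) are hypothetical, since the source does not give the actual schema.

```python
# Illustrative construction of an XML path block with mesh tags.
# Tag and attribute names are assumptions, not the actual schema.
import xml.etree.ElementTree as ET

def path_to_xml(segments, mesh_connections):
    """segments: (from, to) pairs on the path; mesh_connections:
    all interface connections belonging to a mesh in the path."""
    path = ET.Element("path")
    for a, b in segments:
        conn = ET.SubElement(path, "connection")
        conn.set("from", a)
        conn.set("to", b)
    mesh = ET.SubElement(path, "mesh")       # plural paths indicated by mesh tags
    for a, b in mesh_connections:            # connections produced between the tags
        conn = ET.SubElement(mesh, "connection")
        conn.set("from", a)
        conn.set("to", b)
    return ET.tostring(path, encoding="unicode")
```

Once the path is determined, segments which are part of a mesh would be replaced by such a mesh element rather than listed as ordinary connections.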
The information can be stored temporarily or permanently. The memory can store the information for transfer to another process operating at the computer or another location. The memory can be a buffer which stores the information before the information's use by another process. The memory can store the information for a network socket.
In step 402, information is stored regarding plural paths between a first node and a second node of the network. The information can indicate a third node of the network on one of the plural paths. Paths allowed by the network topology can be considered paths whether or not they are allowed by a current spanning tree. In step 404, the second and third nodes are polled to determine their accessibility by the first node. In step 406, the stored information and the accessibility of the second and third nodes are used to determine a point of primary failure.
In the example of
In an exemplary embodiment, the information indicates plural paths by indicating a mesh. In the example of
In an exemplary embodiment, the information indicates interfaces of the nodes. The information can include indications of any non-routing nodes in the plural paths between the first and second nodes.
Additional information of a path between the first node and the fourth node can be stored. In the example of
An example of a Determine Primary Failure procedure is as follows:
Determine Primary Failure
In this example, all of the interfaces in the computer network are polled to determine the accessible and inaccessible nodes from the first node (MA). If any interfaces are inaccessible, each inaccessible node is checked to see whether there is a path through accessible interfaces in the critical route (the path or paths between the first and second nodes) to the current inaccessible node. If so, the failure is a primary failure. If there is no path through accessible interfaces in the critical route to the current inaccessible node, the failure is a secondary failure.
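The primary/secondary classification just described can be sketched as follows for a single critical route, where an inaccessible interface is primary if the critical route reaches it entirely through accessible interfaces. The route representation and function name are assumptions.

```python
# Sketch of Determine Primary Failure for one critical route, ordered
# from the first node toward the second node. Names are assumptions.

def classify_failures(critical_route, accessible):
    """critical_route: ordered interfaces toward the second node;
    accessible: set of interfaces that answered polling.
    Returns {interface: "primary" | "secondary"} for failures."""
    failures = {}
    for i, iface in enumerate(critical_route):
        if iface in accessible:
            continue
        # Primary if every interface before this one is still reachable.
        upstream_ok = all(up in accessible for up in critical_route[:i])
        failures[iface] = "primary" if upstream_ok else "secondary"
    return failures
```

In the example below, a failure at interface 3 of switch SC with accessible interfaces up to interface 1 of SC would be classified as primary, and interfaces beyond it as secondary.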
In the example of
Until the spanning tree is changed, the failure at interface 3 of the switch SC is a primary failure since neither node EJ nor any other interface in the path to node EJ is accessible from MA, while interfaces 1 and 2 of RB as well as interface 1 of SC are accessible.
The failure at interface 3 of SC can be determined by the spanning tree algorithm. The spanning tree algorithm can then produce a modified spanning tree, such as the spanning tree that connects SF to SD and then connects SD to SC, rather than directly connecting between SC to SF. When the spanning tree reroutes, the failure at interface 3 of SC is not a primary failure since additional interfaces in the path toward the second node EJ are now accessible. For example, interface 2 of SC, interfaces 1 and 3 of SD, interfaces 1 and 4 of SF, and interfaces 1 and 2 of SG are now accessible. Thus, interface 3 of SC, once the spanning tree is rearranged, is not indicated as being a primary failure. Additionally, since the information concerns multiple paths, the different arrangements of the spanning tree can be anticipated. For example, a report of a primary failure at interface 3 of SC can be delayed since it is possible that the spanning tree algorithm will reconfigure around the interface 3 of switch SC.
The different paths indicated by the information may not all be active at one time. Under the spanning tree algorithm, only one path segment can be active; the other paths will be inactive but can be made active by a modification of the spanning tree.
In the example of
It will be appreciated by those of ordinary skill in the art that the invention can be implemented in other specific forms without departing from the spirit or character thereof. The presently disclosed embodiments are therefore considered in all respects to be illustrative and not restrictive. The scope of the invention is illustrated by the appended claims rather than the foregoing description, and all changes that come within the meaning and range of equivalents thereof are intended to be embraced herein.
Number | Name | Date | Kind |
---|---|---|---|
5435003 | Chng et al. | Jul 1995 | A |
5956339 | Harada et al. | Sep 1999 | A |
5983350 | Minear et al. | Nov 1999 | A |
6003090 | Puranik et al. | Dec 1999 | A |
6061723 | Walker et al. | May 2000 | A |
6072866 | Conan | Jun 2000 | A |
6201794 | Stewart et al. | Mar 2001 | B1 |
6298053 | Flammer et al. | Oct 2001 | B1 |
6470389 | Chung et al. | Oct 2002 | B1 |
6747957 | Pithawala et al. | Jun 2004 | B1 |
6804712 | Kracht | Oct 2004 | B1 |
6813634 | Ahmed | Nov 2004 | B1 |
6836463 | Garcia-Luna-Aceves et al. | Dec 2004 | B2 |
6847614 | Banker et al. | Jan 2005 | B2 |
6934249 | Bertin et al. | Aug 2005 | B1 |
6977908 | de Azevedo et al. | Dec 2005 | B2 |
6980548 | Zaccone et al. | Dec 2005 | B1 |
7013345 | Brown et al. | Mar 2006 | B1 |
7185045 | Ellis et al. | Feb 2007 | B2 |
20020004843 | Andersson et al. | Jan 2002 | A1 |
20020015386 | Kajiwara | Feb 2002 | A1 |
20020018449 | Ricciulli | Feb 2002 | A1 |
20020091857 | Conrad et al. | Jul 2002 | A1 |
20020143905 | Govindarajan et al. | Oct 2002 | A1 |
20020165981 | Basturk et al. | Nov 2002 | A1 |
20020181402 | Lemoff et al. | Dec 2002 | A1 |
20030041138 | Kampe et al. | Feb 2003 | A1 |
20030058789 | Sugawara et al. | Mar 2003 | A1 |
20030072271 | Simmons et al. | Apr 2003 | A1 |
20030163528 | Banerjee et al. | Aug 2003 | A1 |
20030189920 | Erami et al. | Oct 2003 | A1 |
20030191829 | Masters et al. | Oct 2003 | A1 |
20040030924 | Griswold | Feb 2004 | A1 |
20040042402 | Galand et al. | Mar 2004 | A1 |
20040081166 | Stanforth et al. | Apr 2004 | A1 |
20040151121 | Natarajan et al. | Aug 2004 | A1 |
20040153568 | Ho et al. | Aug 2004 | A1 |
20040156321 | Walker et al. | Aug 2004 | A1 |
Entry |
---|
Apostolopoulos et al. “On the Effectiveness of Path Pre-Computation in Reducing the Processing Cost of On-Demand QoS Path Computation”; IEEE; 1998. |
Number | Date | Country | |
---|---|---|---|
20040153572 A1 | Aug 2004 | US |