The discussion below describes a technique for monitoring the data ports of the switches in a switch cluster of a network, and taking certain actions if a port of any of the switches is not operating properly. A monitoring entity program is provided in the RADIUS/AAA server 34 and a control entity program is provided in the switches 16-22. As discussed above, periodic or interim accounting messages are transmitted from the switches 16-22 to the server 34 for documenting the activity on the ports, such as the type of data, amount of data (bytes/sec), etc. The control entity program can gather and add dynamic data to the accounting messages including load on the port, protocols configured on the port and its virtual local area network (VLAN), adjacencies that the port has formed with neighboring switches, etc. Using this information, the monitoring entity program can form a brief topology of the switch cluster.
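By way of a non-limiting illustration, the formation of a topology from the adjacency data in the accounting messages might be sketched as follows. The message layout (a dictionary with `switch`, `port`, `load_bps` and `adjacencies` keys), the switch names and the function name `build_topology` are assumptions made for this sketch only; they are not a format defined by the RADIUS/AAA protocol or by the embodiments described herein.

```python
from collections import defaultdict


def build_topology(accounting_messages):
    """Return {switch: set of neighboring switches} from reported adjacencies.

    The message layout is hypothetical and chosen only for illustration.
    """
    topology = defaultdict(set)
    for msg in accounting_messages:
        switch = msg["switch"]
        topology[switch]  # touch the key so a switch with no neighbors still appears
        for neighbor in msg.get("adjacencies", []):
            # Record the adjacency in both directions.
            topology[switch].add(neighbor)
            topology[neighbor].add(switch)
    return dict(topology)


# Hypothetical interim accounting messages from three switches.
messages = [
    {"switch": "sw16", "port": 1, "load_bps": 4000, "adjacencies": ["sw18"]},
    {"switch": "sw18", "port": 2, "load_bps": 1000, "adjacencies": ["sw16", "sw20"]},
    {"switch": "sw22", "port": 1, "load_bps": 0, "adjacencies": []},
]
topology = build_topology(messages)
```

In this sketch a switch that is only named as a neighbor (here `sw20`) is still added to the topology, so the server can represent a switch it has not yet heard from directly.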
The periodic accounting messages for a particular port may indicate how much data is currently propagating through the port. The monitoring entity program in the RADIUS/AAA server 34 will look at the stored configuration records to determine how much data should or can be propagating through the port. The monitoring entity program compares the accounting messages from the switches 16-22 to the switch configuration records that were stored in the RADIUS/AAA server 34 during the initialization period when the ports were made active in the switches. If the accounting messages do not match the configuration records, then the RADIUS/AAA server 34 will send switch control messages to the control entity program in the switch having the malfunctioning port to tell the switch how to correct the problem or what action should be taken. In one example, the switch control messages tell the control entity program to turn off a particular port, or turn on one or more other ports to handle increased data flow. The control entity program may turn off a particular port if it is connected to an inactive switch, or if data packets are circulating in a continuous loop, with another switch sending the data packets back and forth. In this manner, the server 34 provides quality of service (QoS), port control and load balancing across the switches 16-22.
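The comparison described above might be sketched as follows, purely as an illustration. The record types `PortConfig`, `AccountingRecord` and `ControlMessage`, their fields, the action names and the single load threshold are all assumptions of this sketch; the embodiments do not prescribe a particular record layout or decision rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PortConfig:
    """Hypothetical configuration record stored when the port was made active."""
    switch: str
    port: int
    max_load_bps: int


@dataclass
class AccountingRecord:
    """Hypothetical interim accounting record reported by a switch."""
    switch: str
    port: int
    load_bps: int
    neighbor_active: bool = True


@dataclass
class ControlMessage:
    """Hypothetical switch control message sent back to the control entity."""
    switch: str
    port: int
    action: str  # "disable_port" or "enable_more_ports" (illustrative names)


def check_port(record: AccountingRecord, config: PortConfig) -> Optional[ControlMessage]:
    """Return a control message if the record disagrees with the configuration."""
    if not record.neighbor_active:
        # The port is connected to an inactive switch: turn it off.
        return ControlMessage(record.switch, record.port, "disable_port")
    if record.load_bps > config.max_load_bps:
        # More data than configured: permit other ports to carry the extra load.
        return ControlMessage(record.switch, record.port, "enable_more_ports")
    return None  # the record matches the configuration; no action is needed


config = PortConfig("sw16", 1, max_load_bps=1000)
overloaded = check_port(AccountingRecord("sw16", 1, load_bps=1500), config)
dead_link = check_port(AccountingRecord("sw16", 1, 200, neighbor_active=False), config)
healthy = check_port(AccountingRecord("sw16", 1, load_bps=200), config)
```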
The following example illustrates a communication between the monitoring entity program and the control entity program. Assume that the accounting records for a switch A indicate that a port on the switch A connected to a switch B, which is in the same cluster as the switch A, is overloaded with data. The monitoring entity program can take either of two actions: it can tell the control entity program to turn off the overloaded port, or it can tell the control entity program that it has permission to open more ports to the switch B. The control entity program on the switch A will take the necessary action as indicated by the switch control message. The next set of accounting messages provided by the switch A will inform the server of the latest state of the ports on the switch A.
When the switch 32 and the server 34 go through the authentication process when the port is made active, the return messages from the RADIUS/AAA server 34 identify how many of the ports of the switch are activated. For example, if all of the ports of one switch are connected to the ports of another switch, the RADIUS/AAA server 34 will determine how many of those ports need to be opened to transmit the desired amount of data therebetween. Additionally, the monitoring entity program can determine that the switch has too many ports open for the current amount of data being propagated therethrough. In that situation, the monitoring entity program can instruct the control entity program to deactivate one or more of the ports.
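One simple way the number of ports might be determined is sketched below. The sketch assumes that every port between the two switches has the same capacity and that at least one port is kept open whenever one is available; both assumptions, and the function names `ports_needed` and `surplus_ports`, are introduced here for illustration only.

```python
def ports_needed(desired_bps, per_port_capacity_bps, available_ports):
    """Number of ports to open to carry desired_bps (illustrative rule only)."""
    if per_port_capacity_bps <= 0:
        raise ValueError("per-port capacity must be positive")
    # Integer ceiling division: round up to a whole number of ports.
    needed = -(-desired_bps // per_port_capacity_bps)
    # Assumption: keep at least one port open, and never exceed what exists.
    return min(max(needed, 1), available_ports)


def surplus_ports(open_ports, current_bps, per_port_capacity_bps):
    """Number of currently open ports that could be deactivated."""
    return open_ports - ports_needed(current_bps, per_port_capacity_bps, open_ports)


to_open = ports_needed(2500, 1000, available_ports=8)          # 3 ports
capped = ports_needed(10000, 1000, available_ports=4)          # limited to 4 ports
to_close = surplus_ports(open_ports=6, current_bps=2500, per_port_capacity_bps=1000)
```

With six ports open and 2500 units of traffic at 1000 units per port, three ports suffice, so three could be deactivated under these assumptions.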
As mentioned above, the RADIUS/AAA server 34 can also determine whether two or more of the switches are in a continuous loop, where the same data packet or packets are being transmitted repeatedly around the switches 16-22. If the monitoring entity program in the server 34 does detect such a loop, it can shut off one or more of the ports in the switches 16-22 to break the continuous loop, and then provide an indication to the user that the port has been disabled.
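The manner in which the loop is detected is not limited to any particular algorithm. As one hypothetical sketch, the server could treat the reported adjacencies as links in a graph and use a union-find structure to identify any link that closes a cycle; the port on that link is then a candidate to be shut off. The link tuple layout `(switch_a, port_a, switch_b)`, the switch names and the function name `find_loop_links` are assumptions of this sketch.

```python
def find_loop_links(links):
    """Return the links that close a cycle among the switches.

    links: list of (switch_a, port_a, switch_b) tuples (hypothetical layout).
    """
    parent = {}

    def find(switch):
        # Find the representative of the group of switches already connected.
        parent.setdefault(switch, switch)
        while parent[switch] != switch:
            parent[switch] = parent[parent[switch]]  # path halving
            switch = parent[switch]
        return switch

    loop_links = []
    for switch_a, port_a, switch_b in links:
        root_a, root_b = find(switch_a), find(switch_b)
        if root_a == root_b:
            # Both ends are already connected by another path: this link
            # closes a loop, so its port is a candidate to be disabled.
            loop_links.append((switch_a, port_a, switch_b))
        else:
            parent[root_a] = root_b  # merge the two groups
    return loop_links


links = [
    ("sw16", 1, "sw18"),
    ("sw18", 2, "sw20"),
    ("sw20", 3, "sw16"),  # closes the loop sw16 -> sw18 -> sw20 -> sw16
    ("sw20", 4, "sw22"),
]
loop_links = find_loop_links(links)
```

This sketch also reports a second parallel link between the same two switches as closing a loop, so it would need refinement where several ports between two switches are intentionally opened to share load.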
At a deeper level, the accounting information sent to the server 34 can be used to determine the correctness of the layer 3 routing configurations, such as open shortest path first (OSPF) protocol and routing information protocol (RIP), with respect to the desired topology, and automatically correct configuration errors. The accounting information can also be used for layer 2 switching configurations. The accounting information can also be used to set up filters based on desired/undesired traffic streams. Further, based on the configuration information, the server 34 can draw out pictorial configurations of the switches 16-22 so that the administrator can make fewer adjustments.
If the identity information is valid, the server will send a message back to the switch indicating that the port can be activated. The server will then periodically receive the interim accounting records once the port is activated at box 56. The monitoring entity algorithm will compare the interim accounting records to the stored configuration records for the port at box 58. If the comparison shows that the accounting records are consistent with the configuration records, then the monitoring entity algorithm returns to receiving the interim accounting records at the box 56. If the comparison at the box 58 reveals a discrepancy between the accounting records and the configuration records, then the monitoring entity algorithm issues a switch control message to the control entity algorithm in the switch at box 60. The control entity program will then perform the command in the switch control message at box 62 as discussed above to shut down the port, open other ports, provide an error signal to the user, etc.
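The flow through boxes 56 to 62 might be simulated as follows. This is a sketch under stated assumptions rather than the actual programs: the class `SwitchControlEntity`, the function `monitor_cycle`, the dictionary-based records and the single `disable_port` action are hypothetical, and the only discrepancy checked is a load above a configured limit.

```python
class SwitchControlEntity:
    """Hypothetical stand-in for the control entity program in a switch."""

    def __init__(self, port_loads):
        self.port_loads = dict(port_loads)             # port -> current load
        self.enabled = {port: True for port in port_loads}

    def interim_records(self):
        # Report one accounting record per enabled port.
        return [
            {"port": port, "load_bps": load}
            for port, load in self.port_loads.items()
            if self.enabled[port]
        ]

    def perform(self, message):
        # Box 62: carry out the command in the switch control message.
        if message["action"] == "disable_port":
            self.enabled[message["port"]] = False


def monitor_cycle(switch, config_limits):
    """One pass of the monitoring entity over a switch's interim records."""
    issued = []
    for record in switch.interim_records():                # box 56: receive records
        limit = config_limits[record["port"]]
        if record["load_bps"] > limit:                      # box 58: compare to config
            message = {"port": record["port"], "action": "disable_port"}
            switch.perform(message)                         # boxes 60 and 62
            issued.append(message)
    return issued


switch = SwitchControlEntity({1: 500, 2: 1500})
limits = {1: 1000, 2: 1000}                                 # hypothetical config records
first_pass = monitor_cycle(switch, limits)
second_pass = monitor_cycle(switch, limits)
```

In the first pass port 2 exceeds its limit and is disabled; the second pass issues no messages because the next set of records reflects the new port state.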
The process as described above provides a centralized and automated technique to detect and correct switch cluster configuration problems before the network is impacted. Thus, it will reduce down time of the network due to erroneous or non-optimal switch configurations. Further, the individual switches in the cluster need not run complicated algorithms to detect loops and perform load distribution, which can help reduce switch software size and complexity. Redundancy mechanisms built into the RADIUS/AAA protocol can provide robustness.
The foregoing discussion discloses and describes merely exemplary embodiments. One skilled in the art will readily recognize from such discussion, and from the accompanying drawings and claims, that various changes, modifications or variations can be made therein without departing from the spirit and scope of the embodiments as defined in the following claims.