1. Field of the Disclosure
The present disclosure relates generally to packet-switching networks and, more particularly, to systems and methods for tracking buffer statistics of switches.
2. Description of Related Art
Typically, in packet-switching networks, switches maintain buffer statistics that can be polled by network administrators. Thus, the network administrator periodically polls the buffer statistics from the switches to identify any potential issues that may require attention. Alternatively, the network administrator polls the buffer statistics in response to a problem that has arisen at a particular switch. Unfortunately, for large networks having hundreds (or even thousands) of switches, the amount of polled data can be staggering, and reviewing the sheer volume of data can be a herculean task.
3. Brief Description of the Drawings
Many aspects of the disclosure can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views.
4. Detailed Description
Typically, in packet-switching networks, switches maintain buffer statistics that can be polled by network administrators. Thus, when a switch encounters an issue (e.g., a queue length becoming too long), the network administrator can poll the buffer statistics from the problematic switch to isolate the issue and provide remedial action. Unfortunately, by the time the buffer statistics are polled, the problem has already occurred (e.g., network congestion), and the network administrator's role at that point is simply to fix the problem (e.g., by re-routing network traffic to other switches).
In the alternative, the network administrator can periodically poll the buffer statistics in an effort to avert problems. In short, by reviewing the buffer statistics, the network administrator can reallocate available resources to prevent problems. Unfortunately, for large networks having hundreds (or even thousands) of switches, the amount of polled data can be staggering, and reviewing the sheer volume of data can be a herculean task. This is because polling all of the switches for buffer statistics results in the collection of information for many switches that are operating normally as well as for the few switches that are on the brink of failing. Thus, sifting through the morass of data to identify a potential problem becomes a time-consuming task.
The systems and methods disclosed herein allow a switch to track buffer statistics and trigger an event, such as a hardware interrupt or a system snapshot, in response to the buffer statistics reaching a threshold that may indicate an impending problem. Since the switch itself triggers the event to alert the network administrator, the network administrator no longer needs to sift through mountains of data to identify potential problems. Also, since the switch triggers the event before a problem arises, the network administrator can take remedial action preemptively. This type of event-triggering mechanism makes the administration of packet-switching networks more manageable.
To implement the event-triggering mechanism, the switching system includes a use-count register that stores a use-count associated with a switch buffer. This use-count can be indicative of egress statistics, ingress statistics, or device statistics, and can be monitored on a per-queue basis, a per-port basis, or a per-pool basis. The system also comprises a threshold register that stores a relevant threshold value associated with each use-count. A memory management unit (MMU) in the system compares the use-count with the threshold to determine whether the use-count exceeds the threshold. And, when the use-count exceeds the threshold, the MMU triggers an event. The event can be a hardware central processing unit (CPU) interrupt that alerts a network administrator that the switch buffer has exceeded a certain threshold. The event can also be a snapshot of the state of the switch at the time the threshold was exceeded; in that case, the event can include issuing a command to stop updating the use-counts. As another alternative, the event can simply be the setting of a dedicated bit in a register, which, when polled, brings to the attention of the network administrator the particular switch in which the bit was set.
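As a concrete illustration of this compare-and-trigger arrangement, the following C sketch models a per-queue use-count register, a threshold register, and the comparison performed by the MMU. It is a minimal sketch only: the structure fields, enumeration values, and function names are hypothetical and are not taken from any particular device.

    #include <stdint.h>
    #include <stdbool.h>

    /* Hypothetical per-queue monitoring state.  The field names, event types,
     * and function names are illustrative only. */
    enum bst_event {
        BST_EVENT_NONE,
        BST_EVENT_CPU_INTERRUPT,   /* raise a hardware CPU interrupt           */
        BST_EVENT_SNAPSHOT,        /* freeze use-count updates for inspection  */
        BST_EVENT_SET_STATUS_BIT   /* set a dedicated bit for later polling    */
    };

    struct bst_monitor {
        uint32_t use_count;        /* use-count register, in units of buffers  */
        uint32_t threshold;        /* threshold register for this use-count    */
        enum bst_event action;     /* event to trigger when threshold exceeded */
        bool tracking_enabled;     /* when false, use-count updates are frozen */
    };

    /* Models the MMU check: bump the use-count for an admitted buffer and
     * report which event, if any, should be triggered. */
    static enum bst_event bst_on_buffer_admit(struct bst_monitor *m)
    {
        if (!m->tracking_enabled)
            return BST_EVENT_NONE;

        m->use_count++;                    /* track the use-count              */
        if (m->use_count > m->threshold)   /* compare against the threshold    */
            return m->action;              /* threshold exceeded: trigger      */
        return BST_EVENT_NONE;
    }

In practice, the same structure could be instantiated per queue, per port, or per pool, matching the monitoring granularities described above.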
In addition to implementing the alerting mechanism on an actual buffer, a similar mechanism can be implemented on virtual queues. Thus, rather than monitoring a real queue (or the actual physical port), a virtual queue can be implemented using, for example, a token bucket meter. Since token bucket meters are known in the art, only a brief discussion is provided here. The virtual queue mimics the real queue by shadowing its behavior. However, since the virtual queue can be configured to drain at a slower rate than the real queue, the use-counts on the virtual queue will accumulate faster than the use-counts on the actual queue. This results in earlier detection of potential problems, and further reduces or eliminates any latency impacts from queue build-up.
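The following C sketch illustrates one way such a virtual queue could be modeled as a token-bucket-style meter that drains more slowly than the real queue. The field names, units, and helper function are illustrative assumptions rather than a definitive implementation.

    #include <stdint.h>
    #include <stdbool.h>

    /* Hypothetical virtual-queue state modeled as a token-bucket meter. */
    struct virtual_queue {
        uint64_t fill_bytes;       /* shadow occupancy of the real queue       */
        uint64_t drain_rate_bps;   /* configured below the real queue's rate   */
        uint64_t threshold_bytes;  /* alarm level for the virtual occupancy    */
        uint64_t last_update_ns;   /* timestamp of the last drain computation  */
    };

    /* Mirror an enqueue on the real queue, draining at the slower virtual rate.
     * Returns true when the virtual occupancy crosses its threshold. */
    static bool vq_enqueue(struct virtual_queue *vq, uint32_t pkt_bytes,
                           uint64_t now_ns)
    {
        uint64_t elapsed_ns = now_ns - vq->last_update_ns;
        uint64_t drained = (vq->drain_rate_bps / 8) * elapsed_ns / 1000000000ULL;

        vq->fill_bytes = (drained >= vq->fill_bytes) ? 0 : vq->fill_bytes - drained;
        vq->fill_bytes += pkt_bytes;       /* shadow the real enqueue           */
        vq->last_update_ns = now_ns;

        /* Because the virtual queue drains more slowly than the real queue,
         * this threshold is crossed before the real queue builds up. */
        return vq->fill_bytes > vq->threshold_bytes;
    }

Setting drain_rate_bps somewhat below the real port rate is what gives the early-warning behavior described above; the closer the two rates are, the later the warning.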
Having provided a brief overview, reference is now made in detail to the description of the embodiments as illustrated in the drawings. While several embodiments are described in connection with these drawings, there is no intent to limit the disclosure to the embodiment or embodiments disclosed herein. On the contrary, the intent is to cover all alternatives, modifications, and equivalents.
Once the type of statistic has been determined 110, the buffer statistics monitoring mechanism for the switch (simply referred to herein as the switch itself) tracks 120 a use-count for that statistic. The tracking mechanism can be as simple as an incremental count-register that increments the use-count sequentially. One embodiment of a method for tracking 120 use-counts is shown with reference to
The switch then compares 130 the tracked use-count with a threshold, which can be stored in a threshold register. If the use-count does not exceed the threshold, then the system continues to track 120 the use-count. Alternatively, if the use-count exceeds the threshold, then the system triggers 140 an event (e.g., hardware interrupt, system snapshot, etc.). One embodiment of a process for triggering 140 the event is shown with reference to
It is worthwhile to note at this time that, in order to obtain acceptable performance, a network administrator or operator may need to tune MMU buffer admission settings, which are often traffic-dependent. In other words, performance of the event-triggering mechanism depends on setting an appropriate threshold value. For example, if the threshold level is set too high, then the use-count can readily exceed an alarming value without triggering the event. Alternatively, if the threshold level is set too low, then even acceptable (and normal) use-counts can unnecessarily trigger the event.
Once the event has been triggered 140, the system then waits 150 for a network administrator to reset the use-counts and resets 160 the use-counts when the reset command is issued. Upon resetting 160 the use-counts, the system returns to tracking 120 the use-counts.
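To make the sequence of steps concrete, the following hypothetical handler, reusing the bst_monitor sketch introduced earlier, ties the flow together: it tracks 120 and compares 130 the use-count on each buffer admission, triggers 140 the event, waits 150 for the administrator's reset, and then resets 160 the use-count. The helpers raise_event() and wait_for_admin_reset() are assumed to be supplied by the surrounding platform.

    /* Hypothetical platform helpers (assumed, not defined here). */
    extern void raise_event(enum bst_event ev);
    extern void wait_for_admin_reset(void);

    static void bst_handle_admission(struct bst_monitor *m)
    {
        enum bst_event ev = bst_on_buffer_admit(m); /* track 120, compare 130   */
        if (ev == BST_EVENT_NONE)
            return;                                 /* keep tracking 120        */

        raise_event(ev);                            /* trigger 140              */
        m->tracking_enabled = false;                /* optionally freeze counts */
        wait_for_admin_reset();                     /* wait 150                 */
        m->use_count = 0;                           /* reset 160                */
        m->tracking_enabled = true;                 /* resume tracking 120      */
    }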
As noted above,
As shown in
Similar to how egress statistics were tracked 310 in
As one can see, multiple different variables can be monitored by the switch. And, in response to a particular variable exceeding a threshold (or alarm level), the switch triggers a hardware event that alerts the network administrator.
In addition to generally describing methods for triggering an event based on buffer statistics, this disclosure also provides specific parameters for count registers and their values for implementing the tracking of buffer statistics. The decision on whether to store current use-counts or maximum use-counts can be based on a 1-bit register, where a 0-value tracks current use-counts and a 1-value tracks maximum use-counts. Similarly, a 1-bit tracking-enable bit can be set so that a 0-value stops updating or capturing use-counts while a 1-value continues to update use-counts.
The buffer statistics, or the use-count values, are preferably stored in a register, where each sequential increment of the register represents an increase in use-counts, in units of buffers. One having skill in the art will appreciate that the counter size should be sufficient to handle a worst-case total buffer usage. Also, a profile index can be used to identify which threshold corresponds to which use-count value.
Each of the triggering events can also be identified by a 1-bit value. Thus, for example, a 0-value for hardware interrupt may designate that no CPU interrupt be issued, while a 1-value results in the issuance of a CPU interrupt. Similarly, a 0-value for snapshot may result in continual updates of the use-counts, while a 1-value for system snapshot may result in the use-counts being frozen.
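By way of illustration only, the following C sketch shows one possible layout for these 1-bit control fields within a single control register. The bit positions and macro names are assumptions and do not reflect any specific hardware.

    #include <stdint.h>
    #include <stdbool.h>

    /* One possible layout for the 1-bit control fields described above.
     * Bit positions are assumed for illustration. */
    #define BST_CTRL_TRACK_MAX      (1u << 0) /* 1 = track maximum use-counts     */
    #define BST_CTRL_TRACK_ENABLE   (1u << 1) /* 1 = keep updating use-counts     */
    #define BST_CTRL_CPU_INTERRUPT  (1u << 2) /* 1 = issue CPU interrupt          */
    #define BST_CTRL_SNAPSHOT       (1u << 3) /* 1 = freeze use-counts (snapshot) */

    static inline bool bst_interrupt_enabled(uint32_t ctrl)
    {
        return (ctrl & BST_CTRL_CPU_INTERRUPT) != 0;
    }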
For each triggering event, a register can be used to indicate what type of use-count (e.g., egress queue total, egress queue-group shared, egress port BP shared, CPU queue total, etc.) triggered the event, while another register provides the identity of the port, pool, or queue that caused the triggering event. For triggers based on port numbers and buffer pools, an 8-bit register can be employed in which the first six bits represent the port number, while the remaining two bits represent the pool number. As one can see, the size of the registers for storing this information can be customized to accommodate the types of use-counts, the maximum values of the use-counts, the number of devices that can trigger the use-counts, etc. Since there are countless ways in which these use-count registers can be configured, and since one having skill in the art can readily implement the use-count registers from the above-recited description, additional examples with specific bit-values are omitted here.
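The following C sketch shows how the 8-bit port/pool register described above might be packed and unpacked, assuming the port number occupies the low six bits and the pool number the upper two bits; the bit ordering is an assumption, and other layouts are equally possible.

    #include <stdint.h>

    /* Pack and unpack the 8-bit trigger-source register: six bits of port
     * number and two bits of pool number (bit ordering assumed). */
    static inline uint8_t bst_pack_source(uint8_t port, uint8_t pool)
    {
        return (uint8_t)((port & 0x3Fu) | ((pool & 0x03u) << 6));
    }

    static inline uint8_t bst_source_port(uint8_t reg) { return reg & 0x3Fu; }
    static inline uint8_t bst_source_pool(uint8_t reg) { return (reg >> 6) & 0x03u; }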
Suffice it to say that, by implementing a procedure in which use-counts are used to trigger an event, a network administrator can readily examine how a packet-switching network is performing at a finer granularity without polling all of the buffer statistics for all of the switches. This type of tracking mechanism allows for a more streamlined review of network performance, thereby allowing the network administrator to reconfigure network components to optimize system performance.
The memory management unit (MMU) may be implemented in hardware, software, firmware, or a combination thereof. In the preferred embodiment(s), the MMU is implemented in hardware using any or a combination of the following technologies, which are all well known in the art: a discrete logic circuit(s) having logic gates for implementing logic functions upon data signals, an application specific integrated circuit (ASIC) having appropriate combinational logic gates, a programmable gate array(s) (PGA), a field programmable gate array (FPGA), etc. In an alternative embodiment, the MMU is implemented in software or firmware that is stored in a memory and that is executed by a suitable instruction execution system.
Any process descriptions or blocks in flow charts should be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps in the process, and alternate implementations are included within the scope of the preferred embodiment of the present disclosure in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present disclosure.
The triggering of the events can be performed by hardware or by software code. Such software code, which comprises an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. In the context of this document, a “computer-readable medium” can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The computer-readable medium can be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic) having one or more wires, a portable computer diskette (magnetic), a random access memory (RAM) (electronic), a read-only memory (ROM) (electronic), an erasable programmable read-only memory (EPROM or Flash memory) (electronic), an optical fiber (optical), and a portable compact disc read-only memory (CD-ROM) (optical). Note that the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured via, for instance, optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
Although exemplary embodiments have been shown and described, it will be clear to those of ordinary skill in the art that a number of changes, modifications, or alterations to the disclosure as described may be made. For example, while use-counts are disclosed herein, it should be appreciated that packet rates can be monitored in a similar fashion. All such changes, modifications, and alterations should therefore be seen as within the scope of the disclosure.