This application claims the benefit of U.S. patent application Ser. No. 10/224,508, filed Aug. 19, 2002; Ser. No. 10/224,353, filed Aug. 19, 2002; and Ser. No. 10/231,788, filed Aug. 29, 2002, all of which are incorporated by reference herein.
The present invention pertains to apparatus and methods for improving communications in digital networks. More particularly, the present invention relates to traffic queuing structures and methods used to buffer and shape traffic.
Traffic management is important in digital networks. Traffic management involves controlling and scheduling traffic through paths established through the network. One design consideration faced by traffic management systems is head-of-line blocking, in which two or more streams of traffic compete for the same resources. When head-of-line blocking occurs, the goal is to protect the stream or streams of traffic that are in profile, and to push back on the offending stream or streams.
To perform cost-effective shaping of traffic, a traffic shaping device should be able to shape a large number of traffic profiles, as shown in the accompanying drawings.
Accordingly, a system and method are needed to avoid head-of-line blocking at line rate in order to allow for better profiling, shaping and servicing of traffic. Furthermore, improvements are needed that enable a higher utilization of ports served by a queuing structure.
A prioritizing and queuing system is provided for managing traffic in a network in a manner that avoids head-of-line blocking by providing a mechanism to bypass traffic, without hindering overall performance. The system provides a hardware solution for latency-sensitive traffic. The solution is able to dynamically manage this bypass for a large number of streams of traffic (both active and blocked) in order to be cost effective.
According to one aspect, a system is provided for prioritizing and queuing traffic from a plurality of data streams. The system includes a queuing structure, processing circuitry, and a search engine. The queuing structure has a plurality of queues. The search engine is implemented on the processing circuitry and is configured to search and edit the queues in order to manage traffic at a given instant in time by traversing each of the queues in a predetermined manner to identify and remove eligible entries from each queue that need to be transmitted. The traffic was previously classified as to type, and traffic elements were previously loaded onto selected queues with other traffic elements having a similar traffic type.
According to another aspect, a method is provided for prioritizing and queuing traffic from a plurality of data streams. The method includes: providing a queue structure including a plurality of queues; loading traffic onto a selected queue with other traffic having a similar traffic type; and traversing each of the queues in a rule-designated order to identify and remove eligible entries from each queue that need to be transmitted.
According to yet another aspect, a system is provided for retrieving traffic elements from priority queues. The system includes queuing means, processing means, and searching means. The queuing means are provided for storing traffic elements in a plurality of queues. The searching means are provided for searching and editing the queues in order to manage traffic at a given instant in time by traversing each of the queues in a designated order to identify and remove eligible entries from each queue that need to be transmitted. The traffic was previously classified as to type, and traffic elements were previously loaded onto selected queues with other traffic elements having a similar traffic type.
Preferred embodiments of the invention are described below with reference to the accompanying drawings.
This disclosure of the invention is submitted in furtherance of the constitutional purposes of the U.S. Patent Laws “to promote the progress of science and useful arts” (Article 1, Section 8).
Reference will now be made to a preferred embodiment of Applicant's invention. An exemplary implementation is described below and depicted in the accompanying drawings, comprising a traffic prioritizing and queuing system identified by reference numeral 88.
To avoid obscuring the invention at hand, only details germane to implementing the invention are described in great detail; peripheral details that are presently understood in the art are incorporated by reference as needed.
Both stages 22 and 24 are linked-list based. The first stage 22 includes linked lists 32 (see the accompanying drawings) that define the shaping queues 33, and the second stage 24 includes linked lists 34 that define the priority queues 35.
The second stage 24 absorbs the potential bursts from the shaping queuing stage 22, which occur when multiple shaping queues 33 become eligible to send traffic within the same relatively small interval. This shaped traffic is then placed on the queue of the appropriate priority in stage 24.
The engine 30 drains traffic from the priority queues 35 in a starve-mode fashion, always servicing the highest priority queue that has traffic to send. The queues 35 of the second stage 24 are relatively small, as the maximum depth allowed equals the number of shaping queues 33 present in the first stage 22. This allows the first stage 22 to be an efficient buffer: if there is traffic on other shaping queues 33 of higher priority, the lower priority traffic is simply blocked, and therefore no large overhead is incurred from a traditional arbitration mechanism such as a content addressable memory. (In a connectionist system, data is stored in the activation pattern of the units: if a processing unit receives excitatory input from one of its connections, each of its other connections will either be excited or inhibited. If these connections represent the attributes of the data, then the data may be recalled by any one of its attributes, not just those that are part of an indexing system. Because these connections represent the content of the data, this type of memory is called content addressable memory.)
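The following is a minimal sketch of such starve-mode servicing. The type and function names (entry_t, queue_t, starve_mode_dequeue) and the priority count are illustrative assumptions for this sketch, not taken from the patent.

```c
#include <stddef.h>

#define NUM_PRIORITIES 8          /* illustrative; the patent fixes no count */

typedef struct entry {
    struct entry *next;           /* singly linked, as in the described queues */
} entry_t;

typedef struct {
    entry_t *head;
    entry_t *tail;
} queue_t;

/* Starve-mode service: always drain the highest-priority non-empty queue;
 * lower priorities are served only when every higher queue is empty. */
entry_t *starve_mode_dequeue(queue_t q[NUM_PRIORITIES]) {
    for (int p = 0; p < NUM_PRIORITIES; p++) {  /* 0 = highest priority */
        if (q[p].head != NULL) {
            entry_t *e = q[p].head;
            q[p].head = e->next;
            if (q[p].head == NULL)
                q[p].tail = NULL;
            return e;
        }
    }
    return NULL;                  /* no traffic pending at any priority */
}
```

Under this discipline, a lower priority queue can wait indefinitely while higher priority traffic persists, which is exactly the precedence ordering described above.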
No longer must a costly engine be implemented that examines all of the entries ready to send in order to pick the best one. The hierarchical queuing of queues 33 and 35 inherently self-orders and arbitrates traffic when instantaneous congestion occurs.
Pointers and linked lists are known in the computer arts. A pointer is a variable that holds the memory address of another variable rather than a value of its own; it points to the other variable by holding a copy of that variable's address. A read/write pointer keeps track of a position within a file from which data can be read or to which data can be written. A linked list is a chain of records called nodes. Each node has at least two members, one of which points to the next node in the list. The first node is the head, and the last node is the tail. Pointers are used to arrange items in a linked list, as illustrated in the accompanying drawings.
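By way of a concrete (and purely illustrative) sketch of such a pointer-based linked list:

```c
#include <stdlib.h>

/* One node of a singly linked list: a payload plus a pointer holding the
 * memory address of the next node (NULL at the tail). */
typedef struct node {
    int          value;           /* stands in for a traffic element */
    struct node *next;
} node_t;

/* Append a new node at the tail; head and tail pointers arrange the list. */
void list_append(node_t **head, node_t **tail, int value) {
    node_t *n = malloc(sizeof *n);
    if (n == NULL)
        return;                   /* allocation failed; list unchanged */
    n->value = value;
    n->next  = NULL;              /* the new node becomes the tail */
    if (*tail == NULL)
        *head = n;                /* empty list: node is head and tail */
    else
        (*tail)->next = n;
    *tail = n;
}
```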
The number of queues 33 and 35 in this architecture can be selected to suit the needs of a particular implementation.
The shaping engine 28 en-queues incoming traffic 74 (see the accompanying drawings) onto one of the shaping queues 33, selected according to the classification of the traffic.
This shaping queue can have a shaping profile, which includes properties such as priority, depth, latency, jitter, and rate. For example, video needs to always get through. A large amount of latency is not desirable for video, as any latency will cause the resulting picture to become jerky and fall behind. The same is true of the rate at which video is sent: a constant, consistent stream should be used to supply the video information “just in time” for the next entry (e.g., frame) of the picture on a TV or computer. Therefore, “video” traffic is properly classified so that it is managed appropriately. Because the video must always get through, it is given a “high” priority. Because video cannot be slowed down with a large amount of latency, the depth of the queue is selected to be shallow, so that little data can build up waiting in the queue. With regard to rate, the video queue gets its own bandwidth end-to-end on a switch, and does not have to compete with any other queue for bandwidth. Queues for other classifications of traffic would similarly have appropriately chosen priorities, depths, latencies, jitter, and rates.
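As a sketch only, a shaping profile of the kind just described might be represented as follows; the field names, units, and example values are assumptions for illustration, not values from the patent.

```c
/* Properties of a shaping profile: priority, depth, latency, jitter, rate. */
typedef struct {
    unsigned priority;    /* precedence among queues; 0 = highest here */
    unsigned depth;       /* maximum queue depth, in entries */
    unsigned latency_us;  /* tolerable latency, microseconds */
    unsigned jitter_us;   /* tolerable jitter, microseconds */
    unsigned rate_kbps;   /* shaped rate, kilobits per second */
} shaping_profile_t;

/* A hypothetical video profile: high priority so it always gets through,
 * shallow depth so little data can build up, tight latency and jitter,
 * and a dedicated constant rate. */
static const shaping_profile_t video_profile = {
    .priority   = 0,
    .depth      = 4,
    .latency_us = 500,
    .jitter_us  = 100,
    .rate_kbps  = 4000,
};
```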
The rate algorithm for the shaping queues 33 is a table-based credit allocation scheme. A fixed-size bandwidth allocation table 76 is traversed at a constant rate, with each location (e.g., row) 78–85 of the table identifying a shaping queue 33 and an amount of credit to allocate to that queue.
Based upon the needs of the design in which this queuing structure is implemented, the size of the table 76 can be adjusted to provide the desired minimum and maximum achievable rates. The minimum rate is defined by one credit divided by the table traversal time, and the maximum rate is defined by the maximum number of entries allowed in the table, each containing the maximum number of credits, divided by the table traversal time. The maximum number of entries allowed in the table is dictated by the implementation; for example, by the overall “profile” of the port or ports supported by this queuing structure, and more particularly by the circuitry or software (e.g., a state machine) that manages traversing the table relative to the number of queues in the implementation, and that manages updating the credit for each queue.
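A minimal sketch of this table-based credit scheme follows. The names and sizes are illustrative assumptions (an eight-row table echoing locations 78–85, and a queue/credit pair per row); the patent describes the mechanism only in the general terms above.

```c
#define NUM_QUEUES    64          /* illustrative shaping-queue count */
#define TABLE_ENTRIES 8           /* e.g., locations 78-85 of table 76 */

typedef struct {
    unsigned queue;               /* shaping queue credited at this row */
    unsigned credits;             /* credit granted by this row */
} alloc_row_t;

static alloc_row_t table[TABLE_ENTRIES];
static unsigned    credit[NUM_QUEUES];

/* Invoked at a constant rate: visit the next row, grant its credit, wrap.
 * A queue's earned rate is proportional to how many rows name it and to
 * the credits those rows carry.  With one full traversal every T seconds:
 *   minimum rate = 1 credit / T            (named once, with one credit)
 *   maximum rate = TABLE_ENTRIES * max_credits / T */
void traverse_one_row(void) {
    static unsigned row = 0;
    credit[table[row].queue] += table[row].credits;
    row = (row + 1) % TABLE_ENTRIES;
}
```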
When the traffic shaping queuing stage 22 determines that a shaping queue 33 has earned enough credit to send the entry at its head, that entry becomes eligible and is de-queued and sent to the second stage 24.
The second stage 24 is a set of priority-based queues 35. Each time an entry is sent to the second stage 24 from the first stage 22, it is accompanied by information indicating the priority of the shaping queue from which it came. This priority is used to determine on which of the priority queues 35 to place the entry. Because a queue from the traffic shaping queuing stage 22 can have only one entry at a time in the priority queues 35, the total space required for this set of priority queuing linked-lists 34 is based on the number of shaping queues in existence.
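Continuing the sketch, stage-2 placement might look like the following; the names are again illustrative assumptions.

```c
#include <stddef.h>

#define NUM_PRIORITIES 8                          /* illustrative */

typedef struct ent { struct ent *next; } ent_t;
typedef struct { ent_t *head; ent_t *tail; } pq_t;

/* When a shaping queue becomes eligible, its head entry is appended to
 * the priority queue indexed by that shaping queue's priority.  Because
 * each shaping queue contributes at most one entry at a time, total
 * stage-2 occupancy never exceeds the number of shaping queues. */
void stage2_enqueue(pq_t prio[NUM_PRIORITIES], ent_t *e, unsigned p) {
    e->next = NULL;
    if (prio[p].tail == NULL)
        prio[p].head = e;
    else
        prio[p].tail->next = e;
    prio[p].tail = e;
}
```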
The second stage 24 uses a burst management engine in the form of an add/delete search engine 30 to traverse the priority queues 35 in a starve mode, such that the one with the highest priority will be serviced first, sending any entries it may have prior to doing the same for lesser priority queues. This second stage 24 is advantageous because the first stage 22 may have more than one queue become eligible to send an entry at relatively the same time; in fact, all shaping queues 33 could potentially become eligible at relatively the same time. It is when this occurs that the value of the second stage 24 becomes apparent, as it buffers up all of these eligible entries and then doles them out over time (highest priority first) based upon the throughput available for the port or ports 26 supported by the queues 35. This simplifies the searching needed, and allows a virtually unlimited number of queues 33 and 35 to be managed, by dividing the problem into two simpler steps: earning bandwidth, followed by transmission arbitration. This eliminates the need for expensive caching and/or fairness algorithms.
According to one implementation, a starve-mode servicing algorithm is used to manage the second queuing stage. However, other traditional servicing algorithms, such as weighted round robin and pure round robin, can be used as well; the choice of algorithm depends upon the implementation needs of the design at hand. This implementation uses starve mode because it provides the most useful form of priority-based precedence ordering of traffic in a congested situation.
The resulting desired shaping behavior is depicted in the accompanying drawings.
Problems solved by this implementation include the management of the shaping and crediting of a large number of queues by a central “shaping” engine. Another problem solved is the arbitration between a large number of queues all vying to send traffic at the same instant in time, using a central “arbitration” mechanism. The result is a scalable solution, providing the ability to shape traffic for a variety of implementations in a cost-effective manner; i.e., in a smaller, feasible design.
This solution provides a centralized queuing structure, capable of supporting one or more ports, with a high queue density count. This centralized queuing structure is capable of dynamically supporting different ports over time, rather than being a fixed set of queues able to support only a single port or set of ports. The design of the above-described implementation is also scalable: by its very nature, it can be implemented for anywhere from one queue up to the feasible limits of today's technology, without significantly increasing the size of the central engine. The only added cost of increased size is the space needed for the linked-list management. Further, the design by its very nature can support a virtually unlimited variety of min/max rate relationships, whereas previous implementations could only perform gross granularity transitions between various desired rates.
The preferred environment is Ethernet. Slight modifications to the “shaping” profiles would allow for use in any communication technology including, for example, ATM and SONET.
In one embodiment, the first and second queuing stages are defined together on a single ASIC, which provides sufficient clock speed to support Gigabit Ethernet rates.
Having a two-stage structure provides efficiency and performance advantages over a traditional queue-arbitration mechanism. No longer is a massive arbiter or caching engine needed to manage choosing which traffic to send from a plurality of queues when instantaneous congestion occurs across those queues.
Various alternative embodiments are possible. For example, one alternative embodiment has a reduced or increased number of shaping queues. Another alternative embodiment has a reduced or increased number of priorities. The two-stage design can also be implemented on a per-port basis instead of in a central queuing system.
Usage of multiple linked-lists 98 enables an increase in efficiency for search engine 30 by allowing search engine 30 to concentrate on the most pressing traffic available at that instant in time. In order to do this, the traffic is first classified by priority. Then, the traffic is loaded onto the same queue, or linked-list, with other traffic, or streams, of the same priority. Search engine 30 traverses each linked-list in a starve-mode fashion, meaning that it always starts a search with the highest priority linked-list that is not empty, and searches for eligible entries throughout that entire linked-list before moving on to the next highest priority queue. This particular application is an example of port- or destination-based backpressure, which is one of many types of backpressure (or rules) that can be applied; other applications of rules are possible.
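The following hedged sketch illustrates such a traversal: lists are visited in priority order, and entries whose type is currently backpressured are stepped over rather than blocking the entries behind them. All names (elem_t, type_blocked, find_eligible) are illustrative assumptions, not identifiers from the patent.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_PRIORITIES 8          /* illustrative */
#define NUM_TYPES      16         /* e.g., ports or destinations */

typedef struct elem {
    struct elem *next;
    unsigned     type;            /* port/destination/class of the element */
} elem_t;

typedef struct { elem_t *head; } list_t;

static bool type_blocked[NUM_TYPES];   /* set and cleared by backpressure */

/* Search and edit the lists: find, unlink, and return the first eligible
 * entry, starting with the highest-priority non-empty list.  Ineligible
 * entries are skipped, which avoids head-of-line blocking. */
elem_t *find_eligible(list_t lists[NUM_PRIORITIES]) {
    for (int p = 0; p < NUM_PRIORITIES; p++) {
        elem_t **pp = &lists[p].head;
        for (elem_t *e = *pp; e != NULL; pp = &e->next, e = e->next) {
            if (!type_blocked[e->type]) {
                *pp = e->next;    /* unlink the entry from the list */
                return e;
            }
        }
    }
    return NULL;                  /* every pending entry is blocked */
}
```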
More particularly, other alternative applications of rules suitable for use with search engine 30 are possible, such as rules based on other pre-existing properties of the traffic elements, including size or priority.
By way of example, backpressure is received at circuitry within the traffic prioritizing and queuing system from downstream points in a network, which have communicated back to the search engine (or circuit) that they do not wish to receive a certain type of traffic until they further notify the system. The search engine then searches each priority queue, looking for traffic elements to send, which are not of the identified “type”, as requested by the downstream logic. Eventually, the downstream network will notify the circuit that it is now okay to send the previously blocked “type” of traffic, at which time that type becomes eligible again, and is now considered for being sent. During the time that some “type” (or “types”) of traffic were ineligible to be sent, the circuit sent other types, allowing the network to still maintain a high utilization rate, rather than blocking eligible traffic by waiting for blocked traffic sitting in front of it in the queues to become eligible.
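In terms of the earlier sketch, the downstream notifications might simply set and clear the per-type blocked flags that the search engine consults; the function names here are, again, purely illustrative.

```c
/* Downstream logic asserts backpressure for a traffic type, and later
 * releases it; find_eligible() above skips blocked types meanwhile. */
void backpressure_assert(unsigned type)  { type_blocked[type] = true;  }
void backpressure_release(unsigned type) { type_blocked[type] = false; }
```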
As previously discussed, traffic types can be identified in any of a number of ways, such as by size, destination, and even priority. If identified by priority, an entire priority queue would be blocked, i.e., marked ineligible. It is understood that the present system does not reclassify any traffic elements. Instead, the system acts upon pre-existing properties of these traffic elements in order to prioritize and queue traffic elements to realize a “best-fit” traffic configuration that enables a higher utilization of the ports served by the queuing structure. In this manner, head-of-line blocking can be avoided, thereby allowing for better profiling, shaping, and servicing of traffic.
For clarification purposes, the ability to act on pre-existing properties differs from classification, which deals with the original forwarding decisions. In contrast, the present system deals with the state of the network at the present moment in time, which is used to select what to send next. In this case, the forwarding decision for a traffic element has already been made, and the traffic element is simply waiting to be sent.
In compliance with the statute, the invention has been described in language more or less specific as to structural and methodical features. It is to be understood, however, that the invention is not limited to the specific features shown and described, since the means herein disclosed comprise preferred forms of putting the invention into effect. The invention is, therefore, claimed in any of its forms or modifications within the proper scope of the appended claims appropriately interpreted in accordance with the doctrine of equivalents.