The present invention is concerned with storage and data communication systems and is more particularly concerned with a scheduler component of a network processor.
Data and storage communication networks are in widespread use. In many data and storage communication networks, data packet switching is employed to route data packets or frames from point to point between source and destination, and network processors are employed to handle transmission of data into and out of data switches.
The network processor 10 includes data flow chips 12 and 14. The first data flow chip 12 is connected to a data switch 15 (shown in phantom) via first switch ports 16, and is connected to a data network 17 (shown in phantom) via first network ports 18. The first data flow chip 12 is positioned on the ingress side of the switch 15 and handles data frames that are inbound to the switch 15.
The second data flow chip 14 is connected to the switch 15 via second switch ports 20 and is connected to the data network 17 via second network ports 22. The second data flow chip 14 is positioned on the egress side of the switch 15 and handles data frames that are outbound from the switch 15.
As shown in
The network processor 10 also includes a first processor chip 28 coupled to the first data flow chip 12. The first processor chip 28 supervises operation of the first data flow chip 12 and may include multiple processors. A second processor chip 30 is coupled to the second data flow chip 14, supervises operation of the second data flow chip 14 and may include multiple processors.
A control signal path 32 couples an output terminal of second data flow chip 14 to an input terminal of first data flow chip 12 (e.g., to allow transmission of data frames therebetween).
The network processor 10 further includes a first scheduler chip 34 coupled to the first data flow chip 12. The first scheduler chip 34 manages the sequence in which inbound data frames are transmitted to the switch 15 via first switch ports 16. A first memory 36 such as a fast SRAM is coupled to the first scheduler chip 34 (e.g., for storing data frame pointers and flow control information as described further below). The first memory 36 may be, for example, a QDR (quad data rate) SRAM.
A second scheduler chip 38 is coupled to the second data flow chip 14. The second scheduler chip 38 manages the sequence in which data frames are output from the second network ports 22 of the second data flow chip 14. Coupled to the second scheduler chip 38 are at least one and possibly two memories (e.g., fast SRAMs 40) for storing data frame pointers and flow control information. The memories 40 may, like the first memory 36, be QDR SRAMs. The additional memory 40 on the egress side of the network processor 10 may be needed because of a larger number of flows output through the second network ports 22 than through the first switch ports 16.
Flows with which the incoming data frames are associated are enqueued in a scheduling queue 42 maintained in the first scheduler chip 34. The scheduling queue 42 defines a sequence in which the flows enqueued therein are to be serviced. The particular scheduling queue 42 of interest in connection with the present invention is a weighted fair queue which arbitrates among flows entitled to a “best effort” or “available bandwidth” Quality of Service (QoS).
As shown in
Although not indicated in
The memory 36 associated with the first scheduler chip 34 holds pointers (“frame pointers”) to locations in the first data buffer 24 corresponding to data frames associated with the flows enqueued in the scheduling queue 42. The memory 36 also stores flow control information, such as information indicative of the QoS to which flows are entitled.
When the scheduling queue 42 indicates that a particular flow enqueued therein is the next to be serviced, reference is made to the frame pointer in the memory 36 corresponding to the first pending data frame for the flow in question and the corresponding frame data is transferred from the first data buffer 24 to an output queue 46 associated with the output port 44.
A more detailed representation of the scheduling queue 42 is shown in
More specifically, the queue slot in which a flow is placed upon enqueuing is calculated according to the formula CP+((WF×FS)/SF), where CP is a pointer (“current pointer”) that indicates a current position (the slot currently being serviced) in the scheduling queue 42; WF is a weighting factor associated with the flow to be enqueued, the weighting factor having been determined on the basis of the QoS to which the flow is entitled; FS is the size of the current frame associated with the flow to be enqueued; and SF is a scaling factor chosen to scale the product (WF×FS) so that the resulting quotient falls within the range defined by the scheduling queue 42. (In accordance with conventional practice, the scaling factor SF is conveniently defined as an integral power of 2, i.e., SF = 2^n, with n being a positive integer, so that scaling the product (WF×FS) is performed by right shifting.) With this known weighted fair queuing technique, the weighting factors assigned to the various flows in accordance with the QoS assigned to each flow govern how close to the current pointer of the queue each flow is enqueued. In addition, flows which exhibit larger frame sizes are enqueued farther from the current pointer of the queue, to prevent such flows from appropriating an undue proportion of the available bandwidth of the queue. Upon enqueuement, data that identifies a flow (the “Flow ID”) is stored in the appropriate queue slot 48.
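The slot calculation described above can be sketched as follows. This is a minimal illustration, not the scheduler's actual logic: the function name `enqueue_slot`, the parameter `sf_shift` (the exponent n in SF = 2^n), the 512-slot ring size, and the choice to reject distances beyond the queue's range are all assumptions made for the example.

```python
NUM_SLOTS = 512  # assumed ring size, matching the 512-slot example queue


def enqueue_slot(cp: int, wf: int, fs: int, sf_shift: int,
                 num_slots: int = NUM_SLOTS) -> int:
    """Return the slot index for CP + ((WF * FS) / SF), with SF = 2**sf_shift.

    cp        current pointer (slot currently being serviced)
    wf        weighting factor of the flow
    fs        size of the flow's current frame
    sf_shift  exponent n of the scaling factor SF = 2**n
    """
    # Dividing by a power of two is done as a right shift.
    distance = (wf * fs) >> sf_shift
    # Assumption: a distance at or beyond the queue's range is an error here.
    if distance >= num_slots:
        raise ValueError("enqueue distance exceeds the range of the queue")
    # The queue is treated as a ring, so the sum wraps around.
    return (cp + distance) % num_slots
```

For instance, under these assumptions a flow with WF = 4 and a 64-byte frame, scaled by SF = 2^3, lands 32 slots past the current pointer; a larger frame or a larger weighting factor pushes the flow farther out.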
In some applications, there may be a wide range of data frame sizes associated with the flows, perhaps on the order of about 64 bytes to 64 KB, or three orders of magnitude. It may also be desirable to assign a large range of weighting factors to the flows so that bandwidth can be sold with a great deal of flexibility and precision. In such instances, it is desirable that the scheduling queue in which weighted fair queuing is applied have a large range, where the range of the scheduling queue is defined to be the maximum distance that a flow may be placed from the current pointer. As is understood by those who are skilled in the art, the scheduling queue 42 functions as a ring, with the last queue slot (number 511 in the present example) wrapping around to be adjacent to the first queue slot (number 0).
It could be contemplated to increase the range of the scheduling queue 42 by increasing the number of slots. However, this has disadvantages in terms of increased chip area, greater manufacturing cost and power consumption, and increased queue searching time. Alternatively, the resolution of the scheduling queue 42 could be decreased to increase the range, where “resolution” is understood to mean the inverse of the distance increment that corresponds to each slot in the queue. However, if resolution is decreased, flows that should be assigned different priorities according to their respective QoS may appear to “tie” by being assigned to the same slot, thereby improperly being assigned essentially equal service priority. Thus conventional practice in operating a weighted fair queue of the type described herein requires a tradeoff between resolution and range. In the particular example of a weighted fair queue as shown herein, having 512 slots, a distance increment of one distance unit may be assigned to each slot, in which case the queue has a range of 512. It would be desirable to increase the effective range of such a scheduling queue, without increasing the number of slots or decreasing the effective resolution.
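The tradeoff between resolution and range can be made concrete with a small sketch. The helper names `queue_range` and `slot_for`, and the sample distances 17 and 30, are hypothetical and chosen only for illustration; the 512-slot size and the distance increments of 1 and 16 follow the examples in the text.

```python
def queue_range(increment: int, num_slots: int = 512) -> int:
    """Range of a queue: slots times the distance increment per slot."""
    return num_slots * increment


def slot_for(distance: int, increment: int, num_slots: int = 512) -> int:
    """Slot offset from the current pointer for a given enqueue distance."""
    offset = distance // increment  # coarser increment -> fewer distinct slots
    if offset >= num_slots:
        raise ValueError("distance exceeds the range of the queue")
    return offset


# Increment of 1: range 512, and distances 17 and 30 get distinct slots.
fine = (slot_for(17, 1), slot_for(30, 1))
# Increment of 16: range 8192, but the same two distances now share a slot.
coarse = (slot_for(17, 16), slot_for(30, 16))
```

With an increment of 1 the two flows keep distinct slots, whereas with an increment of 16 they "tie" in the same slot, which is the loss of priority discrimination the paragraph above describes.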
According to a first aspect of the invention, a scheduler for a network processor is provided. The scheduler includes one or more scheduling queues each for defining a sequence in which flows are to be serviced. At least one scheduling queue includes at least a first subqueue and a second subqueue. The first subqueue has a first range and a first resolution, and the second subqueue has a second range that is greater than the first range and a second resolution that is less than the first resolution.
The first and second subqueues may, but need not, have equal numbers of slots. In one embodiment, the range of the second subqueue may be sixteen times the range of the first subqueue, and the resolution of the second subqueue may be one-sixteenth of the resolution of the first subqueue. Other relationships between range and/or resolution may be employed. For example, the range of the second subqueue may be larger than the range of the first subqueue by any amount, and the resolution of the second subqueue may be less than the resolution of the first subqueue by any amount (e.g., regardless of the amount by which the range of the second subqueue exceeds the range of the first subqueue). However, in embodiments of the invention wherein the first and the second subqueues have the same number of slots, maintaining a direct inverse relationship between the resolution and the range of the second subqueue allows for an “effective” increase in scheduling queue range without an accompanying increase in consumed chip area.
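As an illustrative sketch of the relationship between the two subqueues, the snippet below models each subqueue by its slot count and its distance increment per slot. The class name `Subqueue` and its fields are hypothetical; the 256-slot size is taken from the exemplary embodiment described later in the text, and the factor of sixteen is the one given above.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Subqueue:
    slots: int      # number of queue slots
    increment: int  # distance units represented by each slot

    @property
    def queue_range(self) -> int:
        # Maximum distance covered by the subqueue.
        return self.slots * self.increment

    @property
    def resolution(self) -> Fraction:
        # Resolution is the inverse of the distance increment per slot.
        return Fraction(1, self.increment)


first = Subqueue(slots=256, increment=1)    # higher resolution, shorter range
second = Subqueue(slots=256, increment=16)  # lower resolution, longer range
total_slots = first.slots + second.slots
```

Under these assumptions the pair uses 512 slots in total, the same as the single 512-slot queue, yet the second subqueue reaches a distance of 4096 while the first retains a one-unit increment for nearby flows.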
According to another aspect of the invention, a scheduler for a network processor is provided. The scheduler includes a scheduling queue in which flows are enqueued according to the formula CP+((WF×FS)/SF). In this formula CP is a pointer for indicating a current position in the scheduling queue, WF is a weighting factor associated with a flow appointed for enqueuing, FS is a frame size associated with the flow appointed for enqueuing, and SF is a scaling factor. The scheduling queue includes at least a first subqueue and a second subqueue. The flow appointed for enqueuing is enqueued to the first subqueue if the value of the expression ((WF×FS)/SF) is less than a range of the first subqueue. The flow appointed for enqueuing is enqueued to the second subqueue if the value of the expression ((WF×FS)/SF) is greater than a range of the first subqueue.
Still another aspect of the invention provides for a method of dequeuing a flow from a scheduling queue in a scheduler for a network processor. The method includes searching a first subqueue of the scheduling queue to find a first winning flow in the first subqueue, and determining a first queue distance corresponding to a distance between a current pointer and a slot in which the first winning flow is enqueued. The method further includes searching a second subqueue of the scheduling queue to find a second winning flow in the second subqueue, and determining a second queue distance corresponding to a distance between the current pointer and a slot in which the second winning flow is enqueued. The method further includes comparing the first and second queue distances, and selecting for dequeuing one of the first and second winning flows based on a result of the comparing step.
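The dequeuing method just summarized might be sketched as below. This is a simplified model under stated assumptions: each subqueue is a list whose entries are either a flow ID or `None`, the function names `find_winner` and `select_for_dequeue` are hypothetical, the second subqueue's slot distance is converted to common distance units by multiplying by an assumed ratio of 16, and a tie is resolved in favor of the second (lower resolution) subqueue, consistent with the preference mentioned in the detailed description.

```python
from typing import Optional, Sequence, Tuple


def find_winner(slots: Sequence[Optional[str]],
                start: int) -> Optional[Tuple[str, int]]:
    """Search the ring from `start`; return (flow_id, slot_distance)
    for the first occupied slot, or None if the subqueue is empty."""
    n = len(slots)
    for d in range(n):
        flow = slots[(start + d) % n]
        if flow is not None:
            return flow, d
    return None


def select_for_dequeue(first: Sequence[Optional[str]], first_start: int,
                       second: Sequence[Optional[str]], second_start: int,
                       second_increment: int = 16) -> Optional[str]:
    """Pick the winning flow across both subqueues by comparing distances."""
    w1 = find_winner(first, first_start)
    w2 = find_winner(second, second_start)
    if w1 is None and w2 is None:
        return None
    if w1 is None:
        return w2[0]
    if w2 is None:
        return w1[0]
    d1 = w1[1]                     # first subqueue: one unit per slot
    d2 = w2[1] * second_increment  # assumed conversion to the same units
    # Assumed tie rule: the second (lower resolution) subqueue wins a tie.
    return w1[0] if d1 < d2 else w2[0]
```

The two `find_winner` calls are independent of each other, so a hardware implementation could carry them out at the same time; the sequential calls here are only a convenience of the sketch.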
By configuring a scheduling queue as two or more subqueues having different ranges and resolutions, the present invention allows the scheduling queue to offer an enhanced range, without adding to the total number of queue slots, while providing much of the benefit of high resolution.
Other objects, features and advantages of the present invention will become more fully apparent from the following detailed description of exemplary embodiments, the appended claims and the accompanying drawings.
Exemplary embodiments of the invention will now be described with reference to
It will be noted that the total number of slots of the two subqueues 52 and 54 is 512, which is the same as the number of slots of the prior art scheduling queue 42 shown in
Although the scheduling queue 50 is shown to have only two subqueues in the example of
The present invention also contemplates that a single pointer may be shared by two or more subqueues to indicate the current positions in the subqueues. In the example illustrated in
It is then determined, at block 64, whether the resulting value is less than the range of the higher resolution subqueue 52. If that is the case, then, as indicated by block 66, the flow is attached to the higher resolution subqueue 52 at the indicated slot (i.e., at the indicated distance from the slot number indicated by the 8 least significant bits of the 12-bit current pointer).
However, if at block 64 it is found that the enqueuement distance is greater than or equal to the range of the higher resolution subqueue 52, then block 67 follows block 64. At block 67 the enqueuement distance is divided by 16 (e.g., the enqueuement distance is “scaled”) to reflect the decreased resolution of the lower resolution subqueue 54. Next, at block 68, the flow is attached to the lower resolution subqueue 54 at a slot represented by the sum (e.g., using modulo 256 in an embodiment in which the lower resolution subqueue 54 employs 256 slots) of the scaled enqueuement distance plus the value indicated by the 8 most significant bits of the 12-bit current pointer, plus 1. The addition of 1 to the sum of the scaled enqueuement distance and the current pointer value is performed to take into account a preference that, as will be seen, is accorded to the lower resolution subqueue 54 during dequeuement in case of a tie with the higher resolution subqueue 52.
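The two enqueuing branches described above (blocks 64 through 68) can be sketched as follows. The function name `enqueue`, the parameter names, the string labels, and the rejection of distances at or beyond the lower resolution subqueue's range are assumptions for illustration; the 12-bit current pointer, the use of its 8 least and 8 most significant bits, the division by 16, the added 1, and the modulo-256 wrap follow the text. The modulo-256 wrap in the higher resolution subqueue is assumed by analogy.

```python
HIGH_RES_SLOTS = 256                 # higher resolution subqueue 52
LOW_RES_SLOTS = 256                  # lower resolution subqueue 54
RATIO = 16                           # resolution/range ratio between the two
HIGH_RES_RANGE = HIGH_RES_SLOTS      # 256 distance units
LOW_RES_RANGE = LOW_RES_SLOTS * RATIO  # 4096 distance units


def enqueue(cp12: int, wf: int, fs: int, sf_shift: int):
    """Return (subqueue_label, slot) for a flow, given a 12-bit current pointer."""
    distance = (wf * fs) >> sf_shift  # (WF * FS) / SF with SF = 2**sf_shift

    if distance < HIGH_RES_RANGE:
        # Block 66: offset from the 8 least significant bits of the pointer.
        slot = ((cp12 & 0xFF) + distance) % HIGH_RES_SLOTS
        return "high", slot

    # Assumption: upper-bound handling is not described in the text, so this
    # sketch simply rejects distances beyond the lower resolution range.
    if distance >= LOW_RES_RANGE:
        raise ValueError("enqueue distance exceeds the range of the queue")

    # Block 67: scale the distance to the coarser slot size.
    scaled = distance // RATIO
    # Block 68: offset from the 8 most significant bits of the pointer, plus 1
    # to account for the tie preference given to this subqueue on dequeue.
    slot = ((cp12 >> 4) + scaled + 1) % LOW_RES_SLOTS
    return "low", slot
```

For example, with a current pointer of 0x123 and a distance of 512, the flow goes to the lower resolution subqueue at slot 18 + 32 + 1 = 51, whereas a distance of 100 places it in the higher resolution subqueue at slot 35 + 100 = 135.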
If there is no tie, then block 78 follows block 74. At block 78 the winning flow is detached and serviced. (It will be understood that the “winning flow” is the flow that is closest to the head of the queue.)
Following block 76 or 78, as the case may be, is block 80, at which the position of the current pointer is updated to the position of the slot from which the winning flow was detached. (It should be noted that the process described in connection with
The processes of
The invention also reduces the search time required to examine the scheduling queue for entries, and is scalable to larger designs. Both subqueues can be searched at the same time since they are physically separate and are controlled through the same current pointer. Accordingly, adding more subqueues can improve performance when searching for entries in the scheduling queue, because more of the search is carried out in parallel.
The foregoing description discloses only exemplary embodiments of the invention; modifications of the above disclosed apparatus and methods which fall within the scope of the invention will be readily apparent to those of ordinary skill in the art. For example, although in the above description the ratio of the respective ranges and resolutions of the subqueues was selected to be 16, other ratios may be selected, including 2, 4, 8, 32 or 64. It also is not necessary that the ratio be a power of 2. That is, no specific relationship between subqueue resolution and range is required, and a resolution of a subqueue may be decreased independently of an amount by which a range of the subqueue is increased. As noted before, more than two subqueues having different ranges and resolutions may be included in the inventive scheduling queue.
Moreover, in the above description, the inventive scheduling queue has been implemented in a separate scheduler chip associated with a network processor. However, it is also contemplated to implement the inventive scheduling queue in a scheduler circuit that is implemented as part of a data flow chip or as part of a processor chip in a network processor.
Accordingly, while the present invention has been disclosed in connection with exemplary embodiments thereof, it should be understood that other embodiments may fall within the spirit and scope of the invention as defined by the following claims.
The present application is a continuation of and claims priority from U.S. patent application Ser. No. 10/016,518, filed Nov. 1, 2001 which is hereby incorporated by reference herein in its entirety. The present application is related to the following U.S. patent applications, each of which is hereby incorporated by reference herein in its entirety: U.S. patent application Ser. No. 10/015,994, filed Nov. 1, 2001, titled “WEIGHTED FAIR QUEUE SERVING PLURAL OUTPUT PORTS” (IBM Docket No. ROC920010200US1); U.S. patent application Ser. No. 10/015,760, filed Nov. 1, 2001, titled “WEIGHTED FAIR QUEUE HAVING ADJUSTABLE SCALING FACTOR” (IBM Docket No. ROC920010201US1); U.S. patent application Ser. No. 10/002,085, filed Nov. 1, 2001, titled “EMPTY INDICATORS FOR WEIGHTED FAIR QUEUES” (IBM Docket No. ROC920010202US1); U.S. patent application Ser. No. 10/004,373, filed Nov. 1, 2001, titled “QoS SCHEDULER AND METHOD FOR IMPLEMENTING PEAK SERVICE DISTANCE USING NEXT PEAK SERVICE TIME VIOLATED INDICATION” (IBM Docket No. ROC920010203US1); U.S. patent application Ser. No. 10/002,416, filed Nov. 1, 2001, titled “QoS SCHEDULER AND METHOD FOR IMPLEMENTING QUALITY OF SERVICE WITH AGING STAMPS” (IBM Docket No. ROC920010204US1); U.S. patent application Ser. No. 10/004,440, filed Nov. 1, 2001, titled “QoS SCHEDULER AND METHOD FOR IMPLEMENTING QUALITY OF SERVICE WITH CACHED STATUS ARRAY” (IBM Docket No. ROC920010205US1); and U.S. patent application Ser. No. 10/004,217, filed Nov. 1, 2001, titled “QoS SCHEDULER AND METHOD FOR IMPLEMENTING QUALITY OF SERVICE ANTICIPATING THE END OF A CHAIN OF FLOWS” (IBM Docket No. ROC920010206US1).
| | Number | Date | Country |
|---|---|---|---|
| Parent | 10016518 | Nov 2001 | US |
| Child | 11679812 | Feb 2007 | US |