The present invention is concerned with data and storage communication systems and is more particularly concerned with a scheduler component of a network processor.
Data and storage communication networks are in widespread use. In many data and storage communication networks, data packet switching is employed to route data packets or frames from point to point between source and destination, and network processors are employed to handle transmission of data into and out of data switches.
The network processor 10 includes data flow chips 12 and 14. The first data flow chip 12 is connected to a data switch 15 (shown in phantom) via first switch ports 16, and is connected to a data network 17 (shown in phantom) via first network ports 18. The first data flow chip 12 is positioned on the ingress side of the switch 15 and handles data frames that are inbound to the switch 15.
The second data flow chip 14 is connected to the switch 15 via second switch ports 20 and is connected to the data network 17 via second network ports 22. The second data flow chip 14 is positioned on the egress side of the switch 15 and handles data frames that are outbound from the switch 15.
As shown in
The network processor 10 also includes a first processor chip 28 coupled to the first data flow chip 12. The first processor chip 28 supervises operation of the first data flow chip 12 and may include multiple processors. A second processor chip 30 is coupled to the second data flow chip 14, supervises operation of the second data flow chip 14 and may include multiple processors.
A control signal path 32 couples an output terminal of second data flow chip 14 to an input terminal of first data flow chip 12 (e.g., to allow transmission of data frames therebetween).
The network processor 10 further includes a first scheduler chip 34 coupled to the first data flow chip 12. The first scheduler chip 34 manages the sequence in which inbound data frames are transmitted to the switch 15 via first switch ports 16. A first memory 36 such as a fast SRAM is coupled to the first scheduler chip 34 (e.g., for storing data frame pointers and flow control information as described further below). The first memory 36 may be, for example, a QDR (quad data rate) SRAM.
A second scheduler chip 38 is coupled to the second data flow chip 14. The second scheduler chip 38 manages the sequence in which data frames are output from the second network ports 22 of the second data flow chip 14. Coupled to the second scheduler chip 38 are at least one and possibly two memories (e.g., fast SRAMs 40) for storing data frame pointers and flow control information. The memories 40 may, like the first memory 36, be QDRs. The additional memory 40 on the egress side of the network processor 10 may be needed because of a larger number of flows output through the second network ports 22 than through the first switch ports 16.
Flows with which the incoming data frames are associated are enqueued in a scheduling queue 42 maintained in the first scheduler chip 34. The scheduling queue 42 defines a sequence in which the flows enqueued therein are to be serviced. The particular scheduling queue 42 of interest in connection with the present invention is a weighted fair queue which arbitrates among flows entitled to a “best effort” or “available bandwidth” Quality of Service (QoS).
As shown in
Although not indicated in
The memory 36 associated with the first scheduler chip 34 holds pointers (“frame pointers”) to locations in the first data buffer 24 corresponding to data frames associated with the flows enqueued in the scheduling queue 42. The memory 36 also stores flow control information, such as information indicative of the QoS to which flows are entitled.
When the scheduling queue 42 indicates that a particular flow enqueued therein is the next to be serviced, reference is made to the frame pointer in the memory 36 corresponding to the first pending data frame for the flow in question and the corresponding frame data is transferred from the first data buffer 24 to an output queue 46 associated with the output port 44.
A more detailed representation of the scheduling queue 42 is shown in
More specifically, the queue slot in which a flow is placed upon enqueuing is calculated according to the formula CP+((WF×FS)/SF), where CP is a pointer (“current pointer”) that indicates a current position (the slot currently being serviced) in the scheduling queue 42; WF is a weighting factor associated with the flow to be enqueued, the weighting factor having been determined on the basis of the QoS to which the flow is entitled; FS is the size of the current frame associated with the flow to be enqueued; and SF is a scaling factor chosen to scale the product (WF×FS) so that the resulting quotient falls within the range defined by the scheduling queue 42. (In accordance with conventional practice, the scaling factor SF is conveniently defined as an integral power of 2, i.e., SF=2^n, with n being a positive integer, so that scaling the product (WF×FS) is performed by right shifting.) With this known weighted fair queuing technique, the weighting factors assigned to the various flows in accordance with the QoS assigned to each flow govern how close to the current pointer of the queue each flow is enqueued. In addition, flows which exhibit larger frame sizes are enqueued farther from the current pointer of the queue, to prevent such flows from appropriating an undue proportion of the available bandwidth of the queue. Upon enqueuement, data that identifies a flow (the “Flow ID”) is stored in the appropriate queue slot 48.
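The slot calculation described above can be illustrated with a minimal sketch. The function name `enqueue_slot`, the parameter `num_slots`, and the wrap-around (modulo) addressing are assumptions made for illustration only; the sketch also does not check whether the distance overruns the range of the queue.

```python
def enqueue_slot(cp: int, wf: int, fs: int, n: int, num_slots: int) -> int:
    """Return the slot for a flow: CP + ((WF x FS) / SF), with SF = 2**n.

    cp        -- current pointer (slot currently being serviced)
    wf        -- weighting factor of the flow
    fs        -- size of the flow's current frame
    n         -- exponent of the scaling factor, SF = 2**n
    num_slots -- number of slots in the queue (hypothetical parameter)
    """
    distance = (wf * fs) >> n          # dividing by 2**n is a right shift by n
    # Assumption: the queue is addressed circularly, so the index wraps.
    return (cp + distance) % num_slots


# A flow with WF=4 and a 128-byte frame, SF = 2**3, current pointer at slot 10:
slot = enqueue_slot(cp=10, wf=4, fs=128, n=3, num_slots=512)
```

A larger weighting factor or a larger frame increases the product (WF×FS) and therefore places the flow farther from the current pointer.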
In some applications, there may be a wide range of data frame sizes associated with the flows, perhaps on the order of about 64 bytes to 64 KB, or three orders of magnitude. It may also be desirable to assign a large range of weighting factors to the flows so that bandwidth can be sold with a great deal of flexibility and precision. In practice, however, it is difficult to predict at the time of designing or initializing the scheduler chip 34 what will be the characteristics of the data packets handled by the scheduler chip 34. Consequently, it is difficult to anticipate over what range of values the product (WF×FS) will fall during operation of the network processor 10. As a result, the scaling factor SF may be chosen to be a value that is too large or too small. If the value of SF is chosen to be too small, then the enqueuement distance D=((WF×FS)/SF) may overrun the range R of the scheduling queue 42. If this occurs, an error condition may result, or the enqueuement distance D may be reduced to equal the range R of the scheduling queue 42, resulting in a failure to properly perform the desired weighted fair queuing.
If the scaling factor SF is chosen to be too large, then all of the flows to be enqueued may be attached relatively close to the current pointer of the scheduling queue 42. As a result, the full resources of the range of the scheduling queue 42 may not be used, again possibly resulting in a failure to precisely perform the desired weighted fair queuing.
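The two failure modes can be made concrete with a small numeric sketch. The range `R = 512`, the sample flows, and the two exponents are hypothetical values chosen only to show the effect; they are not taken from any particular embodiment.

```python
R = 512  # hypothetical range of the scheduling queue, in slots

# Hypothetical (WF, FS) pairs: a small frame, a mid-size frame, a 64 KB frame.
flows = [(1, 64), (8, 1500), (8, 65536)]


def distances(n: int) -> list[int]:
    """Enqueuement distance D = (WF x FS) / SF for each flow, with SF = 2**n."""
    return [(wf * fs) >> n for wf, fs in flows]


too_small = distances(8)    # SF = 256: the largest flow lands beyond R
too_large = distances(16)   # SF = 65536: every flow lands next to the pointer

overruns = [d for d in too_small if d > R]
```

With the smaller scaling factor the 64 KB frame yields a distance of 2048, four times the assumed range; with the larger scaling factor all three distances fall between 0 and 8, so almost none of the range is used to differentiate the flows.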
It would accordingly be desirable to overcome the potential drawbacks of setting the scaling factor SF either too low or too high.
According to an aspect of the invention, a scheduler for a network processor is provided. The scheduler includes a scheduling queue in which weighted fair queuing is applied. The scheduling queue has a range R. Flows are attached to the scheduling queue at a distance D from a current pointer for the scheduling queue. The distance D is calculated for each flow according to the formula D=((WF×FS)/SF), where WF is a weighting factor applicable to a respective flow; FS is a frame size attributable to the respective flow; and SF is a scaling factor. The scaling factor SF is adjusted depending on a result of comparing the distance D to the range R.
In at least one embodiment, the scaling factor SF may be increased if D is greater than R. For example, the scaling factor SF may be increased if D exceeds R in regard to a predetermined number of calculations of D.
In one or more embodiments, the scaling factor SF may be decreased if D is less than R/2. For example, the scaling factor SF may be decreased if D is less than one-half R in regard to a predetermined number of calculations of D.
In some embodiments, the scaling factor SF may equal 2^n, where n is a positive integer. For example, n may be incremented to increase SF, or may be decremented to decrease SF.
According to another aspect of the invention, a method of managing a scheduling queue in a scheduler for a network processor is provided. The scheduling queue has a range R. Flows are attached to the scheduling queue at a distance D from a current pointer for the scheduling queue, the distance D being calculated for each flow according to the formula D=((WF×FS)/SF), where WF is a weighting factor applicable to a respective flow, FS is a frame size attributable to the respective flow, and SF is a scaling factor. The method includes calculating the distance D with respect to a particular flow to be enqueued, comparing the distance D to the range R, and adjusting the scaling factor SF based on a result of the comparing step.
In a scheduler provided in accordance with the invention, an initial value at which the scaling factor SF is set may be adjusted adaptively during operation of the scheduler to reflect actual experience with data handled by the scheduler, so that the scaling factor SF assumes a value that is suitable for using the range R of the scheduling queue and/or such that the enqueuement distance D does not overrun the range R of the scheduling queue.
Other objects, features and advantages of the present invention will become more fully apparent from the following detailed description of exemplary embodiments, the appended claims and the accompanying drawings.
Adjustment of a scaling factor SF of a scheduler in accordance with the invention will now be described, initially with reference to
Initially in
Following block 50 is decision block 52. In decision block 52 it is determined whether the enqueuement distance D exceeded (overran) the range R of the scheduling queue 42. If not, the procedure of
However, if it is determined at decision block 52 that the enqueuement distance D overran the range R of the scheduling queue 42, then block 56 follows decision block 52. At block 56, a value of the counter C0 (
Following block 56 is decision block 58. At decision block 58, it is determined whether the incremented counter value exceeds a predetermined threshold. This threshold (and other thresholds discussed below) can be set in a variety of ways. For example, the threshold can be determined by software if the software has information concerning the flows/frames to be handled. If so, the scaling factor SF can be set accurately based on the flows/frames that are expected. The software would then set the threshold to handle flows that misbehave. For example, if it is not desired to tolerate an occasional frame that causes the enqueuement distance D to exceed the range R, then the threshold may be set to zero. If system requirements allow some misbehaving flows to be tolerated, then the threshold may be set higher.
If the software has no information concerning the flows/frames that are to be handled, then an arbitrary value for the initial value of the scaling factor SF can be chosen, and the threshold can be set so that the scaling factor SF is increased rapidly if the range R of the scheduling queue 42 is exceeded. (A threshold for decreasing the scaling factor SF, to be discussed below, may be set so that the scaling factor SF is decreased slowly if the flows are all being scheduled in the lower part of the scheduling queue 42.) These threshold values would allow the system to quickly adapt to unknown input.
If a negative determination is made at decision block 58 (i.e., the incremented counter value does not exceed the predetermined threshold), the procedure returns (block 54). However, if it is determined at decision block 58 that the predetermined threshold is exceeded by the incremented counter value, then block 60 follows decision block 58.
At block 60 the value of the scaling factor SF is increased. This may be done in a number of ways. For example, if the scaling factor SF is expressed as an integral power of 2 (i.e., 2^n), then the scaling factor SF may be doubled by incrementing the value of n (e.g., via a left shifting operation as previously described, such as left shifting a register (not shown) in which the scaling factor is stored). It is contemplated, alternatively, to increase SF by a factor other than two.
Following block 60 is block 62 at which the counter C0 is reset. The procedure of
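The increase-only procedure of blocks 50 through 62 may be sketched as follows. The class name `IncreaseOnlyScaler`, its attribute names, and the choice to double SF by incrementing n are illustrative assumptions; the content of block 50 is assumed to be the calculation of D for the flow being enqueued, and the handling of the overrunning flow itself is not modeled.

```python
class IncreaseOnlyScaler:
    """Sketch: raise SF = 2**n once D has overrun R more than `threshold` times."""

    def __init__(self, n: int, queue_range: int, threshold: int) -> None:
        self.n = n                      # exponent of the scaling factor
        self.queue_range = queue_range  # range R of the scheduling queue
        self.threshold = threshold      # predetermined threshold
        self.c0 = 0                     # counter C0

    def on_enqueue(self, wf: int, fs: int) -> int:
        d = (wf * fs) >> self.n             # block 50 (assumed): calculate D
        if d > self.queue_range:            # decision block 52: D overran R?
            self.c0 += 1                    # block 56: increment C0
            if self.c0 > self.threshold:    # decision block 58
                self.n += 1                 # block 60: double SF
                self.c0 = 0                 # block 62: reset C0
        return d                            # block 54: return


scaler = IncreaseOnlyScaler(n=8, queue_range=512, threshold=1)
for _ in range(5):
    scaler.on_enqueue(wf=8, fs=65536)   # a hypothetical heavy flow
```

With a threshold of zero, the first overrun already triggers an increase, which matches the case in which no overrunning frame is to be tolerated.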
It will be appreciated that the procedure of
The procedure of
At block 74 a value of the counter C0 is incremented. Following block 74 is decision block 76 at which it is determined whether the incremented counter value is greater than a predetermined threshold. If not, the procedure of
At block 80 the value of the scaling factor SF is decreased. The decreasing of the value of the scaling factor SF may occur in a number of ways. For example, if the scaling factor SF is expressed as a power of 2 (i.e., 2^n) then the scaling factor SF may be halved by decrementing n (e.g., by right shifting a register (not shown) in which the scaling factor is stored). It is contemplated, alternatively, to decrease the scaling factor SF by a factor other than two.
Following block 80 is block 82, at which the counter C0 is reset. The procedure of
Considering again decision block 72, if it is determined at that decision block that the enqueuement distance D is not less than one-half the range R of the scheduling queue 42, then block 84 follows decision block 72. At block 84 the counter C0 is reset, and the procedure of
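The decrease-only procedure of blocks 72 through 84 may be sketched as follows. The class and attribute names are illustrative assumptions, as is the lower bound of 1 on n (chosen because n is described as a positive integer). Because the counter is reset at block 84 whenever D is not less than one-half R, the sketch decreases SF only after a run of consecutive enqueuements in the lower half of the queue.

```python
class DecreaseOnlyScaler:
    """Sketch: halve SF = 2**n after more than `threshold` consecutive D < R/2."""

    def __init__(self, n: int, queue_range: int, threshold: int) -> None:
        self.n = n                      # exponent of the scaling factor
        self.queue_range = queue_range  # range R of the scheduling queue
        self.threshold = threshold      # predetermined threshold
        self.c0 = 0                     # counter C0

    def on_enqueue(self, wf: int, fs: int) -> int:
        d = (wf * fs) >> self.n              # calculate D for this flow
        if 2 * d < self.queue_range:         # decision block 72: D < R/2?
            self.c0 += 1                     # block 74: increment C0
            if self.c0 > self.threshold:     # decision block 76
                # Block 80: halve SF. The floor of 1 is an assumption.
                self.n = max(1, self.n - 1)
                self.c0 = 0                  # block 82: reset C0
        else:
            self.c0 = 0                      # block 84: reset C0
        return d


scaler = DecreaseOnlyScaler(n=12, queue_range=512, threshold=2)
for _ in range(3):
    scaler.on_enqueue(wf=8, fs=1500)    # hypothetical flow landing near the pointer
```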
With the procedure of
Initially in the procedure of
Following block 100 is block 102. At block 102 the first counter C1 is reset. The second counter C2 (
Considering again decision block 92, if it is determined at decision block 92 that the enqueuement distance D is not greater than the range R of the scheduling queue 42, then decision block 104 (
At block 106, the value of the second counter C2 is incremented. Following block 106 is decision block 108, at which it is determined whether the value of the second counter C2 is greater than a second threshold. If not, the procedure returns (block 98). However, if it is determined at decision block 108 that the value of the second counter C2 is greater than the second threshold, then block 110 follows decision block 108. At block 110 the value of the scaling factor SF is decreased. This may be done, for example, by decrementing n where SF is expressed as 2^n, or by any other technique.
Following block 110 is block 112. At block 112 the first and second counters C1, C2 are reset. The procedure then returns (block 98).
Considering again decision block 104, if it is determined at decision block 104 that the enqueuement distance D is not less than one-half the range R of the scheduling queue 42, then block 114 follows decision block 104. At block 114 the second counter C2 is reset. The procedure of
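The combined procedure, in which SF may be either increased or decreased, may be sketched as follows. The class name, the threshold parameter names, and the floor of 1 on n are illustrative assumptions. The steps on the increase path between decision block 92 and block 100 are assumed to mirror the increase-only procedure. At block 102 the sketch resets only the first counter C1 and leaves C2 unchanged; that treatment of C2 is an assumption, not a statement of the disclosed procedure.

```python
class AdaptiveScaler:
    """Sketch: adjust SF = 2**n up or down using counters C1 and C2."""

    def __init__(self, n: int, queue_range: int,
                 up_threshold: int, down_threshold: int) -> None:
        self.n = n
        self.queue_range = queue_range        # range R
        self.up_threshold = up_threshold      # first threshold
        self.down_threshold = down_threshold  # second threshold
        self.c1 = 0                           # counts overruns (D > R)
        self.c2 = 0                           # counts consecutive D < R/2

    def on_enqueue(self, wf: int, fs: int) -> int:
        d = (wf * fs) >> self.n
        if d > self.queue_range:                  # decision block 92
            self.c1 += 1                          # assumed increment step
            if self.c1 > self.up_threshold:       # assumed threshold test
                self.n += 1                       # block 100: increase SF
                self.c1 = 0                       # block 102: reset C1
        elif 2 * d < self.queue_range:            # decision block 104
            self.c2 += 1                          # block 106
            if self.c2 > self.down_threshold:     # decision block 108
                self.n = max(1, self.n - 1)       # block 110 (floor assumed)
                self.c1 = 0                       # block 112: reset C1 and C2
                self.c2 = 0
        else:
            self.c2 = 0                           # block 114: reset C2
        return d                                  # block 98: return


# Increase quickly (threshold 0), decrease slowly (threshold 3).
scaler = AdaptiveScaler(n=8, queue_range=512, up_threshold=0, down_threshold=3)
```

Choosing a low first threshold and a higher second threshold gives the asymmetric behavior described earlier for unknown input: SF rises as soon as the range is exceeded and falls only after sustained underuse.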
In one embodiment of the procedure of
In the procedure of
A scheduler configured in accordance with the present invention can also adapt to changes in a stream of data by increasing or decreasing the scaling factor SF as the situation requires. Thus the scheduler may, for example, increase the scaling factor SF during an initial period of operation, then may decrease the scaling factor SF in response to a change in the pattern of data traffic, and further may increase the scaling factor SF again in response to another change in the pattern of data traffic.
Noting again that plural scheduling queues (e.g., 64) may be maintained in the inventive scheduler, it should be understood that respective scaling factors SF of the scheduling queues are advantageously to be adjusted independently of one another. Consequently, in a typical situation in accordance with the invention, different values of scaling factors are applicable to different scheduling queues at any given time.
The processes of
The foregoing description discloses only exemplary embodiments of the invention; modifications of the above disclosed apparatus and methods which fall within the scope of the invention will be readily apparent to those of ordinary skill in the art. According to one alternative embodiment, a scheduling queue may have plural subqueues of different ranges and resolutions, according to an invention disclosed in co-pending patent application Ser. No. 10/016,518, filed Nov. 1, 2001 (Attorney Docket No. ROC920010199US1). This co-pending patent application is incorporated herein by reference.
Moreover, in the above description, the invention has been implemented in a separate scheduler chip associated with a network processor. However, it is also contemplated to implement the invention in a scheduler circuit that is implemented as part of a data flow chip or as part of a processor chip.
Furthermore, in accordance with above-disclosed embodiments of the invention, reduction of the scaling factor SF has been triggered by underutilization of the range of the scheduling queue, where underutilization has been effectively defined as attaching flows repeatedly in the lower half of the scheduling queue. It is alternatively contemplated, however, to define underutilization of the range of the scheduling queue in other ways. For example, underutilization may be deemed to have occurred upon repeated attachment of flows in the lower third or lower quarter of the scheduling queue.
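The alternative definitions of underutilization can be expressed by parameterizing the comparison. The function name `is_underutilized` and the `divisor` parameter are hypothetical; a divisor of 2 corresponds to the lower half of the queue, 3 to the lower third, and 4 to the lower quarter.

```python
def is_underutilized(d: int, queue_range: int, divisor: int = 2) -> bool:
    """Return True if distance D falls in the lower 1/divisor of range R.

    The test D < R/divisor is written as D * divisor < R so that it stays
    exact in integer arithmetic.
    """
    return d * divisor < queue_range


# A distance of 200 in a hypothetical 512-slot queue:
in_lower_half = is_underutilized(200, 512)                # lower half
in_lower_third = is_underutilized(200, 512, divisor=3)    # lower third
```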
Accordingly, while the present invention has been disclosed in connection with exemplary embodiments thereof, it should be understood that other embodiments may fall within the spirit and scope of the invention, as defined by the following claims.
The present application is a continuation of and claims priority to U.S. patent application Ser. No. 10/015,760, filed Nov. 1, 2001, titled “WEIGHTED FAIR QUEUE HAVING ADJUSTABLE SCALING FACTOR,” which is hereby incorporated by reference herein in its entirety. The present application is related to the following U.S. patent applications, each of which is hereby incorporated by reference herein in its entirety: U.S. patent application Ser. No. 10/016,518, filed Nov. 1, 2001, titled “WEIGHTED FAIR QUEUE HAVING EXTENDED EFFECTIVE RANGE” (IBM Docket No. ROC920010199US1); U.S. patent application Ser. No. 10/015,994, filed Nov. 1, 2001, titled “WEIGHTED FAIR QUEUE SERVING PLURAL OUTPUT PORTS” (IBM Docket No. ROC920010200US1); U.S. patent application Ser. No. 10/002,085, filed Nov. 1, 2001, titled “EMPTY INDICATORS FOR WEIGHTED FAIR QUEUES” (IBM Docket No. ROC920010202US1); U.S. patent application Ser. No. 10/004,373, filed Nov. 1, 2001, titled “QoS SCHEDULER AND METHOD FOR IMPLEMENTING PEAK SERVICE DISTANCE USING NEXT PEAK SERVICE TIME VIOLATED INDICATION” now U.S. Pat. No. 6,973,036 issued Dec. 6, 2005 (IBM Docket No. ROC920010203US1); U.S. patent application Ser. No. 10/002,416, filed Nov. 1, 2001, titled “QoS SCHEDULER AND METHOD FOR IMPLEMENTING QUALITY OF SERVICE WITH AGING STAMPS” now U.S. Pat. No. 7,103,051 issued on Sep. 5, 2006 (IBM Docket No. ROC920010204US1); U.S. patent application Ser. No. 10/004,440, filed Nov. 1, 2001, titled “QoS SCHEDULER AND METHOD FOR IMPLEMENTING QUALITY OF SERVICE WITH CACHED STATUS ARRAY” now U.S. Pat. No. 7,046,676 issued on May 16, 2006 (IBM Docket No. ROC920010205US1); and U.S. patent application Ser. No. 10/004,217, filed Nov. 1, 2001, titled “QoS SCHEDULER AND METHOD FOR IMPLEMENTING QUALITY OF SERVICE ANTICIPATING THE END OF A CHAIN OF FLOWS” now U.S. Pat. No. 6,982,986 issued on Jan. 3, 2006 (IBM Docket No. ROC920010206US1).
Relation | Number | Date | Country |
---|---|---|---|
Parent | 10015760 | Nov 2001 | US |
Child | 11862060 | Sep 2007 | US |