Routers are typically positioned throughout an interconnection network, such as a distributed network of processors. Each router may include arbiters which control the transmission flow of data packets from the router. The arbiters may resolve conflicts, such as multiple packets vying for access to the same resource. Many arbiter designs are “locally fair,” such that each source of packets gets equal access to a resource. However, when these locally fair arbiters are combined with any asymmetry in the overall network, “global unfairness,” such as packet sources being underserved, may arise. Such global unfairness may ultimately lead to a loss of overall network performance. The asymmetry that leads to global unfairness may come from the topology of the network, the topology of virtual networks within the network, or the traffic demands on the network.
This technology is directed to age-based arbitration systems and methods.
An aspect of the technology is directed to a method of routing data packets in an interconnection network. The method may include receiving a plurality of data packets at a node in the interconnection network, the node including a set of queues, wherein each received packet of the plurality of data packets includes an age; inputting a set of received data packets of the plurality of data packets to a first queue in the set of queues; and for each data packet in the set of received data packets: replacing the age of the data packet with an injection time corresponding to a local time of the first queue when the data packet was input into the first queue and the age of the data packet; determining an updated age based on the local time of the first queue when the data packet was transmitted from the first queue and the injection time of the data packet; and replacing the injection time of the data packet with the updated age.
Another aspect of the technology is directed to a system of routing data packets in an interconnection network. The system may include a router including at least one arbiter and a set of queues. The router may be configured to: receive a plurality of data packets, wherein each received packet of the plurality of data packets includes an age; input a set of received data packets of the plurality of data packets into a first queue in the set of queues; and for each data packet in the set of received data packets: replace the age of the data packet with an injection time corresponding to a local time of the first queue when the data packet was input into the first queue and the age of the data packet; determine an updated age based on the local time of the first queue when the data packet was transmitted from the first queue and the injection time of the data packet; and replace the injection time of the data packet with the updated age.
In some instances, the updated age may be bounded to a maximum age value.
In some instances, each data packet transmitted from the first queue is sent to an arbiter.
In some instances, an arbiter receives a first data packet transmitted from the first queue and at least one other data packet transmitted from one or more of the other queues of the set of queues. In some examples, the arbiter: determines an oldest data packet from the first data packet and the at least one other data packet; and transmits the oldest data packet.
In some instances, the injection time is calculated by subtracting the age of the data packet from the local time of the first queue when the data packet was input into the first queue.
In some instances, the updated age is determined by subtracting the injection time of the data packet from the local time of the first queue when the data packet was transmitted from the first queue.
In some examples, the updated age is adjusted by an aging rate based on the value of the updated age relative to a maximum age.
In some examples, the aging rate is not adjusted when the updated age is less than half the maximum age.
In some examples, the aging rate is (i) 0.5 when the updated age is between 50% and 75% of the maximum age; and (ii) $2^{-n}$ when the updated age is between $1-2^{-n}$ and $1-2^{-n-1}$ of the maximum age.
The technology described herein is directed to routing data packets in an interconnection network. For example, components, such as processors, on an interconnection network may transmit and receive data packets to and from other components on the interconnection network. To improve the timeliness of the delivery of the data packets across the interconnection network, the data packets may include age data. Routers positioned throughout the interconnection network control the flow of the data packets through the use of aging first-in, first-out (FIFO) queues and age-based arbiters. The age-based arbiters within the routers are configured to prioritize older data packets over newer data packets being pushed from the FIFO queues.
As previously described, locally fair arbiters have shortcomings that result in global unfairness. These shortcomings of locally fair arbiters may be addressed by replacing them with age-based arbiters. Age-based arbiters may use a data packet's “age,” representing the time elapsed since the data was inserted into the network, to prioritize older packets when resolving contention between multiple data packets vying for the same resource, such as the same transmission link, same input, etc.
A shortcoming of existing age-based arbiters is that updating each data packet's age requires careful fine tuning of parameters and/or global coordination between routers to prevent ages from becoming too large. And, because each data packet must carry its age through the network, the number of bits used to store this age is usually at a premium. If the age of a data packet overflows, by exceeding the maximum value represented by the bits used to store the age, then a very large age could become very small. For example, age data represented by 8 bits may store age values between 0 and 255. If the age of the data packet exceeds 255, the age of the data packet may reset back to 0, which can lead to unpredictable behavior, including further delays in delivering the data packet.
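The 8-bit overflow described above can be illustrated with a short sketch. The function names and the saturating alternative are illustrative assumptions, not part of the described system; the sketch only contrasts a wrapping age counter with one bounded to a maximum value.

```python
AGE_BITS = 8                     # assumed field width, matching the 8-bit example
MAX_AGE = (1 << AGE_BITS) - 1    # 255


def increment_wrapping(age: int) -> int:
    """Naive increment: the stored age wraps to 0 once it passes MAX_AGE."""
    return (age + 1) & MAX_AGE


def increment_saturating(age: int) -> int:
    """Bounded increment: the stored age stops at MAX_AGE instead of wrapping."""
    return min(age + 1, MAX_AGE)
```

With the wrapping counter, the oldest packet in the network suddenly looks like the youngest one, which is the priority failure the paragraph describes.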
Further, many system resources may be required to directly monitor the age of all data packets in the interconnection network in order to prevent overflow. This is because packets are typically queued in large memory modules, such as random access memories (RAMs), and the stored data packets cannot be easily modified without significant time delays for reading and rewriting the stored data packets.
To address these issues with known age-based arbiters, aging FIFO queues may be used. Each aging FIFO may be comprised of RAM, such as SRAM, and may contain tens, hundreds, or more data packets. To avoid the need to continuously read and rewrite data packets to update their respective ages, each data packet inserted into an aging FIFO may have its age converted to an “injection time” relative to a local clock associated with the aging FIFO. The data packet may carry its injection time until it is transmitted from the aging FIFO. At that time, the injection time may be converted back to an age, again using the local clock associated with the aging FIFO. By converting the age of the data packet to a time value when it is added to the aging FIFO, the age of the data packet can again be determined when, or before, it is transmitted from the aging FIFO, thereby avoiding the need to continuously update the age of the data packet as it traverses the FIFO.
Nodes may include any computing resource capable of communicating with other computing resources. For instance, nodes may include computers, servers, mobile devices, processors, cores, routers, memory, cards (e.g., accelerator cards), FIFOs or other such queues, etc. In some instances, nodes may include a collection of computing resources, such as processors with routers and FIFOs, etc.
As further shown in
Although
Each aging FIFO 210, 211 may transmit data packets to an arbiter 220, as illustrated by arrows 214 and 215, respectively. The arbiter 220 may determine the oldest data packet, based on the age of each data packet, and transmit that data packet to another node 230, as illustrated by arrow 231.
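The arbiter's choice can be sketched as a selection over the head packets offered by each aging FIFO. This is a minimal illustration: the `Packet` type, the `arbitrate` function, and the tie-breaking rule (lowest input index wins) are assumptions introduced here, not details given in the description.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Packet:
    age: int              # age carried by the packet when it reaches the arbiter
    payload: bytes = b""  # hypothetical data field


def arbitrate(heads: Sequence[Optional[Packet]]) -> Optional[int]:
    """Return the index of the input offering the oldest packet, or None if all are idle."""
    best: Optional[int] = None
    for i, pkt in enumerate(heads):
        if pkt is None:
            continue  # this input has nothing to send in this cycle
        if best is None or pkt.age > heads[best].age:
            best = i  # strictly older wins; ties keep the lower index (assumption)
    return best
```

For two inputs offering packets of age 3 and age 7, the sketch selects the second input, so the older packet is forwarded first.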
Routers may be positioned along interconnections and/or within other resources. For instance, router 200 may be positioned within a processor, core, etc. Although router 200 is illustrated with only two aging FIFOs, routers may include any number of FIFOs and arbiters.
As further illustrated in
Data packet 301 is an example data packet, and other data packets having age data may be transmitted over an interconnection network. In this regard, data packets may be of any size, such as 8, 16, 32, 64, 128, etc., bits. Moreover, age information and data information may likewise be represented by any number of bits. For instance, age data may be represented by 8, 16, 32 bits, etc.
Additionally, the position of age data relative to other data within a data packet may be different than shown in
As previously described, the age data of a data packet may be replaced with time data when it is inserted into an aging FIFO. For instance, and as illustrated in
After the data packet with the time data progresses to the last position 419 of the aging FIFO, the data packet 410 may be output by the aging FIFO 411 as illustrated by arrow 451 and input into additional logic 441, as illustrated by arrow 452. Logic 441 may convert the time back to age data representing the current age of the data packet 410 and send the data packet to its next destination, which may be an arbiter, as illustrated by arrow 453. In some instances, the time data of the data packet 410 may be changed before it is output by the aging FIFO 411.
Although
As further illustrated in
For example, a data packet may have a “push_age” of 2 when it is injected into the FIFO at time T=10 (indicated by the local clock). When the data packet is injected into the FIFO, the push_age of 2 is replaced with a push_time (i.e., injection time) having a value of 8: push_time = local_time − push_age = 10 − 2 = 8.
Upon ejection at time T=15, the data packet would have a pop_age of 7: pop_age = local_time − push_time = 15 − 8 = 7.
A property of injection times (i.e., push_times) is that they do not change, even as the packets age. Thus, by converting a data packet's age to an injection time and writing the injection time to the FIFO's RAM instead of the age there is no need to continually update the age of the data packet. When the data packet is read from the RAM, the injection time may be converted back to an age using the current value of the local clock. This conversion from age to time and back to age is all done within the aging FIFO, so the local clock does not require any synchronization with the local clocks of other aging FIFOs within the network. Further, this local clock approach avoids any kind of global synchronization that may be required in other implementations.
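The age-time-age conversion can be sketched as two small functions. The names `age_to_push_time` and `push_time_to_age`, the 8-bit width, and the use of modular (wrapping) subtraction are assumptions made for illustration.

```python
M = 8                  # assumed bit width of the age and time fields
MASK = (1 << M) - 1


def age_to_push_time(local_time: int, push_age: int) -> int:
    """On insertion: replace the packet's age with an injection (push) time."""
    return (local_time - push_age) & MASK


def push_time_to_age(local_time: int, push_time: int) -> int:
    """On removal: convert the stored injection time back to an age."""
    return (local_time - push_time) & MASK


# Worked example from the text: push_age 2 at local time 10, popped at local time 15.
push_time = age_to_push_time(10, 2)
pop_age = push_time_to_age(15, push_time)
```

The stored value is written once on insertion and read once on removal, so the RAM entry is never rewritten while the packet waits in the queue.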
A drawback of age-time-age conversion is that it may be difficult to bound the age of a data packet when it is transmitted from the FIFO. As such, the bit widths required for push and pop times may be larger than provided by the data packet. To address this, the push time of a data packet may be modified so that it is at most a bounded amount below the largest push time of the data packets that arrived at the queue (i.e., aging FIFO) before it. The effect of this modification is expected to be minimal and similar to the priority inversion already encountered within the FIFO. Age-based arbitration prioritizes older data packets or, equivalently, prioritizes data packets with earlier injection times. If a data packet with an earlier push time arrives at the FIFO after a data packet with a later push time, the earlier data packet will be blocked behind the later packet, essentially inheriting its lower priority. This is the priority inversion.
Moreover, if age-based arbitration is working well, data packets may be expected to arrive at each aging FIFO roughly in order of their push time. The net expectation is that enforcing a bounded-decrease push time should not significantly impact the overall performance of the age-based arbitration scheme.
Upon inputting a data packet into an aging FIFO, a sequence of steps, as illustrated below, may be implemented. Initially, the push time of the data packet may be determined. The push time of the data packet may be determined by calculating the aging FIFO's local_time minus the packet's push_age.
If the aging FIFO is empty or the push_time of the data packet is greater than the max_push_time associated with the aging FIFO, then max_push_time is replaced with the value of push_time previously determined.
If the push_time of the data packet is less than max_push_time−T, then the value of push_time is updated to be equal to max_push_time−T, where T is a constant that determines the maximum decrease in the sequence of push times and T>=0.
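The insertion steps can be sketched as follows. This is an illustration under stated assumptions: `max_push_time` is taken to track the largest push time seen so far (so it is raised when a packet's push_time exceeds it), plain unbounded integers are used instead of M-bit wrapping values, and the class and method names are hypothetical.

```python
class AgingFifoPush:
    """Sketch of the push side of an aging FIFO with a bounded decrease in push times."""

    def __init__(self, threshold: int):
        assert threshold >= 0       # T >= 0, as stated in the text
        self.T = threshold
        self.entries = []           # (push_time, payload) pairs in arrival order
        self.max_push_time = None   # undefined while the FIFO is empty

    def push(self, local_time: int, push_age: int, payload=None) -> int:
        # Step 1: convert the carried age into a push (injection) time.
        push_time = local_time - push_age
        # Step 2: track the largest push time seen (assumed direction of comparison).
        if not self.entries or push_time > self.max_push_time:
            self.max_push_time = push_time
        # Step 3: limit how far below max_push_time a push time may fall.
        if push_time < self.max_push_time - self.T:
            push_time = self.max_push_time - self.T
        self.entries.append((push_time, payload))
        return push_time
```

With T=4, a packet of age 20 arriving at local time 11 behind a packet whose push time is 8 has its push time raised from −9 to 4, so it appears younger than it is but never more than T older than the youngest packet ahead of it.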
When implementing the steps described herein, the various times (local_time, push_time, and max_push_time) can all be represented using the same number of bits as the packet ages. M may be considered the bit width of these quantities. In some instances, the M-bit representations can wrap around (i.e., go past the maximum value back to a lower value). In general, this problem may be avoided by comparing differences of times and relying on the properties of the aging FIFO to ensure these differences are always in the range $[0, 2^M)$.
In the next step, the push_time may be compared to the max_push_time. Instead of comparing them directly, the comparison may be transformed such that it is between differences known to be in $[0, 2^M)$:
The left hand side of the final expression (local_time−max_push_time) is the age of the youngest packet already in the aging FIFO. Because the FIFO avoids age overflow, this expression is known to be in $[0, 2^M)$.
In Step 3, the push_time is compared to the max_push_time−T. This comparison can be transformed to:
It follows that the left hand side is in $[T, 2^M+T)$. Given that $T<2^M$, the final expression can be safely evaluated using M+1 bit arithmetic.
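The two transformed comparisons can be written out as follows. This is a reconstruction from the surrounding definitions, assuming push_time = local_time − push_age and that the first comparison tests whether push_time exceeds max_push_time; the exact form of the original expressions may differ.

```latex
\begin{align*}
\text{push\_time} > \text{max\_push\_time}
  &\iff \text{local\_time} - \text{push\_age} > \text{max\_push\_time} \\
  &\iff \text{local\_time} - \text{max\_push\_time} > \text{push\_age} \\[6pt]
\text{push\_time} < \text{max\_push\_time} - T
  &\iff \text{local\_time} - \text{push\_age} < \text{max\_push\_time} - T \\
  &\iff \text{local\_time} - \text{max\_push\_time} + T < \text{push\_age}
\end{align*}
```

In the first comparison both sides are ages in $[0, 2^M)$. In the second, adding $T$ shifts the left hand side into $[T, 2^M + T)$, which is why one extra bit of arithmetic is sufficient.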
In this formulation, T is a threshold parameter that determines the maximum decrease. To prevent ages from overflowing, the value of the data packet at the head of the FIFO is reviewed and the age of the next entry to be popped (i.e., transmitted) is determined. This age is referred to herein as the pop age.
The local time may then be incremented each cycle that the pop_age plus T is less than the maximum age. By doing so, no age of a queued packet can overflow, because the bounded decrease ensures that no packet in the queue has an age more than T greater than the age of the head packet. Thus, all times may be represented as M-bit quantities, ignoring wrapping, because the pop ages will be within $[0, 2^M)$.
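The clock gating rule can be sketched as a single update function. The function name, the use of plain (non-wrapping) integers, and the choice to let the clock tick freely when the FIFO is empty are assumptions made for this illustration.

```python
from typing import Optional


def next_local_time(local_time: int, head_push_time: Optional[int],
                    T: int, max_age: int) -> int:
    """Advance the local clock by one cycle unless doing so risks an age overflow."""
    if head_push_time is None:
        return local_time + 1                 # empty FIFO: nothing can overflow (assumption)
    pop_age = local_time - head_push_time     # age of the next packet to be popped
    if pop_age + T < max_age:
        return local_time + 1                 # safe: no queued age can exceed max_age
    return local_time                         # stall the clock; queued packets stop aging
```

Once the head packet's age reaches max_age − T the clock stalls, so every other queued packet, which is at most T older than the head, stays at or below max_age.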
To further extend the range of representable ages, precision may be reduced as age increases, through age slowing. In age slowing, the aging rate is reduced as data packets approach the maximum age. For instance, the aging rate of a data packet may be reduced by factors of two depending on the current age. If the age is less than half the maximum, the aging rate is not adjusted. If the age is between half and three quarters of the maximum, the aging rate is halved, and so on, as illustrated in Table 1:
For an M-bit age field, age slowing increases the effective maximum age to $M \cdot 2^M$. The aging rate values in Table 1 are merely examples, and other rates may be used.
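The halving schedule can be sketched as a function from the current age to an aging rate. The function name, the treatment of region boundaries (lower bound inclusive, upper bound exclusive), and the rate of zero at the maximum age are assumptions; the description only gives the regions and their rates.

```python
def aging_rate(age: int, max_age: int) -> float:
    """Return 2**-n, where n is the region with 1 - 2**-n <= age/max_age < 1 - 2**-(n+1)."""
    remaining = max_age - age       # distance from the maximum age
    if remaining <= 0:
        return 0.0                  # at the maximum: stop aging (assumption)
    n = 0
    # Each time the remaining distance fits into half of the previous region, halve the rate.
    while (remaining << (n + 1)) <= max_age:
        n += 1
    return 2.0 ** -n
```

For a maximum age of 256, ages below 128 age at the full rate, ages from 128 to 191 age at half rate, and ages from 192 to 223 age at a quarter rate.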
In some instances, the number of bits needed to store the age of a data packet may be determined using the following equation:
Equation 1: age when leaving network = (number of queue entries) / bandwidth
As shown in block 703, a set of received data packets of the plurality of data packets may be inserted into a first queue in the set of queues.
As shown in block 705, for each data packet in the set of received data packets:
Unless otherwise stated, the foregoing alternative examples are not mutually exclusive, but may be implemented in various combinations to achieve unique advantages. As these and other variations and combinations of the features discussed above can be utilized without departing from the subject matter defined by the claims, the foregoing description of the embodiments should be taken by way of illustration rather than by way of limitation of the subject matter defined by the claims. In addition, the provision of the examples described herein, as well as clauses phrased as “such as,” “including,” and the like, should not be interpreted as limiting the subject matter of the claims to the specific examples; rather, the examples are intended to illustrate only one of many possible embodiments. Further, the same reference numbers in different drawings can identify the same or similar elements.
This application claims the benefit of the filing date of U.S. Provisional Patent Application No. 63/314,067, filed on Feb. 25, 2022, the disclosure of which is hereby incorporated herein by reference.
Number | Date | Country | |
---|---|---|---|
20230275844 A1 | Aug 2023 | US |
Number | Date | Country | |
---|---|---|---|
63314067 | Feb 2022 | US |