The present technology is in the field of computer system design and, more specifically, related to a network packet routing scheme for a network-on-chip (NoC).
Within a network-on-chip (NoC), packets may be transported from an initiator to a target. Networks-on-chip (NoCs) are built by assembling various components such as switches, buffers, network interfaces, adapters, etc.
In general, within a switch, each output port of the switch has an arbiter that determines which packet to output on a given output port. For example, the arbiter can have two packets contending to be output and the arbiter will decide which packet to output. Various approaches resolve this contention in an arbiter. For example, with round robin (RR), each contending packet is assigned a position in a wait list and contending packets are granted access to the output port in that order, with the start of the list always being set just after the last port that was served. Other examples include first in first out (FIFO), priority based, weighted round robin (WRR), etc.
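The round-robin policy described above can be sketched in software as follows. This is a minimal illustrative model only; the class name `RoundRobinArbiter` and its methods are hypothetical and a real arbiter would be hardware logic.

```python
class RoundRobinArbiter:
    """Toy model of a round-robin arbiter for one output port."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.start = 0  # input port currently at the head of the wait list

    def grant(self, requesting):
        """Return the granted input port, or None if no port is requesting.

        `requesting` is a set of input-port indices that have a packet waiting.
        """
        for offset in range(self.num_inputs):
            port = (self.start + offset) % self.num_inputs
            if port in requesting:
                # The wait list restarts just after the port that was served.
                self.start = (port + 1) % self.num_inputs
                return port
        return None
```

With three inputs and ports 0 and 2 always requesting, successive grants alternate between port 0 and port 2.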
NoC designers need the arbiters to resolve contention fairly. In general, arbiters resolve contentions in a fair manner, which means every packet has approximately the same chance to win arbitration, and no packet is ever blocked for an infinite period of time. Even though each arbiter chooses in a fair manner, paths within the NoC that contain many switches and other packet routing components can produce an unfair route. For example, when a path has many switches, the probability of a given packet getting through is lowered, especially on networks with a large amount of data. To help make arbitration less sensitive to the depth of the switch cascade that packets go through, solutions can include regulation at the initiator, dynamic priority of packets based on obtained bandwidth versus a set goal, and other mechanisms such as pressure through the cascade. However, these mechanisms typically do not work well in situations where cascades are very deep, such as in mesh-type topologies where packets need to go through many nodes (e.g., switches with arbiters) between an initiator and a target. Furthermore, these mechanisms typically focus on request arbitration and do not address the situation of corresponding response packet arbitration.
For the NoC to run efficiently, the time a packet takes to be routed through the network needs to be managed within an acceptable standard. For example, an acceptable standard could be a certain level of quality of service for the packet sender. Therefore, what is needed is an arbitration scheme that allows packets to be delivered to a target within a certain time interval.
In accordance with various embodiments and aspects of the invention, an arbitration scheme is disclosed that allows packets to be delivered to a target within a certain time interval or a deadline. The disclosed systems and methods allow packets to be routed or delivered to a target within a certain time interval or by a certain deadline. According to one or more aspects and embodiments of the invention, a deadline is created for a packet to be delivered; the packet is created that includes the deadline. The deadline may be part of the packet header. The packet is routed within a network-on-chip (NoC) using the deadline. For example, when two packets are contending for an output port, the deadline may be used to arbitrate which packet is to progress. For another example, when two packets are contending for an output port, the packet with the smallest deadline is the packet that will progress. The deadline may be adjusted at each component the packet passes through. For example, the deadline may be given a fixed value at the beginning and the deadline decremented at each component the packet passes through.
According to one or more aspects and embodiments of the invention, when a packet is created, a deadline is calculated based on desired routing and the deadline is set as a field in the header. As the packet is routed through the NoC, the deadline is decremented when a packet is not selected by an arbiter to progress. At an arbiter, the packet with the smallest deadline is chosen to progress.
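As a hedged illustration of the behavior just described, the following sketch grants the packet with the smallest deadline and decrements the deadline of each packet that is not selected. The names `Packet`, `arbitrate`, and `wait_penalty` are assumptions made for this example, as is the fixed decrement of one unit per lost round.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    name: str
    deadline: int  # remaining time budget, carried in the packet header


def arbitrate(contenders, wait_penalty=1):
    """Grant the smallest-deadline packet and age the packets that lose.

    `wait_penalty` is a hypothetical fixed decrement per lost round.
    Ties go to the first packet in the list; a real arbiter may use a
    secondary scheme instead.
    """
    winner = min(contenders, key=lambda p: p.deadline)
    for p in contenders:
        if p is not winner:
            p.deadline -= wait_penalty
    return winner
```

Because losing packets see their deadline shrink, a packet that keeps losing eventually holds the smallest deadline and is selected.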
According to one or more aspects and embodiments of the invention, a packet can have a deadline and a priority. A combination of deadline and/or priority may be used to determine a packet to progress when multiple packets are contending for an output port. According to one or more aspects and embodiments of the invention, when two packets are contending for an output port and both packets have the same priority, the packet with the smallest deadline is the packet that will progress. According to one or more aspects and embodiments of the invention, when two packets are contending for an output port and both packets have the same deadline, the packet with the highest priority is the packet that will progress.
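One possible way to combine the two fields is sketched below. The text above fixes the outcome only when the priorities are equal or when the deadlines are equal; comparing the deadline first and using priority as the tie-breaker is an assumption of this example, as are the names `Header` and `pick` and the convention that a larger priority number is more urgent.

```python
from typing import NamedTuple


class Header(NamedTuple):
    name: str
    deadline: int
    priority: int  # larger means more urgent (assumed encoding)


def pick(contenders):
    """Smallest deadline first; highest priority breaks deadline ties."""
    return min(contenders, key=lambda h: (h.deadline, -h.priority))
```

Swapping the two elements of the sort key would give the other ordering, in which priority dominates and the deadline breaks ties; both orderings agree with the two cases stated above.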
According to one or more aspects and embodiments of the invention, when two or more packets have the same deadline and are contending for an output port, an arbitration scheme is used to determine the packet to progress. According to one or more aspects and embodiments of the invention, a deadline may be set to a reserved value and arbitration may depend on the reserved value. For example, a packet with a deadline that is a reserved value may lose arbitration to another packet whose deadline is not a reserved value. According to one or more aspects and embodiments of the invention, packets may include packets that have a deadline and packets that do not have a deadline. Routing may be based on whether a packet has a deadline. For example, packets with a deadline may win arbitration over a packet without a deadline.
According to one or more aspects and embodiments of the invention, one or more request packets correspond to one or more response packets. For example, an initiator sends a request to a target and the target returns a response. A deadline is calculated in such a way as to set a time that a response will be received. When the request is created, the deadline is included. As the request is routed to the destination using the deadline, the deadline may be adjusted by the components the packet passes through. While at the destination, the deadline is adjusted to account for the time to create a response. When the response is created, the deadline is included in the response. As the response is routed to the initiator using the deadline, the deadline is adjusted. According to one or more aspects and embodiments of the invention, the target stores the deadline in a context table and adjusts the deadline in the context table while waiting on a response.
According to one or more aspects and embodiments of the invention, a deadline may be adjusted by a variable amount. For example, the packet with a smaller deadline (e.g., older packet) may have the deadline decremented by a larger value. For another example, packets with a higher priority have the deadline decremented with a larger value compared to packets with a lower priority.
According to one or more aspects and embodiments of the invention, the value of the deadline can be used to monitor network performance. According to one or more aspects and embodiments of the invention, the remaining deadline of packets may be adjusted to achieve a desired network performance.
In order to understand the invention more fully, reference is made to the accompanying drawings. The invention is described in accordance with the aspects and embodiments in the following description with reference to the drawings or figures (FIG.), in which like numbers represent the same or similar elements. Understanding that these drawings are not to be considered limitations in the scope of the invention, the presently described aspects and embodiments and the presently understood best mode of the invention are described with additional detail through use of the accompanying drawings.
The following describes various examples of the present technology that illustrate various aspects and embodiments of the invention. Generally, examples can use the described aspects in any combination. All statements herein reciting principles, aspects, and embodiments as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. The examples provided are intended as non-limiting examples. Additionally, it is intended that such equivalents include both currently known equivalents and equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
It is noted that, as used herein, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise. Reference throughout this specification to “one embodiment,” “an embodiment,” “certain embodiment,” “various embodiments,” or similar language means that a particular aspect, feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention.
Thus, appearances of the phrases “in one embodiment,” “in at least one embodiment,” “in an embodiment,” “in certain embodiments,” and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment or similar embodiments. Furthermore, aspects and embodiments of the invention described herein are merely exemplary, and should not be construed as limiting of the scope or spirit of the invention as appreciated by those of ordinary skill in the art. The disclosed invention is effectively made or used in any embodiment that includes any novel aspect described herein. All statements herein reciting principles, aspects, and embodiments of the invention are intended to encompass both structural and functional equivalents thereof. It is intended that such equivalents include both currently known equivalents and equivalents developed in the future.
The terms “source,” “master,” and “initiator” refer to similar intellectual property (IP) modules/blocks or units; these terms are used interchangeably within the scope and embodiments of the invention. As used herein, the terms “sink,” “slave,” and “target” refer to similar IP modules or units and the terms are used interchangeably within the scope and embodiments of the invention. As used herein, a transaction may be a request transaction or a response transaction. Examples of request transactions include write request and read request.
Various references are made herein to integrated circuits (ICs) and the designs of ICs. One example of an IC is a multiprocessor system that is implemented in systems-on-chip (SoCs) that communicate through networks-on-chip (NoCs). The SoCs include instances of initiator intellectual properties (IPs) and target IPs. Transactions are sent from an initiator to one or more targets using industry-standard protocols. The initiator, connected to the NoC, sends a request transaction to a target or targets, using an address to select the target or targets. The NoC decodes the address and transports the request from the initiator to the target. The target handles the transaction and sends a response transaction, which is transported back by the NoC to the initiator. As such, the SoC and NoC include complexity and configurability, especially in situations where the NoC is configurable.
Referring now to
Referring now to
Referring now to
Even though each arbiter in the path between master 224 and target 222 may be fair, the cascaded combination of switches may not be fair. For example,
Referring now to
Referring now to
Network Interface Units (NIUs) may build and send packets in a NoC. An NIU may compute a deadline for a packet. The computation of the deadline may depend on where the NIU is located. The computation of the deadline may be dependent on the goals in terms of quality of service.
At step 420, a packet is created that includes the deadline. According to one or more aspects and embodiments of the invention, the deadline is added to the packet header. According to one or more aspects and embodiments of the invention, at request packet creation, a deadline is computed and added to the header of the packet. According to one or more aspects and embodiments of the invention, at request packet creation, a deadline is computed and added to the address or part of the payload of the packet.
At step 430, the packet is routed within the NoC using or based on the deadline. According to one or more aspects and embodiments of the invention, when a packet is not chosen to be outputted from a switch, the deadline may be decremented by a value. In other words, when a packet loses arbitration (with respect to another packet), the deadline may be decremented by a value. According to one or more aspects and embodiments of the invention, for the deadline to be decremented, a packet must have not been chosen for a fixed number of clock cycles. According to one or more aspects and embodiments of the invention, the decremented value may be a fixed value. The decremented value may be ε(SWx). The value of ε(SWx) may be unique for each component (e.g., switch) in the NoC. The value of ε(SWx) may represent the time the packet has waited at the arbiter. The value of ε(SWx) may be changed over time. The value of ε(SWx) may depend on the operating frequency (clock domain) of the switch. The value of ε(SWx) may depend on how many clock cycles the switch requires to make an arbitration decision.
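The per-switch decrement can be modeled as in the sketch below, under the assumption that a switch counts the cycles a packet has lost and applies its value ε(SWx) once every fixed number of lost cycles. The names `WaitingPacket`, `on_lost_cycle`, `epsilon`, and `cycles_per_decrement` are hypothetical.

```python
class WaitingPacket:
    """A packet held at an arbiter, with its header deadline."""

    def __init__(self, deadline):
        self.deadline = deadline
        self.cycles_waited = 0  # lost cycles since the last decrement


def on_lost_cycle(pkt, epsilon, cycles_per_decrement):
    """Call once per clock cycle in which `pkt` was not granted.

    epsilon: this switch's decrement value, standing in for ε(SWx).
    cycles_per_decrement: lost cycles required before one decrement.
    """
    pkt.cycles_waited += 1
    if pkt.cycles_waited == cycles_per_decrement:
        pkt.deadline -= epsilon
        pkt.cycles_waited = 0
```

A switch in a slower clock domain could be given a larger `epsilon` so that one lost cycle there costs the same amount of deadline as several lost cycles in a faster switch.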
According to one or more aspects and embodiments of the invention, an arbiter inside a switch makes an arbitration decision based on the deadline of each contending packet. The arbiter may give the highest priority to the packet with the smallest deadline. The arbiter may select the packet that has been in the NoC the longest. When multiple packets have the same deadline, a secondary arbitration mechanism (e.g., round robin, or other arbitration scheme) may be used to resolve the tie.
According to one or more aspects and embodiments of the invention, the sorting function that the arbiter uses to find the smallest deadline value can be precise or imprecise. Imprecise comparisons may be simpler and faster in hardware implementation. Imprecise comparisons might consider equal all the values inside an interval. According to one or more aspects and embodiments of the invention, any sorting function may be used.
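One way an imprecise comparison might be modeled is to ignore the low-order bits of the deadline, so that all values inside the same interval compare as equal. The sketch below assumes non-negative integer deadlines; the function names and the `drop_bits` parameter are hypothetical.

```python
def coarse(deadline, drop_bits=2):
    """Drop low-order bits so that nearby deadlines compare as equal."""
    return deadline >> drop_bits


def pick_imprecise(deadlines, drop_bits=2):
    """Index of the first entry whose coarse deadline is smallest."""
    keys = [coarse(d, drop_bits) for d in deadlines]
    return keys.index(min(keys))
```

With `drop_bits=2`, deadlines 12 and 13 fall in the same interval and the earlier entry wins; with `drop_bits=0` the comparison is precise and 12 wins.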
According to one or more aspects and embodiments of the invention, in addition to arbiters, each NoC component (e.g., buffers and adapters) that can store a packet for more than one clock cycle may modify the deadline. For example, for a packet waiting inside the component Ux for a given number of clock cycles (e.g., one or more clock cycles), the value of the deadline for that packet, which may be in the header, can be decremented by a fixed number ε(Ux) representing the time the packet has waited inside this component Ux.
According to one or more aspects and embodiments of the invention, at request packet creation, a deadline is computed and added to the header of the packet. As the packet flows through the NoC, at each stop for arbitration, the deadline is decremented by a value depending on the time the packet is waiting. In an arbiter within the NoC, a packet with the smallest deadline (of all the packets waiting for selection) is chosen and the chosen packet is transmitted. When multiple packets have the smallest deadline, an arbitration scheme is used on the smallest deadline packets to choose which packet to transmit.
According to one or more aspects and embodiments of the invention, the deadline may be adjusted by each component a packet passes through. For example, the component may decrement the deadline of the packet. For another example, the component may increment the deadline of the packet.
According to one or more aspects and embodiments of the invention, deadlines may be set on a per packet basis. Arbitration may be determined based on the presence of the deadline or the absence of the deadline. According to one or more aspects and embodiments of the invention, the deadline may be set to a value that represents infinity. For example, the deadline may be set to a reserved word for infinity. Arbitration may be determined based on deadline being infinity. For example, arbitration may resolve in favor of packets that have a deadline that is not infinity. According to one or more aspects and embodiments of the invention, the deadline may be set to a value that will not be adjusted and will always lose arbitration to a packet with a deadline not set to the value.
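The reserved-value behavior can be illustrated as follows. The encoding `NO_DEADLINE = 0xFF` is a hypothetical choice for the value representing infinity, and saturating ordinary deadlines at zero is an assumption of this sketch rather than something stated above.

```python
NO_DEADLINE = 0xFF  # hypothetical reserved encoding for "infinity"


def age(deadline, amount):
    """Decrement a deadline, leaving the reserved value untouched."""
    if deadline == NO_DEADLINE:
        return deadline
    return max(deadline - amount, 0)  # saturate at zero (assumption)


def pick(deadlines):
    """Index of the winner, or None when there are no contenders.

    Packets carrying the reserved value win only when every contender
    carries it; in that case the first contender is chosen here.
    """
    real = [i for i, d in enumerate(deadlines) if d != NO_DEADLINE]
    if not real:
        return 0 if deadlines else None
    return min(real, key=lambda i: deadlines[i])
```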
Referring now to
Referring now to
Referring now to
At step 720, a packet is created that includes a deadline. According to one or more aspects and embodiments of the invention, step 720 may be the same or similar to step 420. According to one or more aspects and embodiments of the invention, step 720 may be the same or similar to step 520.
At step 730, the packet is routed within the NoC, using the deadline, to a target. According to one or more aspects and embodiments of the invention, step 730 may be the same or similar to step 430. According to one or more aspects and embodiments of the invention, step 730 may be the same or similar to step 530.
At step 740, at the target, the deadline is adjusted while waiting on a response. For example, the deadline may be decremented while waiting on the response. For another example, when the initiator is a central processing unit (CPU) and the target is an analog to digital converter (ADC), then the packet may include the command for the ADC to perform an analog to digital conversion. While the ADC is performing the conversion, the deadline is decremented to account for the time taken for the conversion to be performed. According to one or more aspects and embodiments of the invention, the deadline may be stored in a context table. The deadline in the context table may be adjusted while waiting on a response.
At step 750, a response packet is created using the deadline. To continue the ADC example, the response packet is created to include the ADC conversion value and the deadline. According to one or more aspects and embodiments of the invention, the deadline may be retrieved from a context table.
At step 760, the response packet is routed to the initiator using the deadline for the response or the original deadline adjusted for the time needed to travel to the target. According to one or more aspects and embodiments of the invention, step 760 is the same or similar to step 730 except step 760 routes from the target to the initiator.
Referring now to
According to one or more aspects and embodiments of the invention, in a request and response sequence, the packets in one direction may require corresponding packets in another direction, which can create a fixed packet sequence. A request packet may be followed by a corresponding response packet which creates a request to response packet sequence. A sequence may include multiple requests and/or responses.
According to one or more aspects and embodiments of the invention, the initial value of the deadline encodes the expected maximum time for a sequence of packets to finish from initial request to final received response. According to one or more aspects and embodiments of the invention, arbiters and other NoC components adjust the deadline value. For example, they may decrement the deadline value.
According to one or more aspects and embodiments of the invention, when a packet P[i] reaches a target D[i], which is an intermediate step in a sequence of packets (P[0], P[1], ... P[i],...P[n]), then the target decrements the deadline by the time elapsed between the arrival of the incoming packet P[i] and the departure of the following packet P[i+1] in the sequence, and uses this value as the new initial value for the deadline in that packet P[i+1].
According to one or more aspects and embodiments of the invention, if a request packet P0 is sent by a network interface unit NIU0 connected to an initiator I0, to a Network Interface Unit NIU1 connected to a target T0; and the protocol expects a response packet P1 in return, sent back by NIU1 to NIU0, then NIU0 will compute a deadline value that represents the maximum time allowed for the sequence of packets (P0, P1). When P0 reaches NIU1, NIU1 records the value of the deadline for the incoming packet P0, then measures the time taken by the target T0 to respond and NIU1 to create the response packet P1. The response packet P1 deadline value is computed as the incoming deadline value of P0, minus the time elapsed for target T0 to respond and NIU1 to create the response packet P1. Then packet P1 is sent back to NIU0.
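The computation performed by NIU1 can be written out as a short worked example. The function name and the numeric values are illustrative assumptions; all quantities are taken to be in the same, unspecified time unit.

```python
def response_deadline(request_deadline_at_arrival, target_time, niu_build_time):
    """Deadline placed in response packet P1 by the target-side NIU.

    request_deadline_at_arrival: deadline carried by P0 when it reaches NIU1.
    target_time: time taken by target T0 to respond.
    niu_build_time: time taken by NIU1 to create P1.
    """
    return request_deadline_at_arrival - (target_time + niu_build_time)


# Suppose NIU0 budgets 100 units for (P0, P1) and P0 arrives with 80 left.
# If T0 takes 25 units and NIU1 takes 5 units, P1 leaves with 50 units.
p1_deadline = response_deadline(80, 25, 5)
```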
According to one or more aspects and embodiments of the invention, NIU0 may also use the deadline value of P1 at arrival to perform further regulation activities, such as changing the way new deadline values for subsequent request packets are computed based on the average value of the deadline of previous response packets.
According to one or more aspects and embodiments of the invention, at request packet creation a deadline is calculated and set in the header of the packet. As the request packet flows through the NoC, at each stop for arbitration, the deadline is decremented by a value depending on the time the packet is waiting. In an arbiter within the NoC, the packet with the smallest deadline of all the packets waiting for selection is chosen and the chosen packet is transmitted. After the packet arrives at the target, the deadline is stored in a context table. While waiting for the response to arrive, the deadline value in the context table is decremented. When the corresponding response arrives, the deadline value for the response packet is fetched from the context table. As the response packet flows through the NoC, at each stop for arbitration, the deadline is decremented by a value depending on the time the packet is waiting. In an arbiter within the NoC, the packet with the smallest deadline of all the packets waiting for selection is chosen and the chosen packet is transmitted.
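A context table of the kind described could be modeled as below. This is a sketch only: the class `ContextTable`, its methods, and the use of a transaction identifier as the lookup key are assumptions made for illustration.

```python
class ContextTable:
    """Target-side store of deadlines for requests awaiting a response."""

    def __init__(self):
        self._deadlines = {}  # transaction id -> remaining deadline

    def store(self, txn_id, deadline):
        """Record the deadline carried by an arriving request packet."""
        self._deadlines[txn_id] = deadline

    def tick(self, amount=1):
        """Age every stored deadline while responses are pending."""
        for txn_id in self._deadlines:
            self._deadlines[txn_id] -= amount

    def fetch(self, txn_id):
        """Remove and return the deadline to place in the response header."""
        return self._deadlines.pop(txn_id)
```

A request that arrives with a deadline of 40 and waits through decrements totaling 4 units would yield a response packet whose deadline starts at 36.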
According to one or more aspects and embodiments of the invention, in a request and response cycle, the header may be given a response deadline that will be used for the response.
According to one or more aspects and embodiments of the invention, the deadline mechanism can be extended beyond the request and response to any sequence where packets are created. One example is a fixed sequence of packet creation. For another example, the deadline of future packets is adjusted based on the deadline of the packet as the packet arrives at a destination. This adjustment can be carried into future deadline creation in order to optimize the NoC.
According to one or more aspects and embodiments of the invention, the deadline can be combined with priority. Packet headers can have a priority field, indicating how urgent the packet is. Then, instead of decrementing the deadline values of waiting packets by a fixed amount, the deadlines of higher priority packets may be decremented by a higher value. In other words, higher priority packets age faster. A potential benefit is to allow a packet to cross the NoC faster.
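Priority-dependent aging might look like the following sketch. Scaling the decrement by `priority + 1` is an assumed rule chosen only to show the effect; the function name and parameters are hypothetical.

```python
def aged_deadline(deadline, priority, base_step=1):
    """Deadline after one lost arbitration round.

    Higher priority packets are decremented by a larger amount, so they
    age faster. The (priority + 1) scaling is an assumption.
    """
    return deadline - base_step * (priority + 1)
```

Starting from the same deadline of 20, a priority-0 packet drops to 19 after one lost round while a priority-3 packet drops to 16, so the latter reaches the smallest deadline sooner.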
According to one or more aspects and embodiments of the invention, deadline values of incoming packets represent the difference between the maximum time the initiator has given the packets to arrive at the target, and the time they have spent inside the network. This difference can be used to control regulation mechanisms such as credits exchanged between target and initiator. When a packet arrives with substantial time left from the initial time budget, subsequent packets from the same initiator can be given fewer credits by a target, since the assumption can be made that this initiator is getting the expected service level. Alternatively, when a packet arrives with little time left compared to the initial budget, subsequent packets from the same initiator can be given more credits by a target, since the assumption can be made that this initiator is getting below the expected service level.
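A credit regulator driven by the remaining deadline could be sketched as follows. The thresholds, the credit bounds, and the assumption that the target knows the initial budget are all illustrative choices, not details taken from the description.

```python
def adjust_credits(credits, remaining, budget, low=0.25, high=0.75,
                   min_credits=1, max_credits=16):
    """Return the credit count to grant for subsequent packets.

    remaining: deadline value carried by the packet on arrival.
    budget: the initial deadline assigned by the initiator (assumed known).
    """
    slack = remaining / budget
    if slack > high:      # plenty of time left: service level is being met
        credits -= 1
    elif slack < low:     # little time left: initiator is under-served
        credits += 1
    return max(min_credits, min(credits, max_credits))
```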
According to one or more aspects and embodiments of the invention, statistics counters can be implemented that capture information about the values of received packets' deadlines. The user can monitor these counters to implement additional regulation. Any metric may be measured. Examples include the minimum, average, and maximum values of the remaining deadline (e.g., slack) of incoming packets, histograms, etc.
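Such counters could be modeled in software as below. The class `SlackStats` and its fields are hypothetical names; a hardware implementation would use registers rather than Python attributes.

```python
class SlackStats:
    """Running minimum, maximum, and mean of remaining deadlines at arrival."""

    def __init__(self):
        self.count = 0
        self.total = 0
        self.minimum = None
        self.maximum = None

    def record(self, remaining):
        """Update the counters with one arriving packet's remaining deadline."""
        self.count += 1
        self.total += remaining
        if self.minimum is None or remaining < self.minimum:
            self.minimum = remaining
        if self.maximum is None or remaining > self.maximum:
            self.maximum = remaining

    @property
    def average(self):
        """Mean remaining deadline, or None before any packet is recorded."""
        return self.total / self.count if self.count else None
```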
According to one or more aspects and embodiments of the invention, the deadline-based arbitration may also provide a way to observe overall system behavior even when the deadline field is not used as a way to improve arbitration. Measuring transport latency accurately requires tracking tables at each ingress point, and measuring outbound and return latency of packets is not possible without additional logic. These tracking tables can be large. The deadline field provides a way to observe the travel time of any individual packet and collect statistics based on initiator, target, address, or other properties.
According to one or more aspects and embodiments of the invention, the deadline arbitration scheme is useful in the field of large servers, artificial intelligence computations, and deep network accelerators. The deadline arbitration scheme works well for mesh NoC topologies, which are common in these applications. Mesh topologies expose deep cascades of switches, and deadline-based arbitration gives better results for deep cascades of switches (e.g., arbiters).
According to one or more aspects and embodiments of the invention, an arbitrator can include a queue of packets contending for the output and the arbitrator can inspect the deadline field of each header to determine which packet to let progress to the output port. For example, the packet with the smallest deadline could be allowed to progress.
According to one or more aspects and embodiments of the invention, packets that do not have a timeframe to be “delivered in” may have the deadline header field set to a value to indicate the deadline is not to be used in arbitration.
According to one or more aspects and embodiments of the invention, any system for tracking time can be used.
Referring now to
Referring now to
Referring now to
Referring now to
Referring now to
Certain methods according to the various aspects of the invention may be performed by instructions that are stored upon a non-transitory computer readable medium. The non-transitory computer readable medium stores code including instructions that, if executed by one or more processors, would cause a system or computer to perform steps of the method described herein. The non-transitory computer readable medium includes: a rotating magnetic disk, a rotating optical disk, a flash random access memory (RAM) chip, and other mechanically moving or solid-state storage media. Any type of computer-readable medium is appropriate for storing code having instructions according to various examples and aspects of the invention.
Certain examples have been described herein and it will be noted that different combinations of different components from different examples may be possible. Salient features are presented to better explain examples; however, it is clear that certain features may be added, modified, and/or omitted without modifying the functional aspects of these examples as described.
Practitioners skilled in the art will recognize many modifications and variations. The modifications and variations include any relevant combination of the disclosed features. Descriptions herein reciting principles, aspects, and embodiments encompass both structural and functional equivalents thereof. Elements described herein as “coupled” or “communicatively coupled” have an effectual relationship realizable by a direct connection or indirect connection, which uses one or more other intervening elements. Embodiments described herein as “communicating” or “in communication with” another device, module, or elements include any form of communication or link and include an effectual relationship. For example, a communication link may be established using a wired connection, wireless protocols, near-field protocols, or RFID.
To the extent that the terms “including”, “includes”, “having”, “has”, “with”, or variants thereof are used in either the detailed description or the claims, such terms are intended to be inclusive in a similar manner to the term “comprising.”
The scope of the invention, therefore, is not intended to be limited to the exemplary embodiments and aspects that are shown and described herein. Rather, the scope and spirit of the invention is embodied by the appended claims.
This application claims priority under 35 USC 119 to U.S. Provisional App. Serial No. 63/249,033 filed on Sep. 28, 2021 and titled DEADLINE BASED ARBITRATION IN A NETWORK-ON-CHIP by Benoit DE LESCURE et al., the entire disclosure of which is incorporated herein by reference.
Number | Date | Country
---|---|---
63249033 | Sep 2021 | US