This application is based upon and claims the benefit of priority from the prior Japanese Patent Application No. 2012-067923, filed on Mar. 23, 2012; the entire contents of which are incorporated herein by reference.
Embodiments described herein relate generally to an on-chip router and a multi-core system using the same.
Multi-core systems tend to have long bus distribution to connect between a number of processor cores (hereafter simply referred to as “core”) or between the cores and memories. For that reason, it is difficult to secure wiring resources and to synchronize the timings of data transmission/reception. The maximum operating frequency of the multi-core systems is limited in order to synchronize the timings.
A technology referred to as Network on Chip (NoC) addresses the above timing problem. In this technology, data is packetized as in Ethernet or other technologies, and the packet is transferred to a desired target (cores, memories) through an on-chip router (hereinafter simply referred to as “router”).
Packet transfer methods mainly include adaptive routing and source routing (also known as fixed routing). In the adaptive routing, a router that received a packet carries out route computation based on the destination address of the packet and determines the next transfer destination. The advantage of this routing method is a short header length in a packet, whereas the disadvantage is a heavy load of the route computation in relay routers.
On the other hand, in the source routing, a transmission source core (or router, initiator, or network interface) carries out all computations of a packet transfer route in advance, and stores information on the transfer route in a header. Once the transfer route is determined at the beginning, the route is not dynamically changed in accordance with congestion information etc. of the route. The advantage of this routing method is a reduced load of the route computation in relay routers, whereas the disadvantage is a longer header length compared with the header length in the adaptive routing.
As described above, in the source routing, as the number of the relay routers increases, the header length of a packet becomes longer and the amount of buffer use increases, resulting in a problem of increase in latency.
An on-chip router of the embodiments is provided with plural input ports that receive packets, plural output ports that transmit the packets, plural buffers, each being provided so as to correspond to each of the input ports and accumulating at least a portion of the packets received through the input ports, a switching unit that switches the output destinations of the packets so that the packets are transmitted from any of the plural output ports, a header analyzer that has plural hop field extractors provided so as to correspond to each of the buffers, and a switching controller that controls the switching unit so that the packets are transmitted from an output port indicated by output port information of the hop field extracted by the hop field extractors.
Each of the hop field extractors receives an input of header information of the packet accumulated in the corresponding buffer, and extracts, from among plural hop fields that store output port information, a hop field that stores output port information indicating an output port from which the packet received through the input port is output.
The on-chip router of the embodiments is further provided with a header rewriter. The header rewriter receives an input of a packet output from the switching unit and rewrites output port information in a hop field that an on-chip router of the packet output destination uses to transfer the packet from among the plural hop fields into decoded output port information, and outputs a packet in which the output port information is rewritten to the output port.
Embodiments will now be explained with reference to the accompanying drawings.
Prior to explaining the embodiments relating to the present invention, packet transfer by using conventional source routing will be explained.
The cores 101-116 are connected to the memories 301-304 and the I/O port 401 through the routers 201-206. For example, the cores 101-104 are connected to the memories 301-304 through the router 201 and the router 205.
The cores 101-116 transmit a request packet to request reading, writing, etc. to the memories 301-304. A memory that receives the request packet returns a response packet storing the read data etc. to the source core that transmitted the request packet.
“Cmd” denotes a command field that indicates types of the packet (such as a read request, a write request, a read response, and a write response). “Dest Address” denotes a destination address field that stores destination addresses of the memories. “Src Core” denotes a source core field that stores an identifier of the source core transmitting a request packet.
“Hop OutPort” denotes a hop field that stores output port information. The output port information indicates an output port to output a received packet, and is, for example, an output port number. In the example of
In the source routing, as the number of relay routers increases, the number of hop fields increases. For that reason, the header length tends to become longer as the number of the relay routers increases.
A router that first received a packet from the source core determines an output destination of the packet by referring to the output port information stored in “Hop1. OutPort”. In the example of
In the following description, embodiments according to the present invention will be explained with reference to the accompanying drawings. Note that in each of the drawings, the same reference numerals are assigned to the components having equivalent functions and detailed explanations of the components with the same reference numerals will not be repeated.
In the first embodiment, a packet that stores encoded output port information in the hop field is used. A router that received the packet decodes only output port information that will be used in a router in the next stage, rewrites the output port information in one-hot encoded format, and afterward transmits the output port information to the router in the next stage. As a result, it is possible to reduce processing time for packet transfer by determining the output port promptly while making the header length of a packet as short as possible.
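The header-length trade-off described above can be illustrated with a small arithmetic sketch. The function names, the five-port router, and the four-hop route are illustrative assumptions and do not appear in the embodiments; the sketch only counts the bits occupied by hop fields under three schemes: all fields binary encoded, all fields one-hot, and the arrangement of the first embodiment in which only the hop field used by the receiving router is one-hot.

```python
def encoded_bits(num_ports: int) -> int:
    """Bits needed to store an output port number in binary-encoded form."""
    return max(1, (num_ports - 1).bit_length())


def one_hot_bits(num_ports: int) -> int:
    """Bits needed to store an output port in one-hot format (one bit per port)."""
    return num_ports


def hop_field_bits(num_hops: int, num_ports: int, scheme: str) -> int:
    """Total bits used by the hop fields of a header with num_hops remaining hops.

    scheme is a hypothetical label:
      "encoded": every hop field is binary encoded
      "one_hot": every hop field is one-hot
      "hybrid" : only the first hop field (used by the receiving router) is
                 one-hot; the remaining fields stay encoded
    """
    if num_hops == 0:
        return 0
    if scheme == "encoded":
        return num_hops * encoded_bits(num_ports)
    if scheme == "one_hot":
        return num_hops * one_hot_bits(num_ports)
    if scheme == "hybrid":
        return one_hot_bits(num_ports) + (num_hops - 1) * encoded_bits(num_ports)
    raise ValueError(f"unknown scheme: {scheme}")


if __name__ == "__main__":
    for scheme in ("encoded", "one_hot", "hybrid"):
        print(scheme, hop_field_bits(num_hops=4, num_ports=5, scheme=scheme))
```

Under these assumed numbers, the hybrid arrangement costs only a few bits more than the fully encoded header while sparing the receiving router the decoding step.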
Next, a schematic configuration of a router 10 according to the first embodiment will be explained with reference to
Each of the buffers 21a-21e is provided so as to correspond to the input ports 20a-20e, respectively, and accumulates at least a portion of the packets received through the input ports. It should be noted that the buffers have a capacity of one flit or more. The switching unit 22 inputs a packet from the buffers 21a-21e, and switches the output destinations of the packet so that the packet is transmitted from one of the output ports 26a-26e. The switching unit 22 selects any one of the output ports 26a-26e based on the output port information in the hop field extracted by the header analyzer 23 and switches to the output port.
The hop field extractor extracts a hop field that stores the output port information used in the own router from the header information of a packet accumulated in the corresponding buffer. In
The switching controller 24 controls the switching unit 22 so that a packet is transmitted from the output port indicated by the output port information of the hop field extracted by the hop field extractors 23a-23e. The switching controller 24 generates a selection signal by using the output port information in the extracted hop field.
The header rewriters 25a-25e decode the output port information in a hop field that is used for packet transfer by an on-chip-router that is the output destination of the packet (i.e., the router in the next stage) from among the plural hop fields of the input packet. The header rewriters output, to the selected output port, a packet in which the output port information in the hop field used by the router in the next stage is rewritten into the decoded output port information. It should be noted that the decoding processing of the output port information can be carried out in parallel with the processing of data stored in the body flit of the packet. For that reason, the decoding processing does not cause the increase in latency.
Meanwhile, the header rewriters may delete the hop field used for packet transfer in order to make the header length as short as possible. In the case of the aforementioned example, the header rewriter 25a of the router 201 deletes “Hop1 OutPort” so that the hop fields of “Hop2 OutPort” and the subsequent hop fields are moved toward the beginning of the header.
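The behavior of a header rewriter can be sketched as follows. This is a minimal software model under stated assumptions, not the circuit of the embodiments: the hop fields are represented as a Python list, the first element is assumed to be the one-hot field used by the own router, the remaining elements are assumed to be binary-encoded port numbers, and the names `decode_one_hot` and `rewrite_header` are hypothetical.

```python
from typing import List


def decode_one_hot(port_number: int, num_ports: int) -> int:
    """Convert an encoded port number into one-hot format (bit i set = port i)."""
    if not 0 <= port_number < num_ports:
        raise ValueError("port number out of range")
    return 1 << port_number


def rewrite_header(hop_fields: List[int], num_ports_next: int) -> List[int]:
    """Model of a header rewriter that also deletes the used hop field.

    hop_fields[0] : one-hot field already used by the own router
    hop_fields[1:]: encoded fields for the routers in the following stages
    Returns the hop fields to be sent to the router in the next stage.
    """
    remaining = hop_fields[1:]  # delete the hop field used in the own router
    if remaining:
        # decode only the field that the router in the next stage will use
        remaining[0] = decode_one_hot(remaining[0], num_ports_next)
    return remaining


if __name__ == "__main__":
    header = [0b00100, 3, 1]  # own router uses port 2; next stages use ports 3 and 1
    header = rewrite_header(header, num_ports_next=5)
    print([bin(f) for f in header])
```

In each router only one field is decoded, so the header that arrives at the next stage always begins with a one-hot field that can drive the switching controller directly.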
Next, detailed configurations of the switching unit 22, the header analyzer 23 (hop field extractors 23a-23e) and the switching controller 24 will be explained by using
The switching unit 22 includes multiplexers 22a-22e, which are provided so as to correspond to the output ports 26a-26e, respectively. Each of the multiplexers 22a-22e is connected to all of the buffers 21a-21e.
Each of the hop field extractors 23a-23e analyzes the header information of the packets accumulated in the corresponding one of the buffers 21a-21e and transmits the output port information (one-hot encoded format) stored in the extracted hop field to the switching controller 24.
The switching controller 24 generates a selection signal by using the output port information in one-hot encoded format and transmits the signal to the multiplexers. Based on the selection signal generated in the switching controller 24, each multiplexer outputs the packet accumulated in one of the buffers 21a-21e to the header rewriter connected to the corresponding output port.
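The relationship between the one-hot output port information and the multiplexer selection can be modeled in software as below. This is a hedged sketch: the function names and the list-based representation of buffers are assumptions, and conflicts between inputs are simply rejected here because arbitration is treated separately. The point it illustrates is that bit j of the one-hot value from buffer i can serve directly as the select line from buffer i to the multiplexer of output port j, without a decoding step.

```python
from typing import List, Optional


def build_select_matrix(one_hot_requests: List[int], num_ports: int) -> List[List[int]]:
    """select[j][i] == 1 means the multiplexer of output port j passes buffer i."""
    return [[(req >> j) & 1 for req in one_hot_requests] for j in range(num_ports)]


def switch(buffers: List[Optional[str]],
           one_hot_requests: List[int],
           num_ports: int) -> List[Optional[str]]:
    """Return the packet appearing at each output port (None if the port is idle)."""
    select = build_select_matrix(one_hot_requests, num_ports)
    outputs: List[Optional[str]] = []
    for j in range(num_ports):
        chosen = [i for i, bit in enumerate(select[j]) if bit]
        if len(chosen) > 1:
            # two buffers request the same output port; arbitration is needed
            raise ValueError(f"conflict on output port {j}")
        outputs.append(buffers[chosen[0]] if chosen else None)
    return outputs


if __name__ == "__main__":
    buffers = ["pktA", None, "pktC", None, None]
    requests = [0b00100, 0, 0b00001, 0, 0]  # pktA -> port 2, pktC -> port 0
    print(switch(buffers, requests, num_ports=5))
```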
It should be noted that the switching controller 24 preferably has a function to arbitrate the use of the output ports. In other words, when plural packets are received through different input ports and those packets have the same output destination, the switching controller 24 controls the switching unit 22 so that those packets are transmitted on the basis of a prescribed rule. For example, the switching unit 22 is controlled so as to output the received plural packets in the order of a packet accumulated in the buffer 21a, a packet in the buffer 21b, a packet in the buffer 21c, a packet in the buffer 21d, a packet in the buffer 21e, a packet in the buffer 21a, and so on. As another rule, a priority may be set for each input port, and the switching unit 22 may be controlled so that a packet received through an input port with higher priority is transmitted preferentially. Alternatively, a priority may be set for each packet, and the switching unit 22 may be controlled so that a packet with higher priority is transmitted preferentially.
In the first embodiment, the header length is reduced by storing the encoded output port information in the hop field. Furthermore, the output port information in the hop field that is used by the router in the next stage is rewritten into decoded information (i.e., information in one-hot encoded format) and is transmitted to the router in the next stage. In other words, the output port information used in the own router has already been rewritten into the decoded one-hot encoded format by the router in the previous stage. Consequently, the router that received a packet does not need to decode the output port information in the header analyzer 23 and can promptly determine the output port. In addition, when the hop field used in the own router is deleted, the header length is not increased by the decoding. As a result, according to the first embodiment, it is possible to make the header length as short as possible and to reduce the latency.
Furthermore, because the decoding processing in the header analyzer 23 is no longer necessary, the header analyzer 23 and the switching controller 24 can be realized with simple and high-speed circuits.
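The cyclic arbitration rule described for the switching controller 24 (buffer 21a, 21b, ..., 21e, then 21a again) can be sketched as a round-robin arbiter for a single output port. The class name and the policy of restarting the search just after the most recently granted input are assumptions made for illustration; the embodiments only state the order in which the buffers are served.

```python
from typing import List, Optional


class RoundRobinArbiter:
    """Grants one contested output port to the requesting inputs in cyclic order."""

    def __init__(self, num_inputs: int) -> None:
        self.num_inputs = num_inputs
        self.next_start = 0  # input index at which the next search begins

    def grant(self, requesting: List[bool]) -> Optional[int]:
        """Return the index of the granted input, or None if nobody requests."""
        for offset in range(self.num_inputs):
            idx = (self.next_start + offset) % self.num_inputs
            if requesting[idx]:
                # the input after the granted one gets first chance next time
                self.next_start = (idx + 1) % self.num_inputs
                return idx
        return None


if __name__ == "__main__":
    arbiter = RoundRobinArbiter(num_inputs=5)   # inputs 0-4 stand for buffers 21a-21e
    requests = [True, False, True, False, True]  # 21a, 21c and 21e want the same port
    print([arbiter.grant(requests) for _ in range(4)])
```

A priority-based rule, which the text mentions as an alternative, would replace the cyclic search with a search ordered by the priority of each input port or packet.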
The second and third embodiments that will be explained below compress route information in a packet by using the traffic bias in a network.
The second embodiment uses a packet provided with a valid flag indicating validity/invalidity of the hop field. A router that received the packet transmits the packet from a default output port when a valid hop field is not present in the header of the received packet.
When the valid flag of the hop field corresponding to the own router indicates valid, a router that received a packet outputs the packet from the output port indicated by the output port information stored in the hop field. Meanwhile, when the valid flag indicates invalid, the packet is output from the default output port.
In the following description, operations when the packet in
In the meantime, when a prescribed application is operated, a core makes frequent access to a particular memory etc., and this causes packets to frequently pass through a particular route. In such a case, the router controls the switching unit so as to output the packets from a particular (default) output port. For example, this particular output port in each router is set to “0”. Here, this particular output port may be set individually for each router.
Operations carried out when the packet in
Because a valid hop field is not present, the router 205 transmits the packet from the output port “0” to the memory 301.
In the second embodiment, valid flags indicating validity/invalidity are provided to hop fields, and when a valid hop field is not present, a packet is transmitted from the default output port. As a result, when a packet is output from the default output port, route information can be omitted, and the header length of the packet can be reduced.
Next, an example of a router configuration according to the second embodiment will be explained by using
The router 10A in
The switching controller 24 illustrated in
The header analyzer 27 has the output port selectors 27a-27e, each of which is provided to correspond to each of the buffers 21a-21e. Because the number of the output port selectors is the same as the number of the input ports, it becomes possible to transfer packets and to transmit the packets from plural output ports at the same time even when plural input ports receive packets at the same time.
An output port selector determines whether the valid flag of the hop field corresponding to the own router is valid or invalid from the header information of the packet accumulated in the corresponding buffer. When the valid flag of the hop field corresponding to the own router indicates valid, the output port information stored in the hop field is selected. On the other hand, when a valid hop field is not present in the header, the default output port information is selected.
As illustrated in
The multiplexer 34 outputs the output port information from the hop field extractor 31 or the setting register 33 to the switching controller 24 in accordance with a signal indicating validity/invalidity from the valid flag extractor 32. More specifically, the multiplexer 34 outputs a signal of the hop field extractor 31 when “1” is input from the valid flag extractor 32, and outputs a signal from the setting register 33 when “0” is input from the valid flag extractor 32.
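The selection performed by an output port selector of the second embodiment can be sketched in software as follows. The `HopField` data class and the function name are hypothetical, and treating a missing hop field the same as an invalid one is an assumption; the comments map each step to the hop field extractor 31, the valid flag extractor 32, the setting register 33, and the multiplexer 34 described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HopField:
    valid: bool    # valid flag of the hop field
    out_port: int  # output port information


def select_output_port(hop_field: Optional[HopField], default_port: int) -> int:
    """Model of one output port selector.

    default_port plays the role of the value held in the setting register 33.
    """
    # valid flag extractor 32: "1" for valid, "0" for invalid or absent
    valid = 1 if (hop_field is not None and hop_field.valid) else 0
    # hop field extractor 31: output port information stored in the hop field
    extracted = hop_field.out_port if hop_field is not None else default_port
    # multiplexer 34: pass the extracted value on "1", the register value on "0"
    return extracted if valid == 1 else default_port


if __name__ == "__main__":
    print(select_output_port(HopField(valid=True, out_port=3), default_port=0))
    print(select_output_port(HopField(valid=False, out_port=3), default_port=0))
    print(select_output_port(None, default_port=0))
```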
As explained above, in the second embodiment, the route information stored in a header is compressed by using the access traffic bias. As a result, the header length of a packet can be reduced and the latency at the time of accessing a memory or an I/O port can be also reduced. Furthermore, the power consumption can be reduced.
It should be noted that the router according to the second embodiment may have a header rewriter that deletes a hop field storing the output port information used for packet transfer in the switching unit. This header rewriter transmits the packet from which the hop field is deleted to an output port. As a result, the header length can be reduced every time a packet routes through a router.
In the third embodiment, a router identifier field is provided instead of a valid flag. When an output port is designated in a router, an identifier of the router is stored in the router identifier field, and output port information is stored in the corresponding hop field. Meanwhile, in case of a (default) router in which an output port is not designated, an invalid router identifier is stored in the router identifier field, or a router identifier field is not used.
When a router receives a packet, the router searches for a router identifier field that stores its own identifier. When a router identifier field that stores its own identifier is found as a result of the search, the router transmits the packet from the output port stored in the corresponding hop field. On the other hand, when a field that stores the identifier of the own router is not found, the router transmits the packet from the default output port.
In the following description, operations carried out when a packet in
The packet in
In the following description, operations carried out when a packet in
Firstly, the router 201 searches in router identifier fields. When the router identifier field that matches its own identifier is not found as a result of the search, the router 201 transmits the packet to the router 205 from the default output port “0”. The router 205 searches in the router identifier field. When a router identifier field “Hop Router ID” that matches its own identifier is found as a result of the search, the router 205 transmits the packet to the memory 301 from the output port “0” stored in the corresponding “Hop OutPort”. In this manner, packets transmitted from the core 101 are transferred to the memory 301.
In the third embodiment, a frequently-accessed output port on a route is set to default, and the packet route information is omitted. As a result, it is possible to transfer packets by using packets with a short header length.
Next, regarding a configuration of a router according to the third embodiment, only the difference from the router according to the second embodiment will be explained. The router according to the third embodiment has a header analyzer 28 instead of the header analyzer 27 in the router 10A. This header analyzer 28 has plural output port selectors 28a-28e provided so as to correspond to buffers 21a-21e, respectively.
An output port selector, if a router identifier field storing an identifier identical with the identifier of the own router is present in the received packet header, selects the output port information stored in the hop field corresponding to the router identifier field, or if not, selects the default output port information.
The router identifier field extractor 35 extracts a value of an identifier stored in a router identifier field from a header. When the header includes plural router identifier fields, the router identifier field extractor 35 extracts the values of the identifiers stored in all of the router identifier fields.
The comparator 36 compares the extracted value with the identifier of the own router, outputs “1” when the two values match, and outputs “0” when the two values do not match. When the router identifier field extractor 35 extracted plural values, the comparator 36 searches for a value that matches the identifier of the own router, outputs “1” when the value that matches the identifier of the own router is found, and outputs “0” when the value is not found.
The hop field extractor 31 extracts a hop field corresponding to the router identifier field that stores the own router ID.
The multiplexer 37 outputs the output port information of the hop field extractor 31 or the setting register 33 to the switching controller 24 in accordance with the signal from the comparator 36. More specifically, the multiplexer 37 outputs the output port information from the hop field extractor 31 when “1” is input from the comparator 36, and outputs the output port information from the setting register 33 when “0” is input from the comparator 36.
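The output port selector of the third embodiment can be modeled in the same spirit. The `RoutedHop` data class, the function name, and the port values used in the example are hypothetical; the comments indicate which step corresponds to the router identifier field extractor 35, the comparator 36, the hop field extractor 31, the setting register 33, and the multiplexer 37.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RoutedHop:
    router_id: int  # value of the router identifier field
    out_port: int   # output port information of the corresponding hop field


def select_output_port_by_id(hops: List[RoutedHop], own_id: int, default_port: int) -> int:
    """Model of one output port selector that searches for the own identifier.

    default_port plays the role of the value held in the setting register 33.
    """
    # router identifier field extractor 35: collect every identifier in the header
    ids = [hop.router_id for hop in hops]
    # comparator 36: "1" if any extracted identifier matches the own identifier
    match = 1 if own_id in ids else 0
    if match == 1:
        # hop field extractor 31 + multiplexer 37: use the matching hop field
        return hops[ids.index(own_id)].out_port
    # multiplexer 37: no match, so use the default output port information
    return default_port


if __name__ == "__main__":
    # Hypothetical header: only router 205 has a designated output port (port 2).
    header = [RoutedHop(router_id=205, out_port=2)]
    print(select_output_port_by_id(header, own_id=201, default_port=0))
    print(select_output_port_by_id(header, own_id=205, default_port=0))
```

In this example the router with identifier 201 finds no matching field and falls back to its default output port, while the router with identifier 205 uses the port stored in its hop field, which mirrors the search-then-default behavior described for the routers 201 and 205.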
According to the third embodiment, an output port is determined by whether or not the own identifier is present in a header. In other words, the setting of the router identifier field can be omitted in an output to an output port that can be set as default. For example, in
It should be noted that the router according to the third embodiment may have a header rewriter that deletes a hop field storing the output port information used for packet transfer in the switching unit. This header rewriter transmits the packet from which the hop field is deleted to an output port. As a result, the header length can be reduced every time a packet routes through a router.
As explained above, in a multi-core system having a bias in access frequency only in a portion of the packet transfer route, the third embodiment can compress the route information stored in a header by using access traffic bias. As a result, the header length of a packet can be reduced and the latency at the time of accessing a memory or an I/O port can be also reduced. Furthermore, the power consumption can be reduced.
Three embodiments according to the present invention were explained above. The second and third embodiments, more generally, select an output port based on a determination field (the valid flag or the router identifier field) corresponding to a hop field. In other words, the header analyzer (the output port selector) analyzes the header information of a received packet and selects default output port information or output port information stored in a hop field based on the determination field.
It should be noted that the network configuration of the multi-core system of the present invention is not limited to the tree topology illustrated in
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel embodiments described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the embodiments described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.
Number | Date | Country | Kind |
---|---|---|---|
2012-067923 | Mar 2012 | JP | national |