Not applicable.
Not applicable.
As transistor and other component sizes become smaller and manufacturing techniques continue to improve, more functionality is being placed on single integrated circuits, or chips. The term system on a chip (SoC) generally refers to integrating all the functionality of a computer or other complex electronic system onto a single chip. A SoC may comprise one or more memories, processors, or input/output ports, all integrated into a single chip. One way of allowing various components of a SoC to communicate is to use an on-chip network, sometimes referred to as a network-on-chip. An on-chip network is intended to replace conventional ways of communicating between electronic components in a complex system, such as conventional bus and crossbar interconnections.
Various topologies have been considered for on-chip networks, and ring topologies are sometimes used because of the relative simplicity of the routers that may be employed. For example, in a unidirectional ring network each router comprises two ports, one input port for receiving data from a first adjacent router and one output port for transmitting data to a second adjacent router. These routers occupy less area, consume less power, and can be clocked at higher frequencies compared to higher-radix on-chip routers, such as routers in mesh networks. For example, the area and power consumption of a router may scale quadratically with the number of ports, so higher-radix routers may use substantially more power and occupy substantially more area than the relatively simple routers used in unidirectional ring networks. However, ring networks may not scale well as the number of routers increases. This is because the average and worst-case packet latency increase linearly with the number of routers while the bisection bandwidth remains constant, reducing the throughput of each router. Network latency may be critical for a number of SoC applications that require ultra-low-latency communication and operate under tight power budgets.
Disclosed herein is an apparatus comprising a chip comprising a global ring network comprising a plurality of global routers configured in a unidirectional ring network, and a plurality of local ring networks directly connected to the global ring network.
Also disclosed herein is a method comprising transmitting a first flit from a first router to a second router, wherein a first ring network comprises the first and second routers, and transmitting a second flit from the first router to a third router, wherein a second ring network comprises the first and third routers, wherein a chip comprises the first and second ring networks.
Also disclosed herein is an apparatus comprising a chip comprising a global ring network comprising a plurality of global routers configured in a unidirectional ring network, an intermediate ring network comprising a plurality of intermediate routers configured in a unidirectional ring network, wherein the intermediate ring network is directly connected to the global ring network, and a plurality of local ring networks directly connected to the intermediate ring network.
These and other features will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings and claims.
For a more complete understanding of this disclosure, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.
It should be understood at the outset that although an illustrative implementation of one or more embodiments is provided below, the disclosed systems and/or methods may be implemented using any number of techniques, whether currently known or in existence. The disclosure should in no way be limited to the illustrative implementations, drawings, and techniques illustrated below, including the exemplary designs and implementations illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.
Disclosed herein are topologies that utilize certain advantages of ring networks, including the use of simple two-port routers, while at the same time achieving lower latency than ring networks. The topologies may be referred to as hierarchical ring networks, and may be described as comprising a plurality of local ring networks interconnected via a global ring network. A global ring may comprise global routers, and a local ring may comprise local routers. Hierarchical ring networks reduce the average and worst-case packet latency compared to conventional ring networks, while still using simple two-port routers to connect adjacent stations, thereby reducing design time and routing latency while improving system performance. Various embodiments of hierarchical ring networks are described in the following.
An on-chip network may be configured to provide communication capability between various components that reside in a single chip.
The routers 114 may be any devices that promote routing of flits within the on-chip network 112. At least some of the routers 114 may break an incoming packet (e.g. an Internet Protocol (IP) packet or Ethernet frame) into units of information known as flow control digits, or flits, if such is not done by the components 118, 120, 122, and 124. Further, at least some of the routers 114 may reassemble the flits into an outgoing packet if such is not done by the components 118, 120, 122, and 124. In addition, the routers 114 may perform flit routing, and in a similar manner packet routing, in that they receive flits and determine on which of a plurality of virtual channels to transmit the flits. As part of the routing, the routers 114 may arbitrate between two or more flits competing for a common resource (e.g. a virtual channel in a link 116). To perform these various functions, each router 114 may include a processor that is in communication with a memory, such as a read only memory (ROM), a random access memory (RAM), or any other type of memory. Each processor may be a general-purpose processor or may be an application-specific processor. Alternatively, at least some of the routers 114 may be implemented with no local memory, but have access to an external memory that may be located on another part of the SoC 100 and perhaps shared by other routers 114. Finally, at least some of the routers 114 may be implemented with no local memory and no memory access.
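The segmentation of a packet into flits and its later reassembly may be sketched as follows. This is a minimal illustration only: the flit payload size, the field names, and the helper functions `segment` and `reassemble` are hypothetical and are not specified by this disclosure.

```python
from dataclasses import dataclass
from typing import List

FLIT_PAYLOAD_BYTES = 8  # hypothetical payload size per flit; not given in the text


@dataclass(frozen=True)
class Flit:
    packet_id: int   # identifies the packet this flit belongs to
    seq: int         # position of the flit within its packet
    is_tail: bool    # True only for the last flit of the packet
    payload: bytes


def segment(packet_id: int, packet: bytes, size: int = FLIT_PAYLOAD_BYTES) -> List[Flit]:
    """Break a packet into flits carrying at most `size` payload bytes each."""
    # An empty packet still produces one (empty) tail flit.
    chunks = [packet[i:i + size] for i in range(0, len(packet), size)] or [b""]
    return [
        Flit(packet_id, seq, seq == len(chunks) - 1, chunk)
        for seq, chunk in enumerate(chunks)
    ]


def reassemble(flits: List[Flit]) -> bytes:
    """Rebuild a packet from its flits, tolerating out-of-order arrival."""
    ordered = sorted(flits, key=lambda f: f.seq)
    if not ordered or not ordered[-1].is_tail:
        raise ValueError("packet is incomplete: tail flit missing")
    if [f.seq for f in ordered] != list(range(len(ordered))):
        raise ValueError("packet is incomplete: flit missing or duplicated")
    return b"".join(f.payload for f in ordered)
```

In this sketch either a router or an ingress or egress component could call `segment` and `reassemble`, matching the description that the work may be done by whichever of them is configured to do so.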
As discussed above, flits may be formed by segmenting packets, e.g., Ethernet packets or IP packets, that enter an on-chip network. A flit that enters an on-chip network may also be referred to as being injected into an on-chip network. Referring to
The links 116 may be any devices that carry flits between routers 114 and/or components 118, 120, 122, and 124. The links 116 are typically electrical links, but may be optical or wireless links. At least some of the links 116 may be divided into a plurality of virtual channels, for example, by segmenting available link 116 resources (e.g. time and/or frequency) into a plurality of slots (e.g. time slots and/or frequency slots) that carry the flits. Although in general the links in an on-chip network may be bidirectional, the methods and systems presented herein may be applicable to ring networks with unidirectional links.
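One way to picture a link divided into virtual channels by time slots is a round-robin slot schedule, sketched below. The class name `TimeSlottedLink`, the fixed round-robin slot assignment, and the use of strings to stand in for flits are assumptions made for illustration; the disclosure does not prescribe a particular slot allocation.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class TimeSlottedLink:
    """A link whose time slots are assigned round-robin to virtual channels."""

    def __init__(self, num_virtual_channels: int) -> None:
        if num_virtual_channels < 1:
            raise ValueError("need at least one virtual channel")
        # One FIFO of waiting flits per virtual channel.
        self.queues: List[Deque[str]] = [deque() for _ in range(num_virtual_channels)]
        self.slot = 0

    def enqueue(self, vc: int, flit: str) -> None:
        """Queue a flit for transmission on virtual channel `vc`."""
        self.queues[vc].append(flit)

    def tick(self) -> Optional[Tuple[int, str]]:
        """Advance one time slot; return (vc, flit) if the slot's owner had a flit."""
        vc = self.slot % len(self.queues)
        self.slot += 1
        if self.queues[vc]:
            return vc, self.queues[vc].popleft()
        return None  # the slot goes unused
```

A frequency-slotted link could be modeled the same way, with each virtual channel owning a frequency slot rather than a recurring time slot.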
The components 118, 120, 122, and 124 may be any type of devices that process the flits. Generally, the components 118, 120, 122, and 124 may be devices that perform some function that is more specialized than the functions performed by the routers. For example, the components 118, 120, 122, and 124 may include memories, processors, input/output (I/O) devices such as ingress or egress ports, or any other electronic components. While the routers 114 may comprise processors and/or memories, the capacity and/or throughput of the processors and/or memories in the components 118, 120, 122, and 124 typically greatly exceed those of the routers 114 such that it would not be possible or practical for the routers 114 to perform the functions performed by the components 118, 120, 122, and 124. In cases where one of the components 118, 120, 122, and 124 is an ingress port, it may remove protocol layers from an incoming packet (e.g. an IP packet or Ethernet frame) and/or break the incoming packet into flits, if such is not done by the routers 114. In cases where one of the components 118, 120, 122, and 124 is an egress port, it may reassemble the flits into an outgoing packet (e.g. an IP packet or Ethernet frame), and/or add protocol layers to the outgoing packet, if such is not done by the routers 114.
The routers 114 and links 116 may be arranged in a ring topology, which may also be referred to as a ring network, as illustrated in
In a thirty-two router unidirectional ring network, the maximum latency is thirty-one router hops. That is, a flit must travel over a maximum of thirty-one links to reach its destination. Some flits will be injected into the ring network 200 close to their destination router, e.g., requiring one hop only, while other flits will be injected into ring network 200 relatively far away from their destination router, e.g., thirty-one hops away. Generally, the average latency in a thirty-two router unidirectional ring network is approximately fifteen router hops.
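The hop counts above can be checked with a short calculation. The sketch below assumes routers are numbered consecutively in the direction of travel and that every ordered pair of distinct routers is equally likely; the function names are illustrative only.

```python
from typing import Tuple


def ring_hops(src: int, dst: int, num_routers: int) -> int:
    """Links traversed from src to dst when flits travel in one direction only."""
    return (dst - src) % num_routers


def ring_latency_stats(num_routers: int) -> Tuple[int, float]:
    """Return (worst-case hops, mean hops) over all distinct source/destination pairs."""
    hops = [
        ring_hops(s, d, num_routers)
        for s in range(num_routers)
        for d in range(num_routers)
        if s != d
    ]
    return max(hops), sum(hops) / len(hops)


worst, mean = ring_latency_stats(32)
print(worst, mean)
```

For thirty-two routers this yields a worst case of thirty-one hops and a mean of sixteen hops over distinct pairs, consistent with the worst case stated above and close to the approximate average. Both figures grow linearly with the number of routers.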
It is desirable to significantly reduce latency of ring network 200 without significantly increasing complexity. One topology that may achieve these goals is presented in
Local routers may be routers with similar structure and functionality to routers used in conventional ring networks, such as ring network 200 in
A global router may be a router comprising two input ports and two output ports. Specifically, a global router may have only one input port for receiving flits from another global router, one input port for receiving flits from a local router, one output port for transmitting flits to another global router, and one output port for transmitting flits to a local router. There may be no input ports or output ports for connecting to off-ring components, as global routers may not be coupled to off-ring components, such as memories or processors on a chip. Examples of global routers are presented in
As compared to the ring network 200 in
Although
Moreover, a hierarchical ring may comprise any number of local routers. For example, a hierarchical ring may comprise 128 local routers. These routers may, for example, be divided into eight clusters of sixteen local routers each, in which case eight global routers may be used to interconnect the clusters. It is also not necessary for each cluster to contain the same number of local routers. In the example of 128 local routers above, there may, for example, be two clusters with eight local routers each and seven clusters with sixteen local routers each.
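The effect of clustering on hop count may be estimated with a simple model. The sketch below rests on assumptions that are not taken from this disclosure: each local ring is closed through exactly one global router, which sits after the last local router in the direction of travel; the global routers form a unidirectional ring in cluster order; and one hop is one link traversed. The function names are hypothetical.

```python
from typing import List, Tuple

Router = Tuple[int, int]  # (cluster index, position within the local ring)


def hierarchical_hops(src: Router, dst: Router, cluster_sizes: List[int]) -> int:
    """Links traversed between two local routers under the assumed wiring."""
    (sc, sp), (dc, dp) = src, dst
    if sc == dc:
        # Stay on the local ring, which has one extra node: its global router.
        return (dp - sp) % (cluster_sizes[sc] + 1)
    to_global = cluster_sizes[sc] - sp          # local hops up to the global router
    on_global = (dc - sc) % len(cluster_sizes)  # hops around the global ring
    from_global = dp + 1                        # local hops down to the destination
    return to_global + on_global + from_global


def worst_case_hops(cluster_sizes: List[int]) -> int:
    """Largest hop count over all ordered pairs of distinct local routers."""
    routers = [(c, p) for c, n in enumerate(cluster_sizes) for p in range(n)]
    return max(
        hierarchical_hops(s, d, cluster_sizes)
        for s in routers
        for d in routers
        if s != d
    )


uniform = [16] * 8             # eight clusters of sixteen local routers
uneven = [8, 8] + [16] * 7     # two clusters of eight, seven of sixteen
assert sum(uniform) == sum(uneven) == 128
print(worst_case_hops(uniform), worst_case_hops(uneven))
```

Under these assumptions the eight-by-sixteen arrangement has a worst case of thirty-nine hops, compared with 127 hops for a single unidirectional ring of 128 routers. A different attachment of local rings to global routers would change the exact figures.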
The hierarchical ring 300 presented in
Generally, intermediate rings may be introduced into a hierarchical ring to extend a hierarchical ring beyond two levels of rings to three or more levels of rings. Intermediate rings comprise intermediate routers, and an intermediate router may be a router comprising two input ports and two output ports. Specifically, an intermediate router may have only one input port for receiving flits from an adjacent intermediate router, one input port for receiving flits from an adjacent local router, one output port for transmitting flits to an adjacent intermediate router or an adjacent global router as the case may be, and one output port for transmitting flits to an adjacent local router. There may be no input ports or output ports for connecting to off-ring components, as intermediate routers may not be coupled to off-ring components, such as memories or processors on a chip. Examples of intermediate routers are presented in
Hierarchical ring 400 is but one of many possible configurations of hierarchical rings that include sixty-four local routers. Each of the configurations is within the scope of this application, and configurations with different numbers and/or configurations of local, intermediate, and global routers are also within the scope of this application.
Hierarchical rings may require new methods of routing because routers may be interconnected with more than one ring. For example, global router 412 in
The steps of
The embodiments of hierarchical ring networks disclosed herein are examples that utilize unidirectional links. For example, the embodiments of hierarchical networks 300 and 400 in
At least one embodiment is disclosed and variations, combinations, and/or modifications of the embodiment(s) and/or features of the embodiment(s) made by a person having ordinary skill in the art are within the scope of the disclosure. Alternative embodiments that result from combining, integrating, and/or omitting features of the embodiment(s) are also within the scope of the disclosure. Where numerical ranges or limitations are expressly stated, such express ranges or limitations should be understood to include iterative ranges or limitations of like magnitude falling within the expressly stated ranges or limitations (e.g., from about 1 to about 10 includes 2, 3, 4, etc.; greater than 0.10 includes 0.11, 0.12, 0.13, etc.). For example, whenever a numerical range with a lower limit, Rl, and an upper limit, Ru, is disclosed, any number falling within the range is specifically disclosed. In particular, the following numbers within the range are specifically disclosed: R=Rl+k*(Ru−Rl), wherein k is a variable ranging from 1 percent to 100 percent with a 1 percent increment, i.e., k is 1 percent, 2 percent, 3 percent, 4 percent, 5 percent, . . . , 50 percent, 51 percent, 52 percent, . . . , 95 percent, 96 percent, 97 percent, 98 percent, 99 percent, or 100 percent. Moreover, any numerical range defined by two R numbers as defined in the above is also specifically disclosed. Use of the term “optionally” with respect to any element of a claim means that the element is required, or alternatively, the element is not required, both alternatives being within the scope of the claim. Use of broader terms such as comprises, includes, and having should be understood to provide support for narrower terms such as consisting of, consisting essentially of, and comprised substantially of.
Accordingly, the scope of protection is not limited by the description set out above but is defined by the claims that follow, that scope including all equivalents of the subject matter of the claims. Each and every claim is incorporated as further disclosure into the specification and the claims are embodiment(s) of the present disclosure. The discussion of a reference in the disclosure is not an admission that it is prior art, especially any reference that has a publication date after the priority date of this application. The disclosures of all patents, patent applications, and publications cited in the disclosure are hereby incorporated by reference, to the extent that they provide exemplary, procedural, or other details supplementary to the disclosure.
While several embodiments have been provided in the present disclosure, it should be understood that the disclosed systems and methods might be embodied in many other specific forms without departing from the spirit or scope of the present disclosure. The present examples are to be considered as illustrative and not restrictive, and the intention is not to be limited to the details given herein. For example, the various elements or components may be combined or integrated in another system or certain features may be omitted, or not implemented.
In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, modules, techniques, or methods without departing from the scope of the present disclosure. Other items shown or discussed as coupled or directly coupled or communicating with each other may be indirectly coupled or communicating through some interface, device, or intermediate component whether electrically, mechanically, or otherwise. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and could be made without departing from the spirit and scope disclosed herein.
The present application claims priority to U.S. Provisional Patent Application 61/438,869, filed Feb. 2, 2011 by Rohit Sunkam Ramanujam, et al., and entitled “Method and Apparatus for Low-Latency Interconnection Networks Using Hierarchical Rings,” which is incorporated herein by reference as if reproduced in its entirety.
Number | Date | Country
---|---|---
61438869 | Feb 2011 | US