Crossbars are used to connect each of a first set of ports with each of a second set of ports. The ports are generally connected via a full mesh within the crossbar. For example, the crossbar may include source ports and destination ports. Each source port is connected via the mesh with each destination port. Although this allows full connectivity between the ports, the number of wires within the mesh grows with the product of the port counts, that is, roughly quadratically with the number of ports. As a result, a larger number of wires must be routed within a space that is desired to remain small. Consequently, scaling the crossbar may be challenging. Accordingly, what is needed is a mechanism for transferring data between large numbers of ports.
Various embodiments are disclosed in the following detailed description and the accompanying drawings.
The disclosure can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor. In this specification, these implementations, or any other form that the disclosure may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the disclosure. Unless stated otherwise, a component such as a processor or a memory described as being configured to perform a task may be implemented as a general component that is temporarily configured to perform the task at a given time or a specific component that is manufactured to perform the task. As used herein, the term ‘processor’ refers to one or more devices, circuits, and/or processing cores configured to process data, such as computer program instructions.
A detailed description of one or more embodiments is provided below along with accompanying figures that illustrate the principles of the disclosure. The disclosure is described in connection with such embodiments, but the disclosure is not limited to any embodiment. The scope of the disclosure is limited only by the claims and the disclosure encompasses numerous alternatives, modifications and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the disclosure. These details are provided for the purpose of example and the disclosure may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the disclosure has not been described in detail so that the disclosure is not unnecessarily obscured.
Various applications require each of a first set of circuit elements to be connected to each of a second set of circuit elements. For example, each processing engine in a set of processing engines may be desired to be connected to each cache in a set of caches. Crossbars are one mechanism for accomplishing this connection. A crossbar generally includes multiple data ports and a full mesh interconnecting the data ports. Data ports of a given type have connectivity to any of the data ports of another type through the full mesh. The data ports may be connected to the other elements in an integrated circuit between which data is desired to be transferred. A crossbar is also generally laid out such that its data ports align with the ports of the elements the crossbar interfaces with. For example, a crossbar may be used to connect a set of processing engines with a set of memories, such as caches. The data ports on a first side of the crossbar are connected with the processing engines’ ports, while the data ports on the opposite side of the crossbar are connected with the corresponding caches’ ports. The full mesh within the crossbar connects each data port for a processing engine with all data ports for the caches, and vice versa.
Although the crossbar allows for connectivity between elements of an integrated circuit, there are drawbacks. The number of wires in the full mesh grows with the product of the port counts, that is, roughly quadratically with the number of data ports. Further, each data port may carry hundreds of signals. Thus, the number of wires increases rapidly with the number of data ports. For example, suppose there are three types of data ports (A, B, C) which are desired to be connected (each data port of each type connected to each data port of another type). The number of wires routed in the full mesh for the crossbar is (bus width)*[number of A data ports*(number of B data ports + number of C data ports) + number of B data ports*(number of C data ports + number of A data ports) + number of C data ports*(number of A data ports + number of B data ports)]. If the number of A data ports is 8, the number of B data ports is 8, the number of C data ports is 2, and the bus width is 500 wires, the number of wires routed is 500*[8*10 + 8*10 + 2*16] = 96,000. Thus, the number of wires required to be routed in the full mesh grows quadratically with the number of data ports. If data ports of the same type are desired to be connected (e.g. every data port A connected to every other data port A), the situation is further complicated. As a result, providing the crossbar for a larger number of data ports is challenging, particularly if the space allocated for the crossbar is small. Accordingly, what is needed is a mechanism for scaling the crossbar to larger numbers of data ports.
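The full-mesh wire count above can be checked with a short calculation. The following is a minimal sketch in Python; the function name `full_mesh_wires` and its parameters are illustrative and are not part of the disclosure.

```python
def full_mesh_wires(port_counts, bus_width):
    """Wires in a full mesh in which every data port of one type is
    connected to every data port of each other type.

    port_counts: number of data ports of each type, e.g. [A, B, C].
    bus_width:   wires per connection.
    """
    total_ports = sum(port_counts)
    # Each port of a given type connects to all ports of the other types.
    connections = sum(n * (total_ports - n) for n in port_counts)
    return bus_width * connections


# Example from the text: 8 A ports, 8 B ports, 2 C ports, 500-wire bus.
print(full_mesh_wires([8, 8, 2], 500))    # 96000
# Doubling every port count quadruples the wire count.
print(full_mesh_wires([16, 16, 4], 500))  # 384000
```

The second call illustrates why scaling is difficult: doubling the number of data ports of every type multiplies the number of wires by four.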
A system that routes data is described. The system includes a first group of data ports of one or more first elements of an integrated circuit and a second group of data ports of one or more second elements of the integrated circuit. The system also includes a point-to-point connection between a first data port of the first group of data ports and a second data port of the second group of data ports. In addition, the system includes, for the first data port, a distinct crossbar connected to every data port of the second group of data ports. In some embodiments, the distinct crossbar for the first data port includes a pipeline having multiple pipeline stages that connect to each data port of the second group of data ports.
A method includes providing data from a first data port to a second data port. The first data port is one of a first group of data ports for one or more first elements of an integrated circuit. The second data port is one of a second group of data ports of one or more second elements of the integrated circuit. The data is provided via a distinct crossbar connected from the first data port to every data port of the second group of data ports. The method also includes providing a valid signal from the first data port to the second data port. The valid signal is provided via a point-to-point connection between the first data port and the second data port. The point-to-point connection is one of a plurality of point-to-point connections between each of the first group of data ports and each of the second group of data ports. The valid signal and the data are coincident at the second data port.
A method for providing a system that routes data is described. The method includes providing a first group of data ports of one or more first elements of an integrated circuit. The method also includes providing a second group of data ports of one or more second elements of the integrated circuit. A point-to-point connection is provided. The point-to-point connection is between a first data port of the first group of data ports and a second data port of the second group of data ports. For the first data port, a distinct crossbar is provided. The distinct crossbar is connected to every data port of the second group of data ports.
Routing system 110 has data ports 140-0, 140-1, 140-2, 140-3, 140-4, 140-5, 140-6, and 140-7 (collectively or generically data port(s) 140) and data ports 150-0, 150-1, 150-2, and 150-3 (collectively or generically data port(s) 150) corresponding to elements 160 and 170, respectively. Although each is depicted as a single line, data ports 140 and 150 generally each include multiple wires. Routing system 110 allows for transfer of data from each data port 140, and thus each element 160, to all data ports 150, and thus all elements 170. Similarly, routing system 110 allows for transfer of data from each data port 150, and thus each element 170, to all data ports 140, and thus all elements 160. Routing system 110 may be viewed as functioning as a crossbar. Thus, routing system 110 may be termed a crossbar. However, instead of the mesh connections of a crossbar, routing system 110 includes point-to-point connections 120 and distinct crossbars 130.
Point-to-point connections 120 provide a point-to-point connection from each data port 150 to every data port 140. Similarly, point-to-point connections 120 provide a point-to-point connection from each data port 140 to every data port 150. For example,
Routing system 110 also includes distinct crossbars 130. Distinct crossbars 130 allow for data transfer between data ports 140 and 150, and thus between elements 160 and 170. Although termed “crossbars”, distinct crossbars 130 need not be implemented as a crossbar. Instead, distinct crossbars 130 have a bus structure. In some embodiments, distinct crossbars 130 utilize individual pipelines between each (source) data port 150 and every (destination) data port 140, and vice versa. For example,
Routing system 110 allows for the exchange of data between elements 160 and 170 via point-to-point connections 120 and distinct crossbars 130. For example, to transfer data from element 170 (e.g. a cache) via data port 150-0, routing system 110 provides a valid signal on point-to-point connections 120-0 for each data port 140 that will receive data. Further, data from element 170 is transferred from port 150-0 over distinct crossbar 130-0. Valid signals provided via point-to-point connections 120-0 may be timed such that elements 160 are notified to pull data from the corresponding port 140-0, 140-1, 140-2, 140-3, 140-4, 140-5, 140-6 or 140-7 at the appropriate time. In some embodiments, the valid signal provided via point-to-point connections 120-0 to a particular port 140 is coincident with the data provided via distinct crossbar 130-0 at that particular port 140. For example, suppose data from port 150-0 is transferred to data port 140-3 and to data port 140-4. This data is present at data ports 140-3 and 140-4 at times t1 and t2, respectively, which may correspond to clock cycle 3 and clock cycle 4 after the data is sent from data port 150-0. In some embodiments, valid signals from data port 150-0 provided via point-to-point connections 120-0 are also present at data ports 140-3 and 140-4 at times t1 and t2, respectively. Data may then be pulled, or otherwise received, from data ports 140-3 and 140-4. In some embodiments, a credit system is also used by source data ports 140 and/or 150 to determine whether data may be sent to a particular destination data port 150 and/or 140, respectively. In such embodiments, the destination port provides a credit release signal, indicating that data may be received on the corresponding data port. In the example above, destination data ports 140-3 and 140-4 each provide a credit release signal to data port 150-0 in response to data being pulled from data ports 140-3 and 140-4, respectively, by the corresponding elements 160.
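The timing relationship between the data and the valid signals in this example can be illustrated with a small behavioral model. The following Python sketch rests on several assumptions that are not stated in the disclosure: the distinct crossbar is modeled as a pipeline with one stage per destination data port, data advances one stage per clock cycle, stage k is adjacent to destination port k, and each valid signal is delayed so that it arrives in the same cycle as the data. The function name `simulate_transfer` is hypothetical.

```python
def simulate_transfer(payload, destinations, num_stages=8):
    """Behavioral sketch of one transfer from a single source data port.

    Returns {destination index: (cycle, payload)} for every destination
    at which the valid signal and the data are coincident.
    """
    pipeline = [None] * num_stages  # stage k sits next to destination port k
    # Assumption: the valid signal for destination d arrives at cycle d,
    # matching the pipeline latency to stage d.
    valid_cycle = {d: d for d in destinations}
    received = {}
    for cycle in range(num_stages):
        # Advance the pipeline one stage; the payload enters at cycle 0.
        pipeline = [payload if cycle == 0 else None] + pipeline[:-1]
        for d in destinations:
            valid = valid_cycle[d] == cycle
            if valid and pipeline[d] is not None:
                # Valid and data coincide, so the destination pulls the data.
                received[d] = (cycle, pipeline[d])
    return received


# Data sent toward destination ports 3 and 4, as in the example above.
print(simulate_transfer("D", [3, 4]))  # {3: (3, 'D'), 4: (4, 'D')}
```

Under these assumptions, the data is pulled at destination port 3 in clock cycle 3 and at destination port 4 in clock cycle 4, consistent with the example.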
In some embodiments, the number of credits is based on a round trip time added to an overhead for the source data port and the destination data port. Consequently, for port 150-0, the credits corresponding to data port 140-3 may differ from the credits corresponding to data port 140-4. In this manner, routing system 110 may route data between elements 160 and 170. Further, routing system 110 may be extended to more than two types of data ports.
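One way to picture such a credit system is a per-destination counter kept at the source data port. The sketch below is illustrative only. The class name `CreditCounter`, the use of clock cycles as the unit, and the specific round trip and overhead values are assumptions, not details of the disclosure.

```python
class CreditCounter:
    """Credits a source data port holds for one destination data port."""

    def __init__(self, round_trip_cycles, overhead):
        # Assumption: the credit limit is the round trip time plus an overhead.
        self.limit = round_trip_cycles + overhead
        self.available = self.limit

    def can_send(self):
        return self.available > 0

    def send(self):
        """Consume one credit when data is sent to the destination."""
        if not self.can_send():
            raise RuntimeError("no credit available for this destination")
        self.available -= 1

    def release(self):
        """Handle a credit release signal from the destination."""
        if self.available < self.limit:
            self.available += 1


# Hypothetical values: a farther destination has a longer round trip.
credits_to_port_3 = CreditCounter(round_trip_cycles=6, overhead=2)
credits_to_port_4 = CreditCounter(round_trip_cycles=8, overhead=2)
print(credits_to_port_3.limit, credits_to_port_4.limit)  # 8 10
```

Because the round trip time depends on the distance between the source and the destination, the two counters start with different limits, which mirrors the statement that credits for data port 140-3 may differ from those for data port 140-4.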
Using routing system 110, system 100 may be capable of routing data between the desired elements 160 and 170, such as processing engines and caches. Moreover, system 100 may be more readily scaled to larger numbers of elements 160 and/or 170. Routing system 110 uses point-to-point connections 120 in combination with distinct crossbars 130 having a bus structure (e.g. pipelines). Because routing system 110 uses distinct crossbars 130 in combination with point-to-point connections 120, routing system 110 includes one distinct crossbar 130 per data port 140 and 150. Thus, the number of wires utilized for routing system 110 increases linearly with the number of data ports. Stated differently, the number of wires routed is (bus width)*[total number of tracks] = (bus width)*[∑(number of data ports)]. For example, suppose there are three types of data ports (A, B, C) which are desired to be connected in a manner analogous to routing system 110. This is analogous to the example described above with respect to a full mesh. The number of wires routed for distinct crossbars 130 of routing system 110 is (bus width)*[number of A data ports + number of B data ports + number of C data ports]. If the number of A data ports is 8, the number of B data ports is 8, the number of C data ports is 2, and the bus width is 500 wires, the number of wires routed is 500*[8 + 8 + 2] = 9,000. The inclusion of the point-to-point connections between data ports does not markedly change the number of wires required. Thus, routing system 110 scales much more readily with the number of ports. Further, routing system 110 may occupy a smaller amount of space as routing system 110 is scaled to larger numbers of data ports. Consequently, routing system 110 may significantly improve fabrication, scalability, and performance, particularly for systems 100 using large numbers of elements 160 and/or 170.
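The linear scaling described above can be illustrated numerically. The following Python sketch uses hypothetical function names. It also assumes that each point-to-point connection carries two wires (a valid bit and a credit release signal) per source and destination pair; the text does not give this figure, so the resulting control wire count is only indicative.

```python
def distinct_crossbar_wires(port_counts, bus_width):
    """One distinct crossbar (one bus-width track) per data port."""
    return bus_width * sum(port_counts)


def point_to_point_wires(port_counts, wires_per_connection=2):
    """Assumed control wiring: a few wires per source/destination pair."""
    total_ports = sum(port_counts)
    pairs = sum(n * (total_ports - n) for n in port_counts)
    return wires_per_connection * pairs


ports = [8, 8, 2]  # A, B and C data ports, as in the example above
data_wires = distinct_crossbar_wires(ports, 500)
control_wires = point_to_point_wires(ports)
print(data_wires)                  # 9000
print(control_wires)               # 384 (under the two-wire assumption)
print(data_wires + control_wires)  # 9384
# Doubling every port count doubles the data wires.
print(distinct_crossbar_wires([16, 16, 4], 500))  # 18000
```

Under these assumptions, the point-to-point connections add a few hundred wires to the 9,000 data wires, which is consistent with the statement that they do not markedly change the total.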
Routing system 210 also includes ports 280 corresponding to elements 290 of system 200. For example, elements 290 may be other processors, such as systems on a chip (SOCs), memories, bridges, or other components of system 200 desired to be connected with elements 260 and/or 270 via routing system 210. Thus, connection to three types of elements, 260, 270 and 290 is provided via routing system 210. Point-to-point connections 220 and distinct crossbars 230 also include structures for ports 280. For example, point-to-point connections 220 include additional connections to ports 280. Each distinct crossbar 230 provided for ports 240 and 250 may include additional pipeline stages for data transfer to ports 280. Further, distinct crossbars 230 include additional distinct crossbars for ports 280. Thus, routing system 110 may be expanded to additional ports and/or additional types of elements for which connection is desired.
System 200 shares the benefits of system 100. Routing system 210 is capable of routing data between the desired elements 260, 270, and 290. Because routing system 210 uses distinct crossbars 230 in combination with point-to-point connections 220, routing system 210 includes one distinct crossbar 230 per data port 240, 250, and/or 280. The complexity of routing system 210 increases linearly with the number of data ports. Thus, routing system 210 scales much more readily with the number of data ports. Moreover, routing system 210 may occupy less space. Consequently, routing system 210 may significantly improve fabrication and performance, particularly for systems 200 using large numbers of elements 260, 270 and/or 290.
Routing system 310 includes distinct crossbars 330 and point-to-point connections (not shown for clarity). Also shown are source data ports 350-0, 350-1, 350-2, 350-3, 350-4, 350-5, 350-6, and 350-7 (collectively or generically port(s) 350), destination data ports 340-0, 340-1, 340-2, 340-3, 340-4, 340-5, 340-6, and 340-7 (collectively or generically port(s) 340), and data port 380. The arrows for ports 340, 350, and 380 indicate that information may flow in either direction for a particular port 340, 350, and 380.
System 300 shares the benefits of system(s) 100 and/or 200. Routing system 310 is capable of routing data between the desired elements using distinct pipelines, such as data pipelines 330-0 and 330-7, as distinct crossbars. Routing system 310 uses pipelines 330-0 and 330-7 (i.e. distinct crossbars) in combination with point-to-point connections (not shown in
Routing system 410 includes distinct crossbars such as pipelines (not shown in
System 400 shares the benefits of system(s) 100, 200 and/or 300. Routing system 410 is capable of routing data between the desired elements using distinct pipelines, or distinct crossbars, and point-to-point connections 420. The complexity of routing system 410 increases linearly with the number of data ports. Thus, routing system 410 scales much more readily with the number of data ports. Moreover, routing system 410 may occupy less space. Consequently, routing system 410 may significantly improve fabrication and performance.
At 504, valid signal(s) are provided from the source data port(s) to each of the destination data ports that receive data. The valid signal(s) of 504 are provided via point-to-point connections between the source data ports and the destination data ports. In some embodiments, 502 and 504 are performed such that the valid signal and the data are coincident at particular destination data ports receiving data. As a result, destination data ports may be notified of the presence of data that should be pulled and provided to the corresponding elements. Data may be pulled from the appropriate pipeline stage(s). In response to the data being pulled, credit release signal(s) may be sent from the destination port(s) to the source data port(s) via point-to-point connections. Credit release signal(s) are received at the source port(s) from the destination port(s), at 506. Thus, the source port(s) may be notified of the destination ports’ ability to receive additional data.
For example, method 500 may be used in connection with system 300 of
Using method 500, data may be routed between the desired elements using distinct pipelines and point-to-point connections. A routing system having a complexity that increases linearly with the number of data ports may be utilized. Thus, the benefits of such a routing system may be achieved.
Point-to-point connections are provided between each of the first group of data ports and every data port of the second group of data ports, at 604. In some embodiments, each point-to-point connection may be capable of transmitting only limited information, such as a valid bit and a credit release signal.
A distinct crossbar is provided for each of the data ports, at 606. The distinct crossbar provides data from each of the first group of data ports to every data port of the second group of data ports. For example, a pipeline from a data port of the first group of data ports including pipeline stages for each of the second group of data ports may be provided at 606. In some embodiments, 606 may be repeated to provide distinct crossbars (e.g. pipelines) for each of the second group of data ports. This may allow for transfer of data from the second group of data ports to the first group of data ports.
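The structure produced by 604 and 606 can be pictured as a simple data model. The following Python sketch is illustrative only: the function name `build_routing_system` and the string port labels are hypothetical, and each distinct crossbar is represented merely as the ordered list of destination ports it reaches.

```python
def build_routing_system(first_group, second_group):
    """Return (point_to_point, crossbars) for two groups of data ports.

    point_to_point: list of (source, destination) connections.
    crossbars:      one entry per source port, listing the destination
                    ports its distinct crossbar (e.g. pipeline) reaches.
    """
    # 604: point-to-point connections between every pair, in each direction.
    point_to_point = [(s, d) for s in first_group for d in second_group]
    point_to_point += [(s, d) for s in second_group for d in first_group]
    # 606: one distinct crossbar per data port of the first group, then
    # repeated for each data port of the second group.
    crossbars = {s: list(second_group) for s in first_group}
    crossbars.update({s: list(first_group) for s in second_group})
    return point_to_point, crossbars


first = [f"140-{i}" for i in range(8)]
second = [f"150-{i}" for i in range(4)]
p2p, xbars = build_routing_system(first, second)
print(len(xbars))  # 12, one distinct crossbar per data port
print(len(p2p))    # 64 point-to-point connections
```

In this model the number of distinct crossbars equals the number of data ports, while only the narrow point-to-point connections grow with the product of the group sizes.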
For example, method 600 may be used in connection with system 300 of
Using method 600, a system for routing data between the desired elements may be fabricated. The routing system uses distinct pipelines and point-to-point connections. A routing system having a complexity that increases linearly with the number of data ports may be provided. Thus, the benefits of such a routing system may be achieved.
Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, the disclosure is not limited to the details provided. There are many alternative ways of implementing the disclosure. The disclosed embodiments are illustrative and not restrictive.