Examples of the present disclosure generally relate to electronic circuits and, in particular, to distributed multi-die routing in a multi-chip module (MCM).
Programmable integrated circuits (ICs) are often used to implement digital logic operations according to user configurable input. Example programmable ICs include complex programmable logic devices (CPLDs) and field programmable gate arrays (FPGAs). CPLDs often include several function blocks that are based on a programmable logic array (PLA) architecture with sum-of-products logic. A configurable interconnect matrix transmits signals between the function blocks.
One type of FPGA includes an array of programmable tiles. The programmable tiles comprise various types of logic blocks, which can include, for example, input/output blocks (IOBs), configurable logic blocks (CLBs), dedicated random access memory blocks (BRAM), multipliers, digital signal processing blocks (DSPs), processors, clock managers, delay lock loops (DLLs), bus or network interfaces such as Peripheral Component Interconnect Express (PCIe) and Ethernet, and so forth. Each programmable tile typically includes both programmable interconnect and programmable logic. The programmable interconnect typically includes a large number of interconnect lines of varying lengths interconnected by programmable interconnect points (PIPs). The programmable logic implements the logic of a user design using programmable elements that can include, for example, function generators, registers, arithmetic logic, and so forth.
Programmable ICs can be part of a larger multi-chip module (MCM) package. For example, an MCM package can include multiple IC die stacked on an interposer or multiple IC die stacked on one another. In such an MCM package, a programmable IC can include circuitry to facilitate die-to-die connectivity. However, such die-to-die connection circuits are often scarce resources, leading to routing congestion when signals compete for resources on the same areas of the programmable IC die.
Techniques for distributed multi-die routing in a multi-chip module (MCM) are described. In an example, a programmable integrated circuit (IC) includes external contacts configured to interface with a substrate and a plurality of configurable logic elements (CLEs) distributed across a programmable fabric. The programmable IC further includes interconnect circuits disposed between the plurality of CLEs and the external contacts. A plurality of the interconnect circuits is disposed in the plurality of CLEs.
In another example, a multi-chip module includes a substrate and a plurality of integrated circuit (IC) dies disposed on the substrate, the plurality of IC dies including a programmable IC die. The programmable IC die includes external contacts configured to interface with the substrate and a plurality of configurable logic elements (CLEs) distributed across a programmable fabric. The programmable IC die further includes interconnect circuits disposed between the plurality of CLEs and the external contacts. A plurality of the interconnect circuits is disposed in the plurality of CLEs. The multi-chip module further includes conductors disposed on the substrate electrically coupled to the interconnect circuits through the external contacts.
In another example, a method of communication in a multi-chip module having a programmable integrated circuit (IC) includes configuring first multiplexing logic in the programmable IC to couple a configurable logic element (CLE) slice to an external interconnect circuit in a CLE; and configuring second multiplexing logic in the external interconnect circuit to couple the CLE slice to a transmitter in the external interconnect circuit, the transmitter being coupled to an external contact of the programmable IC.
These and other aspects may be understood with reference to the following detailed description.
So that the manner in which the above recited features can be understood in detail, a more particular description, briefly summarized above, may be had by reference to example implementations, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical example implementations and are therefore not to be considered limiting of its scope.
To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures. It is contemplated that elements of one example may be beneficially incorporated in other examples.
Various features are described hereinafter with reference to the figures. It should be noted that the figures may or may not be drawn to scale and that the elements of similar structures or functions are represented by like reference numerals throughout the figures. It should be noted that the figures are only intended to facilitate the description of the features. They are not intended as an exhaustive description of the claimed invention or as a limitation on the scope of the claimed invention. In addition, an illustrated embodiment need not have all the aspects or advantages shown. An aspect or an advantage described in conjunction with a particular embodiment is not necessarily limited to that embodiment and can be practiced in any other embodiments even if not so illustrated, or if not so explicitly described.
Techniques for distributed multi-die routing in a multi-chip module (MCM) are described. In an example, a programmable integrated circuit (IC) includes external contacts configured to interface with a substrate, such as solder bumps configured to interface with an interposer or package substrate in an MCM. The programmable IC includes a plurality of configurable logic elements (CLEs) distributed across a programmable fabric. The programmable IC further includes interconnect circuits disposed between the plurality of CLEs and the external contacts, where the interconnect circuits are disposed in the CLEs. Since such interconnect circuits are coupled to the external contacts, such interconnect circuits are referred to herein as “external interconnect circuits.” Such external interconnect circuits are distinguishable from the general purpose interconnect in the programmable fabric of the programmable IC. The external interconnect circuits can be used to enable efficient die-to-die connectivity within an MCM. The external interconnect circuits can also be used to enable efficient intra-die connectivity within the programmable IC without the need to use the general purpose interconnect. These and other aspects are described below with respect to the following figures.
In some FPGAs, each programmable tile can include at least one programmable interconnect element (“INT”) 111 having connections to input and output terminals 120 of a programmable logic element within the same tile, as shown by examples included at the top of
In an example implementation, a CLB 102 can include a configurable logic element (“CLE”) 112 that can be programmed to implement user logic plus a single programmable interconnect element (“INT”) 111. A BRAM 103 can include a BRAM logic element (“BRL”) 113 in addition to one or more programmable interconnect elements. Typically, the number of interconnect elements included in a tile depends on the height of the tile. In the pictured example, a BRAM tile has the same height as five CLBs, but other numbers (e.g., four) can also be used. A DSP tile 106 can include a DSP logic element (“DSPL”) 114 in addition to an appropriate number of programmable interconnect elements. An IOB 104 can include, for example, two instances of an input/output logic element (“IOL”) 115 in addition to one instance of the programmable interconnect element 111. As will be clear to those of skill in the art, the actual I/O pads connected, for example, to the I/O logic element 115 typically are not confined to the area of the input/output logic element 115.
In the pictured example, a horizontal area near the center of the die (shown in
Some FPGAs utilizing the architecture illustrated in
Note that
The configuration memory 152 can be loaded with a configuration bitstream for programming (“configuring”) the programmable fabric 150. For example, a configuration bitstream can be loaded into the configuration memory 152 to configure the CLEs 112 of the programmable fabric 150, as described herein.
The L mux(s) 164 provide a layer of multiplexing between various connections to facilitate high-speed connections that do not have to traverse local interconnect to transfer from one route to the next. That is, a given connection can traverse one route and hop through one L mux to get to the next route, instead of having to go through multiple multiplexer stages in between.
The CLE slices 160A and 160B can be coupled to the external INT 156. The L mux(s) 164 can be controlled to electrically couple a given CLE slice of the CLE slices 160 to a given external contact of the external contacts 158, which can be coupled to another IC die (inter-die connectivity) or to another one of the external contacts 158 (intra-die connectivity) using an external conductor. In an example, the programmable fabric 150 can include multiplexer logic referred to herein as B mux(s) 162. The B mux(s) 162 can be coupled between the external INT 156 and one or more other CLEs 112. The B mux(s) 162 can be used to increase flexibility, allowing more CLE resources to utilize the external INT 156 for inter-die or intra-die connectivity. The L mux(s) 164, the R mux(s) 170, and the B mux(s) 162 can be controlled using bits of the configuration memory 152 and/or using signals from the CLEs 112.
In one approach, a programmable fabric of an FPGA can include dedicated columns of external interconnect circuits to facilitate die-to-die connectivity. One problem with such an approach is that since die-to-die connections are scarce resources, routing congestion can develop in areas surrounding the external interconnect columns. In examples described herein, the external interconnect circuits 156 are implemented within CLEs 112 and can be distributed throughout the programmable fabric 150. This reduces local routing congestion by distributing the inter-die and/or intra-die connections over a much wider area than can be achieved in the dedicated column approach. Further, the external interconnect circuits 156 include dedicated circuitry to facilitate the inter-die and/or intra-die connections, including the driver(s) 166, the receiver(s) 168, and interconnect resources (e.g., the L mux(s) 164 and the R mux(s) 170). As part of the CLEs 112, the external interconnect circuits 156 can leverage the CLEs' interconnect and register resources at lower cost than in the dedicated column approach.
In the example, the CLE 112 includes a B mux 162. In other examples, the B mux 162 can be omitted. An input of the B mux 162 is coupled to an output of the CLE slice 160. Other input(s) of the B mux 162 can be coupled to terminal(s) 212, which in turn can be coupled to other CLE(s) 112. An output of the B mux 162 is coupled to an input of an L mux 164. If the B mux 162 is omitted, then the output of the CLE slice 160 can be directly coupled to an input of the L mux 164.
Other input(s) of the L mux 164 can be coupled to terminal(s) 214, which in turn can be coupled to other CLE(s) 112 or other CLE slices 160. Another input of the L mux 164 can be coupled to an output of an R mux 170R. An output of the L mux 164 can be coupled to an input of an R mux 170T. An output of the R mux 170T can be coupled to an input of a driver 166. An output of the driver 166 can be coupled to an external contact 158. The R mux 170T can include other output(s) coupled to other driver(s) of the external interconnect circuit 156 (e.g., redundant drivers).
An input of a receiver 168 can be coupled to the external contact 158. An output of the receiver 168 can be coupled to an input of the R mux 170R. An output of the R mux 170R can be coupled to an input of the L mux 164 and to an input of the multiplexer logic 208 in the CLE 112. The R mux 170R can include other input(s) coupled to other receiver(s) of the external interconnect circuit 156 (e.g., redundant receivers).
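The transmit and receive paths described above can be summarized as a small connectivity model. The following Python sketch is illustrative only: the signal names are hypothetical labels derived from the reference numerals in the text, and each multiplexer is modeled as a simple selection among its listed inputs, which is an assumption made for illustration rather than a description of the actual circuit.

```python
# Hypothetical labels built from the reference numerals in the text.
# Each entry lists the sources a multiplexer can select from.
MUX_INPUTS = {
    "b_mux_162": ["cle_slice_160", "terminal_212"],
    "l_mux_164": ["b_mux_162", "terminal_214", "r_mux_170R"],
    "r_mux_170T": ["l_mux_164"],
    "r_mux_170R": ["receiver_168"],
}


def resolve_source(node, selects):
    """Follow mux selections upstream from `node` to the first non-mux source."""
    while node in MUX_INPUTS:
        choice = selects[node]
        if choice not in MUX_INPUTS[node]:
            raise ValueError(f"{choice!r} is not an input of {node!r}")
        node = choice
    return node


# Transmit: the CLE slice reaches the driver 166 via B mux -> L mux -> R mux 170T.
transmit = {
    "b_mux_162": "cle_slice_160",
    "l_mux_164": "b_mux_162",
    "r_mux_170T": "l_mux_164",
}
driver_source = resolve_source("r_mux_170T", transmit)

# Receive: the R mux 170R forwards the output of the receiver 168.
receive_source = resolve_source("r_mux_170R", {"r_mux_170R": "receiver_168"})
```

In this model, omitting the B mux 162 would correspond to listing the CLE slice directly among the inputs of the L mux 164.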
In operation, the B mux 162 (if present) and the L mux 164 can be controlled to couple the CLE slice 160 to the external contact 158 through the driver 166. Alternatively, the B mux 162 and the L mux 164 can be controlled to couple another CLE coupled to a terminal 212 to the external contact 158 through the driver 166. Alternatively, the L mux 164 can be controlled to couple another CLE or another CLE slice to the external contact 158 through the driver 166. The CLE slice 160 can also be coupled to the external contact 158 through the receiver 168. The driver 166 and the receiver 168 can each be three-state devices having a high-impedance state so that only one of the driver 166 and the receiver 168 is electrically coupled to the external contact 158 at a time.
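The constraint that only one of the driver 166 and the receiver 168 is coupled to the external contact 158 at a time can be expressed as a simple check. The sketch below is a hypothetical software model, not circuitry from the disclosure; the `ContactMode` names, and in particular the idle state in which both devices are in high impedance, are assumptions introduced for illustration.

```python
from enum import Enum


class ContactMode(Enum):
    TRANSMIT = "transmit"  # driver 166 enabled, receiver 168 in high impedance
    RECEIVE = "receive"    # receiver 168 enabled, driver 166 in high impedance
    IDLE = "idle"          # assumed state: both devices in high impedance


def contact_mode(driver_enabled, receiver_enabled):
    """Return the contact's mode, rejecting simultaneous driver/receiver coupling."""
    if driver_enabled and receiver_enabled:
        raise ValueError("driver and receiver must not both be coupled to the contact")
    if driver_enabled:
        return ContactMode.TRANSMIT
    if receiver_enabled:
        return ContactMode.RECEIVE
    return ContactMode.IDLE
```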
In the example shown in
In the example, the IC die 302 includes three instances of an external interconnect circuit 156A, 156B, and 156C. The external interconnect circuit 156C is coupled to the IC die 304 through a conductor 314 of the substrate 306. The external interconnect circuit 156C can be used for inter-die communication. The external interconnect circuits 156A and 156B are coupled to each other through a conductor 312 of the substrate 306. The external interconnect circuits 156A and 156B can be used for intra-die communication. In a practical MCM, the FPGA die can include any number of external interconnect circuits, each of which can be coupled to conductors of the substrate 306 for inter-die communication and/or intra-die communication. Although the conductors 312 and 314 of the substrate 306 are shown on a single layer, the substrate 306 can include any number of conductive layers having conductors coupled to external interconnect circuits of an FPGA die. Moreover, in some examples, external interconnect circuits described herein can be used in a single FPGA package to implement only intra-die communication.
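The distinction between inter-die and intra-die use of the substrate conductors can be illustrated with a short Python sketch. It is a hypothetical model: the `Endpoint` type and `classify` function are introduced here for illustration, and because the text does not name the circuit on the IC die 304 that terminates the conductor 314, the label "io" is a placeholder.

```python
from typing import NamedTuple


class Endpoint(NamedTuple):
    die: str      # reference numeral of the IC die
    circuit: str  # external interconnect circuit (or other I/O) on that die


def classify(a, b):
    """A conductor joining two endpoints is intra-die if both sit on one die."""
    return "intra-die" if a.die == b.die else "inter-die"


conductors = {
    "312": (Endpoint("302", "156A"), Endpoint("302", "156B")),
    # The endpoint on die 304 is not named in the text; "io" is a placeholder.
    "314": (Endpoint("302", "156C"), Endpoint("304", "io")),
}
kinds = {name: classify(a, b) for name, (a, b) in conductors.items()}
```

Under these assumptions, the conductor 312 is classified as intra-die and the conductor 314 as inter-die, matching the roles described above.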
The output of the receiver 508N is coupled to inputs of the L muxes 502S, 502E, and 502W. The output of the receiver 508S is coupled to inputs of the L muxes 502N, 502E, and 502W. The output of the receiver 508E is coupled to inputs of the L muxes 502N, 502S, and 502W. The output of the receiver 508W is coupled to inputs of the L muxes 502N, 502S, and 502E. The L mux 502N includes one or more additional inputs 510N; the L mux 502S includes one or more additional inputs 510S; the L mux 502E includes one or more additional inputs 510E; and the L mux 502W includes one or more additional inputs 510W. The additional inputs 510 can be coupled to outputs of CLE slices either in one CLE or across different CLEs. In another example, the additional inputs 510 can be coupled to B muxes disposed between CLE slices and the external interconnect circuitry. In the example shown in
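The receiver-to-L-mux coupling described above follows a regular pattern: each receiver 508 feeds the L muxes 502 of the other three directions. The following Python sketch generates that pattern. The string labels are hypothetical names built from the reference numerals, and the additional inputs 510 are left out for brevity.

```python
DIRECTIONS = ("N", "S", "E", "W")


def receiver_inputs_per_l_mux():
    """Map each L mux 502x to the receivers 508y (y != x) that feed it."""
    return {
        f"l_mux_502{d}": [f"receiver_508{o}" for o in DIRECTIONS if o != d]
        for d in DIRECTIONS
    }


fanin = receiver_inputs_per_l_mux()
```

In this model, no L mux takes an input from the receiver of its own direction, so a signal received on one side can be forwarded to any of the other three sides.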
While the foregoing is directed to specific examples, other and further examples may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.