This application is related to the following co-pending U.S. patent applications: U.S. patent application entitled “Interface Device Reset,” by Dai D. Tran, et al., U.S. patent application entitled “Configurable Interface” by Paige A. Kolze, et al., U.S. patent application entitled “Hard Macro-to-User Logic Interface,” by Laurent Stadler, and U.S. patent application entitled “Reconfiguration of a Hard Macro via Configuration Registers,” by Jerry A. Case, each of which was filed on the same day as the present application and each of which is assigned to the assignee of the present application. The entire contents of each of the above-referenced co-pending patent applications are incorporated herein by reference for all purposes.
One or more aspects of the invention relate generally to integrated circuits, and, more particularly, to lane configuration of an interface device of an integrated circuit.
Programmable logic devices (“PLDs”) are a well-known type of integrated circuit that can be programmed to perform specified logic functions. One type of PLD, the field programmable gate array (“FPGA”), typically includes an array of programmable tiles. These programmable tiles can include, for example, input/output blocks (“IOBs”), configurable logic blocks (“CLBs”), dedicated random access memory blocks (“BRAMs”), multipliers, digital signal processing blocks (“DSPs”), processors, clock managers, delay lock loops (“DLLs”), and so forth. Notably, as used herein, “include” and “including” mean including without limitation.
One such FPGA is the Xilinx Virtex™ FPGA available from Xilinx, Inc., 2100 Logic Drive, San Jose, Calif. 95124. Another type of PLD is the Complex Programmable Logic Device (“CPLD”). A CPLD includes two or more “function blocks” connected together and to input/output (“I/O”) resources by an interconnect switch matrix. Each function block of the CPLD includes a two-level AND/OR structure similar to those used in Programmable Logic Arrays (“PLAs”) and Programmable Array Logic (“PAL”) devices. Other PLDs are programmed by applying a processing layer, such as a metal layer, that programmably interconnects the various elements on the device. These PLDs are known as mask programmable devices. PLDs can also be implemented in other ways, for example, using fuse or antifuse technology. The terms “PLD” and “programmable logic device” include but are not limited to these exemplary devices, as well as encompassing devices that are only partially programmable.
For purposes of clarity, FPGAs are described below though other types of PLDs may be used. FPGAs may include one or more embedded microprocessors. For example, a microprocessor may be located in an area reserved for it, generally referred to as a “processor block.”
Heretofore, performance of a design instantiated in programmable logic of an FPGA (“FPGA fabric”) using a Peripheral Component Interconnect (“PCI”) Express (“PCIe”) internal to such FPGA was limited to performance of a PCIe design for instantiation in FPGA fabric (“soft core”). Additional details regarding examples of PCIe soft cores are available from Xilinx, Inc. of San Jose, Calif. and are described in “PCI Express PIPE Endpoint LogiCORE Product Specification,” DS321 (v1.1), Apr. 11, 2005 and in “PCI Express Endpoint Cores v3.4 Product Specification,” DS506, Feb. 15, 2007, both available from Xilinx, Inc.
PCIe soft cores have been implemented as an “Endpoint” architecture. Target applications for such Endpoint architecture include: test equipment, consumer graphics boards, medical imaging equipment, data communication networks, telecommunication networks, broadband deployments, cross-connects, workstation and mainframe backbones, network interface cards, chip-to-chip and backplane interconnect, crossbar switches, wireless base stations, high bandwidth digital video, and high bandwidth server applications, among other known add-in cards, host bus adapters, and other known applications.
Accordingly, it would be desirable and useful to provide a PCIe Endpoint internal to an FPGA having enhanced performance over that of a PCIe soft core instantiated in FPGA fabric.
One or more aspects of the invention generally relate to integrated circuits, and, more particularly, to lane configuration of an interface device of an integrated circuit.
An aspect of the invention is an integrated circuit including a core for tiling a portion thereof with a first version of the core and a second version of the core. The core is an application specific circuit version of an interface device. The first version and the second version in combination have a sharable interface. Each of the first version and the second version has N lanes. The first version is a primary version and the second version is a secondary version responsive to a shared interface mode. The N lanes of the second version are combined with the N lanes of the first version via the sharable interface to provide 2-by-N lanes of input/output to the first version.
Another aspect of the invention is a method for lane configuration. A portion of an integrated circuit is tiled by repeated application of a core having N lanes, where the core is an application specific circuit version of an interface device. The core is configurable for either a shared interface mode or a non-shared interface mode. At least two instances of the core are placed in the shared interface mode to provide a shared interface capable of at least 2-by-N lanes.
Yet another aspect of the invention is a programmable logic device including M instances of a core tiled in a column of the programmable logic device. The M instances of the core are associated with a Peripheral Component Interconnect Express (“PCIe”) interface. The M instances of the core are capable of being in either a shared interface mode or a non-shared interface mode. Each of the M instances of the core has N lanes. In the shared interface mode, the N lanes of each of the M instances of the core are capable of being combined to provide an M-by-N version of the PCIe interface. Each of the M instances of the core has dedicated primary and secondary sets of input and output pins for providing shared physical layer-side and transaction layer-side buses. The dedicated primary and secondary sets of input and output pins are located for abutting primary input to secondary output and primary output to secondary input responsive to tiling of the M instances of the core.
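By way of illustration only, the lane arithmetic of the foregoing aspects may be sketched as a small software model. The class and function names below are hypothetical and are not part of any described embodiment; the sketch merely assumes that lanes combine only when every participating instance is in the shared interface mode and every instance has the same N.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreInstance:
    """Hypothetical model of one tiled instance of the core."""
    name: str
    lanes: int          # N lanes provided by this instance
    shared_mode: bool   # True when configured for the shared interface mode


def combined_lanes(cores):
    """Return the lane count of the interface formed by the given instances.

    A single instance always provides its own N lanes. Several instances
    provide M-by-N lanes, but only if all are in the shared interface mode.
    """
    if not cores:
        raise ValueError("at least one core instance is required")
    if len(cores) == 1:
        return cores[0].lanes
    if not all(core.shared_mode for core in cores):
        raise ValueError("every instance must be in the shared interface mode")
    if len({core.lanes for core in cores}) != 1:
        raise ValueError("every instance must have the same number of lanes")
    return len(cores) * cores[0].lanes


# Two instances of a four-lane core in the shared interface mode.
primary = CoreInstance("primary", lanes=4, shared_mode=True)
secondary = CoreInstance("secondary", lanes=4, shared_mode=True)
print(combined_lanes([primary, secondary]))
```

With M equal to 2 and N equal to 4, the model reports eight lanes; an instance left in the non-shared interface mode is rejected rather than silently combined.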
Accompanying drawing(s) show exemplary embodiment(s) in accordance with one or more aspects of the invention; however, the accompanying drawing(s) should not be taken to limit the invention to the embodiment(s) shown, but are for explanation and understanding only.
In the following description, numerous specific details are set forth to provide a more thorough description of the specific embodiments of the invention. It should be apparent, however, to one skilled in the art, that the invention may be practiced without all the specific details given below. In other instances, well known features have not been described in detail so as not to obscure the invention. For ease of illustration, the same number labels are used in different diagrams to refer to the same items; however, in alternative embodiments the items may be different.
In some FPGAs, each programmable tile includes a programmable interconnect element (“INT”) 111 having standardized connections to and from a corresponding interconnect element 111 in each adjacent tile. Therefore, the programmable interconnect elements 111 taken together implement the programmable interconnect structure for the illustrated FPGA. Each programmable interconnect element 111 also includes the connections to and from any other programmable logic element(s) within the same tile, as shown by the examples included at the right side of
For example, a CLB 102 can include a configurable logic element (“CLE”) 112 that can be programmed to implement user logic plus a single programmable interconnect element 111. A BRAM 103 can include a BRAM logic element (“BRL”) 113 in addition to one or more programmable interconnect elements 111. Typically, the number of interconnect elements included in a tile depends on the height of the tile. In the pictured embodiment, a BRAM tile has the same height as four CLBs, but other numbers (e.g., five) can also be used. A DSP tile 106 can include a DSP logic element (“DSPL”) 114 in addition to an appropriate number of programmable interconnect elements 111. An IOB 104 can include, for example, two instances of an input/output logic element (“IOL”) 115 in addition to one instance of the programmable interconnect element 111. As will be clear to those of skill in the art, the actual I/O pads connected, for example, to the I/O logic element 115 are manufactured using metal layered above the various illustrated logic blocks, and typically are not confined to the area of the I/O logic element 115.
In the pictured embodiment, a columnar area near the center of the die (shown shaded in
Some FPGAs utilizing the architecture illustrated in
Note that
Within PCIe core 210, TLM 301 is coupled to DLM 303 for bidirectional communication, and DLM 303 is coupled to PLM 305 for bidirectional communication. Additionally, each of TLM 301, DLM 303, and PLM 305 is coupled to CMM 307 for bidirectional communication. Reset block 309 is coupled to TLM 301, DLM 303, PLM 305, CMM 307, and management block 302, though not illustratively shown in
PLM 305 is coupled to Root Complex 321 via PCIe interface 318. Additionally, PLM 305 may be coupled to system resources 323 for receiving a clock signal. Reset block 309 may be coupled to system resources 323 for receiving reset signaling. Management block 302 may be coupled to system resources 323 for dynamic configuration and status monitoring. Configuration interface 314 may couple host interface 325 to management block 302, and host interface 325 may thus be coupled to CMM 307 via configuration interface 314 and management block 302. User logic 327, which may be instantiated in FPGA fabric, is coupled to TLM 301 via transaction interface 312.
With continuing reference to
Host interface 325 may be an interface to a processor of a processor block 110 of
Root complex 321 includes I/O blocks 401-0 and 401-1; I/O block 401-0 is directly coupled to I/O block 401-2 of Endpoint 322-1. With reference to FPGA 100 of
Having this understanding of a PCIe network 400, and a PCIe hard core 210 of
With continuing reference to
Each PCIe core 210-1 and 210-2 includes four physical layer (“PL”) lanes 501. Physical layer lanes 501 may be coupled to respective MGTs 101, for example, of FPGA 100 of
On a transaction layer side, PCIe cores 210-1 and 210-2 are coupled to a “transaction layer-side bus” 511. Transaction layer-side bus 511 is coupled to memory modules 503 respectively of such cores. Memory modules 503 may be coupled to respective CMMs 307, TLMs 301, DLMs 303, and PLMs 305 of such cores via a known memory busing architecture; such coupling is not illustratively shown in
Each PCIe core 210-1 and 210-2 may include a respective pipeline module 502. One or more pipeline modules 502 need not be used in each application of PCIe cores 210-1 and 210-2, and thus may be optioned in as indicated by dashed lines.
In
Physical layer lanes 501 may operate at a lane clock frequency, which for PCIe may be approximately 250 MHz. PCIe interface 318 thus would operate at a lane clock frequency. Modules 550-1 and 550-2 may operate at a link clock rate or core clock rate, which may be the same or different. However, for PCIe a link clock rate and a core clock rate may both be approximately 250 MHz. Memory modules 503 may operate at either a link clock rate or core clock rate, which for PCIe is approximately 250 MHz, or a user clock rate, such as approximately 250, 125, or 62.5 MHz, for example. Notably, portions of TLM 301 and CMM 307 may include a user clock domain operating at a user clock rate. Transaction interface 312 and configuration interface 314 may likewise be operating at the user clock rate.
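The relationship among the clock rates mentioned above may be illustrated by the following sketch. It assumes, for illustration only, that a user clock rate is an integer division of an approximately 250 MHz core clock rate, and it uses the 64-bit transaction layer-side width of the example described herein; the function names are hypothetical.

```python
CORE_CLOCK_MHZ = 250.0                          # approximate link/core clock rate
EXAMPLE_USER_CLOCKS_MHZ = (250.0, 125.0, 62.5)  # example user clock rates


def clock_ratio(user_clock_mhz, core_clock_mhz=CORE_CLOCK_MHZ):
    """Return the integer ratio of core clock rate to user clock rate.

    Assumption: only integer ratios are accepted in this sketch.
    """
    if user_clock_mhz <= 0:
        raise ValueError("clock rate must be positive")
    ratio = core_clock_mhz / user_clock_mhz
    if ratio < 1 or ratio != int(ratio):
        raise ValueError("user clock is not an integer division of the core clock")
    return int(ratio)


def peak_throughput_gbps(width_bits, clock_mhz):
    """Peak data rate of a parallel interface in gigabits per second."""
    return width_bits * clock_mhz / 1000.0


for user_clock in EXAMPLE_USER_CLOCKS_MHZ:
    print(user_clock, clock_ratio(user_clock), peak_throughput_gbps(64, user_clock))
```

Under these assumptions, a 64-bit interface clocked at 250, 125, or 62.5 MHz has a peak rate of 16, 8, or 4 gigabits per second, respectively.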
In an implementation using a Xilinx FPGA, MGTs 101 of
Thus, modules 550-1 and 550-2 in this example are configured for 64-bit processing, as are memory modules 503. Furthermore, all interfaces on a transaction layer side are configured for a data bit width of 64 bits. Whether a one-lane, two-lane, four-lane, or eight-lane PCIe interface is to be implemented, configuration registers (not shown) of CMMs 307 of PCIe cores 210-1 and 210-2, for example, are set in advance to configure a PCIe interface. Notably, for a sixteen-lane PCIe interface, two groups of two PCIe cores 210-1 and 210-2 may be implemented, such as PCIe cores 201-1 through 201-4 of
Another mode in which PCIe cores 210-1 and 210-2 may be implemented is a non-shared interface mode.
Thus, it should be appreciated that while all 64 bits may be processed at a time through a PCIe core 210-1, for example, this usage may result from buffering bits over multiple clock cycles in a non-shared interface mode or may result from aggregating 64 bits at a time each clock cycle by using shared interface circuitry. When in a shared interface mode, as described above where PCIe core 210-1 is a primary core and PCIe core 210-2 is a secondary core, modules 550-2 and memory module 503 of PCIe core 210-2 and associated interfaces are not used.
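The difference between buffering and aggregating may be made concrete with a short calculation. The sketch below assumes that each physical layer lane delivers 8 bits per clock cycle, which is consistent with a 64-bit word being filled by eight lanes in one cycle; that per-lane width and the function name are assumptions made for illustration.

```python
import math

BITS_PER_LANE_PER_CYCLE = 8  # assumption: one byte per lane per clock cycle


def cycles_per_word(active_lanes, word_bits=64,
                    bits_per_lane=BITS_PER_LANE_PER_CYCLE):
    """Return the clock cycles needed to assemble one word from the active lanes."""
    if active_lanes < 1:
        raise ValueError("at least one lane must be active")
    bits_per_cycle = active_lanes * bits_per_lane
    return math.ceil(word_bits / bits_per_cycle)


# Non-shared interface mode: one four-lane core buffers over two cycles.
print(cycles_per_word(4))
# Shared interface mode: eight combined lanes fill 64 bits every cycle.
print(cycles_per_word(8))
```

In this model, a single four-lane core assembles 64 bits over two clock cycles, whereas two cores in the shared interface mode assemble 64 bits every clock cycle.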
PCIe core 210-1 includes sets of pins 811S and 812S. Set of pins 811S is associated with an output port for when PCIe core 210-1 is configured as a secondary core. Set of pins 812S is associated with an input port for when PCIe core 210-1 is configured as a secondary core. Notably, ports associated with sets of pins 811P, 811S, 812P, and 812S are only used when PCIe core 210-1 is operating in a shared interface mode. Moreover, sets of pins 811S and 812S exist but are not used unless PCIe core 210-1 is configured to be in a shared interface mode and is configured to operate as a secondary core, and likewise, sets of pins 811P and 812P exist but are not used unless PCIe core 210-1 is configured to be in a shared interface mode and is configured to operate as a primary core.
Likewise, PCIe core 210-2 includes sets of pins 821P, 821S, 822P, and 822S. Sets of pins 821S and 822S correspond to sets of pins 811S and 812S, and are used in a like manner, though for PCIe core 210-2. Furthermore, sets of pins 821P and 822P correspond to respective sets of pins 811P and 812P, and likewise are used as previously described, though for PCIe core 210-2. In other words, if PCIe core 210-2 is configured to be a secondary core in a shared interface mode, then sets of pins 821S and 822S are used, and if PCIe core 210-2 is configured to be used as a primary core in a shared interface mode, then sets of pins 821P and 822P are used.
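The pin usage rules just described may be summarized as a lookup keyed by core and configured role. The following is a minimal sketch; the reference numerals are taken from the description above, while the enumeration and function names are hypothetical.

```python
from enum import Enum


class Role(Enum):
    """Configured role of a core (hypothetical names)."""
    NON_SHARED = "non_shared"
    PRIMARY = "primary"
    SECONDARY = "secondary"


# Sets of pins per core, grouped by the role in which they are used.
PIN_SETS = {
    "210-1": {Role.PRIMARY: ("811P", "812P"), Role.SECONDARY: ("811S", "812S")},
    "210-2": {Role.PRIMARY: ("821P", "822P"), Role.SECONDARY: ("821S", "822S")},
}


def active_pin_sets(core, role):
    """Return the sets of pins in use for a core in the given role.

    All four sets exist in every core, but none is used outside the
    shared interface mode, and only one pair is used within it.
    """
    if role is Role.NON_SHARED:
        return ()
    return PIN_SETS[core][role]


print(active_pin_sets("210-1", Role.PRIMARY))
print(active_pin_sets("210-2", Role.SECONDARY))
print(active_pin_sets("210-2", Role.NON_SHARED))
```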
Continuing the above example of a shared interface using physical layer-side bus 512 and transaction layer-side bus 511 of
Transaction layer-side bus 511, continuing the above example, is a 64-bit wide bus. Accordingly, set of pins 811S is associated with signal lines local to PCIe core 210-1 and may be shared with another PCIe core (not shown). Set of pins 811P is associated with coupling signal lines associated with set of pins 821S of PCIe core 210-2 to memory module 503 of PCIe core 210-1.
Thus, it should be appreciated that buses 511 and 512 may be formed by serially coupling signal line traces of PCIe cores, primary (“P”) to secondary (“S”), and vice versa. Additionally, it should be noted that sets of pins 811S, 811P, 812S, and 812P, as well as 821S, 821P, 822S, and 822P, may be located on the periphery of their respective PCIe cores 210-1 and 210-2. By having pins on the periphery, it should be appreciated that sets of corresponding pins may abut one another responsive to tiling of PCIe cores.
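The serial, primary-to-secondary coupling that results from tiling may be sketched as follows. The port labels (`primary_in`, `secondary_out`, and so on) are hypothetical names chosen for illustration; the sketch assumes only that each core's primary input abuts the neighboring core's secondary output, and each primary output abuts the neighboring secondary input.

```python
def abutting_pairs(core_names):
    """List the port pairs that abut between adjacent cores in a tiled column.

    core_names is ordered along the column; each core's primary-side ports
    abut the secondary-side ports of the next core in the list.
    """
    pairs = []
    for this_core, next_core in zip(core_names, core_names[1:]):
        pairs.append((f"{this_core}.primary_in", f"{next_core}.secondary_out"))
        pairs.append((f"{this_core}.primary_out", f"{next_core}.secondary_in"))
    return pairs


for pair in abutting_pairs(["210-1", "210-2"]):
    print(pair)
```

Because every core carries both primary and secondary ports on its periphery, the same function applies unchanged to a longer list of cores.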
Thus, it should be appreciated that M PCIe cores (M being an integer equal to or greater than 2), which in this embodiment are hard cores, may have the same circuit layout such that they are essentially physically the same. Such PCIe cores may therefore be tiled to provide a PCIe interface of M by N lanes, where N is the number of physical layer lanes of each PCIe core. Notably, N equal to 4 and M equal to 2, as illustratively shown in
The implementation of M cores that may be operated either in a shared interface mode, or independently, provides design flexibility not available with a single core having a fixed M by N lane interface. This flexibility is enhanced by the flexibility associated with an FPGA, where a user may have different sets of user logic interfacing to such PCIe cores. Furthermore, because the number of lanes for each PCIe core may be a subset of the total number of lanes which may be provided via coupling multiple PCIe cores together, a finer granularity may be achieved in comparison to a single PCIe interface block having a fixed M by N lane protocol. For example, if only one lane were to be used, then a substantial amount of circuitry of such a single PCIe interface block having a fixed M by N block implementation would be wasted, in comparison to having PCIe cores that may be flexibly concatenated as described herein. Furthermore, the effective pin count may be reduced by using the finer granularity of individual PCIe cores whether or not they are used in a shared interface mode.
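The granularity argument may be illustrated numerically. The sketch below compares the lanes left unused by a single fixed-width block with those left unused by tiled cores of N lanes each; the function names are hypothetical, and the model counts lanes only, not circuit area or pin count.

```python
import math


def unused_lanes_fixed(required_lanes, block_lanes):
    """Lanes left unused in a single block with a fixed lane count."""
    if required_lanes > block_lanes:
        raise ValueError("fixed block cannot provide the required lanes")
    return block_lanes - required_lanes


def unused_lanes_tiled(required_lanes, lanes_per_core):
    """Lanes left unused when only as many N-lane cores as needed are used."""
    cores_used = math.ceil(required_lanes / lanes_per_core)
    return cores_used * lanes_per_core - required_lanes


# One lane required: fixed eight-lane block versus tiled four-lane cores.
print(unused_lanes_fixed(1, 8))
print(unused_lanes_tiled(1, 4))
```

For a one-lane application, the fixed eight-lane block leaves seven lanes unused, while a single four-lane core leaves three unused and leaves the remaining cores free for other user logic.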
Notably, PCIe cores need not be physically identical but may have similar designs. Thus, they may have a shared interface on only one side, where such cores have complementary ports as previously described for example with reference to
While the foregoing describes exemplary embodiment(s) in accordance with one or more aspects of the invention, other and further embodiment(s) in accordance with the one or more aspects of the invention may be devised without departing from the scope thereof, which is determined by the claim(s) that follow and equivalents thereof. Claim(s) listing steps do not imply any order of the steps. Trademarks are the property of their respective owners.