When using most field programmable gate arrays (FPGAs), a designer is presented with a logic capacity for that FPGA: the fixed number of logic block resources that the FPGA contains. In reality, most of the area of an FPGA is devoted not to the logic blocks themselves but to the multiplexers (muxes) of the programmable routing resources that connect logic blocks to each other. FPGA architects must decide up front how much routing flexibility (muxing) to add. If they add too little routing flexibility, then FPGA users may not be able to use all the logic blocks that they have been promised. If they add too much routing flexibility, then the FPGA area grows needlessly.
Various embodiments of a computer aided design (CAD) system, a CAD tool, and a method for computer aided design of a circuit to be implemented in a field programmable gate array (FPGA) are presented herein. Embodiments relate to synthesis and analysis of a design for a circuit, in the context of implementing the design on an FPGA.
One embodiment is a method that is performed by a CAD system. The method includes receiving a high level coding of a design for a circuit to be implemented in a field programmable gate array (FPGA). The method includes performing synthesis on the design, to produce a synthesized design. The method includes generating a routability estimation and a logic usage estimation for the synthesized design. The method includes determining whether the synthesized design is implementable on a specific FPGA, based on the routability estimation, the logic usage estimation, and available resources of the specific FPGA.
One embodiment is a tangible, non-transitory, computer-readable medium having instructions thereupon which, when executed by a processor, cause the processor to perform a method. The method embodied in the computer-readable medium includes receiving a high level coding of a design for a circuit to be implemented in a field programmable gate array (FPGA). The method includes performing synthesis on the design, to produce a synthesized design. The method includes generating a routability estimation and a logic usage estimation for the synthesized design. The method includes determining and indicating to a user whether the synthesized design is implementable on a specific FPGA, based on the routability estimation, the logic usage estimation, and available resources of the specific FPGA.
One embodiment is a computer aided design (CAD) system. The CAD system includes a memory, to receive a high level coding of a design for a circuit to be implemented in a field programmable gate array (FPGA). The CAD system includes a processor. The processor is to perform synthesis on the design, to produce a synthesized design. The processor is to generate a routability estimation and a logic usage estimation for the synthesized design. The processor is to determine whether the synthesized design is implementable on a specific FPGA, based on the routability estimation, the logic usage estimation, and available resources of the specific FPGA.
Embodiments described herein will be understood more fully from the detailed description given below and from the accompanying drawings of various embodiments of the invention, which, however, should not be taken to limit the invention to the specific embodiments, but are for explanation and understanding only.
In the following description, numerous details are set forth to provide a more thorough explanation of the present invention. It will be apparent, however, to one skilled in the art, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring the present embodiments.
CAD tools and CAD systems described herein present technological solutions to various technological problems. One problem addressed is how to decrease wasted computing time and resources that are spent attempting to implement a design for a circuit in an FPGA, only to find during or at the end of placement and routing (i.e., after both logic synthesis and physical synthesis) that the design does not fit in the available resources of the specific FPGA that was intended for the design implementation. Technological solutions described herein reduce such wasted computing time and resources, by providing earlier, more timely information about design implementation and relative fit to available resources of an FPGA, and thereby increase computing efficiency of CAD tools and CAD systems. Another problem addressed is how to more efficiently use available resources of an FPGA. Technological solutions described herein provide information to the system and the user about use of available resources of an FPGA relative to a design implementation, and thereby support more efficient use of available resources of the FPGA.
One embodiment is a method of offering a logic capacity to an FPGA user that can dynamically change based on design properties determined after synthesis. The method includes presenting an FPGA logic capacity range to a user, and generating an early routing estimation related to a synthesized user design based on the properties of their netlist, where the routing estimation gives a user access to a different amount of logic. This embodiment and further aspects and embodiments are described below, with various features and tool capabilities that can be in various combinations in various further embodiments including computer aided design (CAD) systems, methods practiced by CAD systems, and tangible, computer-readable media that has instructions for a processor to practice a method.
Embodiments described herein include a technique that allows FPGA architects to devote less area to FPGA routing, while still allowing FPGA users to use all of the logic capacity that they have been promised. In one embodiment, users are initially presented a range of logic capacities of an FPGA. In one embodiment, this represents the minimum and maximum number of logic blocks that they can expect to fit on to an FPGA. The actual logic capacity that is offered to a user is determined based on the synthesized design.
In one embodiment, an equation based on pin density, a Rent parameter, and logic density (e.g., an equation quantifying Rent's rule) is used to calculate how many logic blocks can fit on to the FPGA. For easy-to-route designs, the user will be presented with the maximum number of logic blocks from the range. For hard-to-route designs, the user will be presented with the minimum number of logic blocks from the range.
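As a minimal sketch of how a capacity could be selected from the presented range, the following Python interpolates between the minimum and maximum of the range according to a routing difficulty score. The score (the product of pin density and Rent parameter), the thresholds `easy_score` and `hard_score`, and the linear interpolation are all illustrative assumptions rather than the equation of any particular embodiment, and logic density is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapacityRange:
    """Logic capacity range presented to the user (hypothetical container)."""
    min_blocks: int  # offered to hard-to-route designs
    max_blocks: int  # offered to easy-to-route designs


def dynamic_capacity(cap: CapacityRange, pin_density: float,
                     rent_parameter: float, easy_score: float,
                     hard_score: float) -> int:
    """Pick a logic capacity from the range for a synthesized design.

    Assumption: routing difficulty is scored as pin_density * rent_parameter,
    and the capacity falls linearly from max_blocks to min_blocks as the
    score moves from easy_score to hard_score.
    """
    if hard_score <= easy_score:
        raise ValueError("hard_score must be greater than easy_score")
    score = pin_density * rent_parameter
    if score <= easy_score:
        return cap.max_blocks  # easy to route: full range maximum
    if score >= hard_score:
        return cap.min_blocks  # hard to route: range minimum
    frac = (score - easy_score) / (hard_score - easy_score)
    return round(cap.max_blocks - frac * (cap.max_blocks - cap.min_blocks))


if __name__ == "__main__":
    cap = CapacityRange(min_blocks=80_000, max_blocks=100_000)
    print(dynamic_capacity(cap, 3.0, 0.5, easy_score=2.0, hard_score=4.0))
    print(dynamic_capacity(cap, 6.0, 0.75, easy_score=2.0, hard_score=4.0))
```

In this sketch the thresholds would be calibrated per FPGA architecture; a different monotonic mapping from difficulty to capacity could be substituted without changing the overall idea.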
If a user's design exceeds the dynamically determined logic capacity, they can either reduce the amount of logic in their design to reduce the logic used or reduce the connectivity between logic blocks, which has the effect of reducing the value of the Rent parameter, and will increase the logic capacity of the device. In terms of an equation based on Rent's rule that is used in various embodiments to determine routability, pin density itself may not affect Rent's parameter. There could be two circuits with identical numbers of logic blocks and Rent parameter, but one has higher pin density than the other. For example, a design of logic blocks that only connect to nearest neighbors will have a Rent parameter around 0.5, regardless of the number of connections to the nearest neighbor. This and further considerations can be accounted for in various embodiments through the use of additional parameters, for example a Rent's parameter scaling value based on pin density and nearest neighbor connections versus external block connections of a block, or a readily developed equation that recognizes the difference between nearest neighbor connections and connections to external blocks, and applies Rent's rule and a Rent parameter accordingly.
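To illustrate the nearest-neighbor example, Rent's rule in its commonly stated form is T = t * g^p, where T is the number of external connections of a region, g is the number of blocks in the region, t is the average number of connections per block, and p is the Rent parameter. The sketch below estimates t and p by a least-squares fit on a log-log scale. The function name `fit_rent` and the sample data are hypothetical; the samples model square regions of a two-dimensional mesh in which each block connects only to its four nearest neighbors, so a k-by-k region has k*k blocks and 4*k connections crossing its boundary.

```python
import math
from typing import Iterable, Tuple


def fit_rent(samples: Iterable[Tuple[int, int]]) -> Tuple[float, float]:
    """Fit Rent's rule T = t * g**p to (g, T) samples.

    Each sample is (number of blocks in a region, number of external
    connections of that region). Returns (t, p) from a least-squares
    line fit of log(T) against log(g).
    """
    pts = [(math.log(g), math.log(T)) for g, T in samples]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    if sxx == 0.0:
        raise ValueError("samples must include at least two region sizes")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    p = sxy / sxx                         # slope is the Rent parameter
    t = math.exp(mean_y - p * mean_x)     # intercept gives t
    return t, p


if __name__ == "__main__":
    # Nearest-neighbor mesh: a k-by-k region has k*k blocks, 4*k boundary wires.
    mesh = [(k * k, 4 * k) for k in (2, 4, 8, 16)]
    t, p = fit_rent(mesh)
    print(f"t = {t:.3f}, p = {p:.3f}")
```

For this idealized mesh the fitted Rent parameter is 0.5, in line with the nearest-neighbor case described above. Doubling the number of wires per neighbor changes t but not p, which is why a separate pin density term can be useful.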
In some embodiments, users can also change compilation synthesis settings to target lower area and better routability. Alternatively, physical synthesis techniques can be used to automatically target lower area and better routability when a design exceeds the dynamic logic capacity.
The benefits of these various approaches (in various combinations in various embodiments) are that FPGA users can now use more of the FPGA resources, and FPGAs can be architected to have a higher number of logic blocks. These extra logic blocks can be used as logic blocks for netlists that are easy to route. Additionally, for Efinix FPGAs based on the exchangeable logic and routing (XLR) tile, e.g., Quantum™ architecture FPGAs, the extra logic blocks can be re-purposed as routing for harder to route netlists.
Referring to
Processing logic determines whether the logic capacity is greater than the logic usage associated with the synthesized design, in a determination action 110. If so, the process proceeds to place & route 114. If the logic usage is greater than the dynamically generated logic capacity, the user will not be allowed to continue to place & route. In that case, the tool has predicted that there is not enough logic and routing to satisfy the user design, and the user is prevented from going through the whole CAD flow. If such is the case, the user needs to reduce the logic used and/or the connectivity, in an action 112. In other words, the user will then need to rework their design to reduce wiring or logic. Alternatively, in one embodiment, the processing logic provides a synthesis option to target area/congestion. In another alternative, part of a physical synthesis flow automatically targets area/congestion if the design is too large. For example, there could be area thresholds, congestion thresholds, etc., that are system set or user settable. The system could alert a user, for example through a user interface, if congestion exceeds a threshold.
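The determination action 110 and the congestion alert can be sketched as follows. The names `NextStep`, `determination_110`, and `congestion_alert` are hypothetical, and the treatment of logic usage exactly equal to logic capacity as fitting is an assumption, since the description addresses only the strictly greater cases.

```python
from enum import Enum
from typing import Optional


class NextStep(Enum):
    PLACE_AND_ROUTE = "place & route 114"
    REDUCE_LOGIC_OR_CONNECTIVITY = "reduce logic and/or connectivity 112"


def determination_110(logic_capacity: int, logic_usage: int) -> NextStep:
    """Gate the flow on dynamically generated capacity versus logic usage.

    Assumption: usage equal to capacity is treated as fitting.
    """
    if logic_usage <= logic_capacity:
        return NextStep.PLACE_AND_ROUTE
    # Predicted not to fit: block place & route and send the design to rework.
    return NextStep.REDUCE_LOGIC_OR_CONNECTIVITY


def congestion_alert(congestion: float, threshold: float) -> Optional[str]:
    """Return a user-facing message if congestion exceeds the threshold."""
    if congestion > threshold:
        return (f"Estimated congestion {congestion:.2f} exceeds "
                f"threshold {threshold:.2f}")
    return None


if __name__ == "__main__":
    print(determination_110(logic_capacity=90_000, logic_usage=85_000).value)
    print(determination_110(logic_capacity=90_000, logic_usage=95_000).value)
    print(congestion_alert(0.92, threshold=0.85))
```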
One form of useful information is a routability estimation 106, which the CAD tool 206 generates for the design for the circuit, and which informs the user and/or the system about the amount of routing that the design for the circuit is estimated to consume when implemented on an FPGA. Another form of useful information is a logic usage estimation 218, which the CAD tool 206 generates for the design for the circuit, and which informs the user and/or the system about the amount of logic that the design for the circuit is estimated to consume when implemented on an FPGA. The format and content of the routability estimation 106 and logic usage estimation 218 may be implementation specific, and could include counts, ratios, ranges and/or sizes of estimated routing generally or more specifically such as signal counts, bus sizes and counts, clock counts and routing, etc., and counts, ratios, ranges and/or sizes of estimated logic generally or more specifically such as logic gate counts overall or for various types of logic, etc.
One form of useful information generated in some embodiments is that the CAD tool 206 generates an implementability determination 220, which informs the user and/or the system about whether or not the design for the circuit, as presented in the high level coding 202, is implementable on a specific FPGA. For example, the CAD system 204 could access information about one or more specific FPGAs, or the user could enter a selection for a specific FPGA that is intended for programming in the circuit that the user has entered in the high level coding 202. The CAD system 204 compares the routability estimation 106 and the logic usage estimation 218 for the design for the circuit to available resources of an FPGA (see
In some embodiments, the system awaits further input, and in other embodiments, the system proceeds automatically. For a positive implementability determination 220, in one embodiment the CAD system 204 generates FPGA-specific coding 212 of the design, e.g., design physical synthesis for a specific FPGA, for use in programming the FPGA. Further, the system may proceed and produce a programmed FPGA 214, for example by sending a bitstream to an FPGA that is mounted in a programming socket. Operation of a suitable embodiment of a CAD tool 206 is further described below with reference to
In the embodiment shown in
On the left side in
On the right side in
In one embodiment, estimation 414 is performed using Rent's rule 416, which relates the number of external signal connections to a logic block and the number of logic gates in a logic block, and which is well-known in the art. Various embodiments of the CAD tool 206 and related systems and methods could use Rent's rule itself, variations of Rent's rule, e.g. with additional parameters, or an equation based on Rent's rule. One embodiment uses Rent's parameter, logic density and pin density as inputs to an equation that is a variation of Rent's rule with additional parameters, tuned for types of circuits in FPGAs. For example, a systolic array that needs connection only among nearest neighbors has a connectivity that is largely serviced by the placement of the array members, and has relatively lower routability requirement. A crossbar design where every input has to be switched to every output is routing heavy and has a relatively higher routability requirement. Example Rent parameters may range from 0.5 (or a little bit lower) for a systolic array, to 0.7 or 0.75 for a crossbar design, inside of this range for many types of logic circuits, and outside of this range for unusual circuits. Rent's rule 416 is readily applied with a Rent parameter, and may be applied iteratively, recursively and/or hierarchically in various embodiments. For example, a hierarchical design can be developed by the system from the high level coding 202 of the design for the circuit, and the signal connections and logic gates determined at the various levels in the hierarchical design as the routability estimation 106 and logic usage estimation 218 are generated. In some embodiments, the system uses various forms of partitioning during analysis, such as bipartitioning as described below with reference to
Various embodiments of the CAD tool 206 perform estimation 414 (see
For an example equation that is based on Rent's rule and suitable in an embodiment of CAD tool 206, the system could define utilization as follows:
where normalizer is a parameter that would be equal to pin_density*rent_parameter when those values are low enough that 100% of all logic elements can be used for logic and not routing. Variations and algorithmic usage of this example equation, including versions with further parameters, versions operating at multiple levels of an FPGA design, and versions for multiple FPGA architectures, are readily developed in keeping with the teachings herein.
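The example equation is not reproduced in the text above, so the following Python is only a hypothetical form chosen because it satisfies the stated property of the normalizer: utilization reaches 100% when pin_density*rent_parameter is at or below normalizer, and falls as that product grows. The capped ratio, the function names, and the sample values are assumptions and should not be read as the equation of any embodiment.

```python
import math


def utilization(pin_density: float, rent_parameter: float,
                normalizer: float) -> float:
    """Hypothetical utilization in the range (0, 1].

    Assumed form: min(1, normalizer / (pin_density * rent_parameter)).
    At pin_density * rent_parameter == normalizer the result is exactly 1.0,
    meaning all logic elements can be used for logic and none for routing.
    """
    demand = pin_density * rent_parameter
    if demand <= 0 or normalizer <= 0:
        raise ValueError("pin_density, rent_parameter, normalizer must be > 0")
    return min(1.0, normalizer / demand)


def usable_logic_blocks(total_blocks: int, pin_density: float,
                        rent_parameter: float, normalizer: float) -> int:
    """Number of logic blocks estimated to be usable as logic."""
    return math.floor(
        total_blocks * utilization(pin_density, rent_parameter, normalizer))


if __name__ == "__main__":
    # Easy-to-route design: product equals the normalizer, full utilization.
    print(usable_logic_blocks(100_000, 4.0, 0.5, normalizer=2.0))
    # Harder-to-route design: product is twice the normalizer.
    print(usable_logic_blocks(100_000, 8.0, 0.5, normalizer=2.0))
```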
In various embodiments, the system performs trade-offs 418 and iteration 420. Trade-offs 418 and iteration 420 can be automated 424 and/or user-directed 422. In one embodiment, the system can target congestion 426. In various embodiments, the system can annotate 428 the design, in an appropriate format. Take for example logic duplication. Logic duplication is done to improve delay, so that chokepoints in the graph of a circuit are minimized by duplicating nodes where the chokepoint is found. Signals going through a chokepoint are split into two, which improves performance in terms of speed but also increases area. This is a trade-off and an option that synthesis can explore to optimize. For example, there could be trade-offs 418 in speed versus area on a global basis, a local basis, a hierarchical basis in the design, etc., and these can be automated or guided through user interaction, in various embodiments. A trade-off 418 in speed versus area on a global basis could be implemented through user or system selection to emphasize speed for the entire FPGA, or emphasize minimizing area for a circuit implementation on the FPGA. A trade-off 418 in speed versus area on a local basis could be implemented through user or system selection of specific portion(s) of the design, in which to emphasize speed, or minimize area. A trade-off 418 in speed versus area on a hierarchical basis could be implemented through user or system selection in a hierarchical design of a specific level of the hierarchy, or portion of a design at a specific level in the hierarchy, in which to emphasize speed, or minimize area. The system could annotate 428 a design database 210 (see
As an example of how this could work at various levels in a design, the CAD tool 206 could analyze various costs inside of optimization algorithms. A cost for minimizing wire(s), a cost for minimizing critical path delay, and a relative weighting of wire cost versus critical path cost, could each be applied during analysis 412 and trade-offs 418 at each level in a hierarchical design (or for that matter, a flat design).
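A relative weighting of wire cost versus critical path cost can be sketched as a single weighted sum that is evaluated for each candidate implementation. The `Candidate` type, the normalized cost values, and the weight `wire_weight` are hypothetical; the two candidates model the logic duplication trade-off, where duplication shortens the critical path but adds wiring.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    """One implementation option with costs normalized to a baseline of 1.0."""
    name: str
    wire_cost: float
    critical_path_cost: float


def combined_cost(c: Candidate, wire_weight: float) -> float:
    """Weighted sum of wire cost and critical path cost.

    wire_weight in [0, 1] is the relative weighting: 1.0 optimizes only
    for wiring, 0.0 only for critical path delay.
    """
    if not 0.0 <= wire_weight <= 1.0:
        raise ValueError("wire_weight must be between 0 and 1")
    return (wire_weight * c.wire_cost
            + (1.0 - wire_weight) * c.critical_path_cost)


def pick(candidates: Sequence[Candidate], wire_weight: float) -> Candidate:
    """Choose the candidate with the lowest combined cost."""
    return min(candidates, key=lambda c: combined_cost(c, wire_weight))


if __name__ == "__main__":
    options = [
        Candidate("no duplication", wire_cost=1.0, critical_path_cost=1.0),
        Candidate("duplicate chokepoint", wire_cost=1.3, critical_path_cost=0.7),
    ]
    print(pick(options, wire_weight=0.8).name)  # favors routability
    print(pick(options, wire_weight=0.2).name)  # favors speed
```

The same function could be called with a different weight at each level of a hierarchical design, or with one weight for a flat design.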
As a further example of how this could work in a hierarchical design, consider a design at top level, then a lower level has a DDR (double data rate) interface, and another level has an ALU (arithmetic logic unit). The system can analyze how many logic elements are being used for the DDR interface, how many for the ALU, how routing is done for each of these, and so on. The system could provide a synthesis option for a specific module, or respond to automated or user direction to work harder on increasing routing efficiency, speed or decreasing area for the module. In some versions, the system could annotate 428 the design, for example in a database 210 (see
In an action 602, the system receives a high level design. The high level design is a design for a circuit to be implemented in an FPGA, and is represented in an appropriate high level coding.
In an action 604, the system performs synthesis. Synthesis (e.g., logic synthesis) results in a synthesized design, in an appropriate format for the CAD system. At this point in the method, the synthesized design is independent of any specific FPGA, for example the synthesized design has not yet been technology mapped.
In an action 606, the system performs routability and logic usage estimation. Various techniques, parameters, and specifics for FPGAs, are discussed above as applicable for various embodiments performing estimation.
In a determination action 608, the system determines whether the synthesized design is implementable on a specific FPGA. For example, the routability and logic usage estimation can be compared to available resources of the FPGA on an absolute, relative or ratio basis, etc., for such a determination. If the answer is yes, the synthesized design is implementable on a specific FPGA, flow proceeds to the determination action 610. If the answer is no, the synthesized design is not implementable on a specific FPGA, flow proceeds to the determination action 614.
In the determination action 610, it is determined whether to proceed. If the answer is no, do not proceed, flow branches to the determination action 614. If the answer is yes, do proceed, flow proceeds to the action 612, to generate FPGA coding. FPGA coding could be in the FPGA-specific coding 212 of the design as shown in
In the determination action 614, it is determined whether there are any changes. If no, there are no changes, the system goes to wait 616 (e.g., wait for additional instructions), or optionally exits from the flow. If yes, there are changes, flow proceeds back to the action 606 to again perform the routability and logic usage estimation. For example, changes could be automated, or through user interaction, or a combination of both. Changes could be guided by some of the factors discussed in
Thus, one goal of some embodiments described herein, and a specific usage of an embodiment, is to determine whether or not a design for a circuit fits available resources of a specific FPGA. One goal of some embodiments described herein is to successfully generate FPGA coding, for a design for a circuit that it is determined fits the available resources of a specific FPGA. One goal of some embodiments described herein is to provide useful information as feedback to the system and the user, during the design flow for a design to be implemented on an FPGA.
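The flow of actions 602 through 616 described above can be sketched as a loop in which estimation is repeated after each change until the design fits or no further changes are available. Every callable passed to `run_flow` is a hypothetical stand-in for a CAD tool capability, and the stub design in the usage example (a block count reduced by 20 per change against a capacity of 100) is invented purely for illustration.

```python
from typing import Any, Callable, Optional


def run_flow(high_level_design: Any,
             synthesize: Callable[[Any], Any],
             estimate: Callable[[Any], Any],
             is_implementable: Callable[[Any], bool],
             should_proceed: Callable[[], bool],
             next_change: Callable[[Any, Any], Optional[Any]],
             generate_fpga_coding: Callable[[Any], Any]) -> Optional[Any]:
    """Return FPGA coding, or None if the flow ends in wait 616 / exit."""
    design = synthesize(high_level_design)          # actions 602, 604
    while True:
        estimation = estimate(design)               # action 606
        if is_implementable(estimation) and should_proceed():  # 608, 610
            return generate_fpga_coding(design)     # action 612
        changed = next_change(design, estimation)   # determination 614
        if changed is None:
            return None                             # wait 616 or exit
        design = changed                            # back to action 606


if __name__ == "__main__":
    # Stub design: a dict holding only a block count (illustrative only).
    result = run_flow(
        {"blocks": 130},
        synthesize=lambda d: dict(d),
        estimate=lambda d: d["blocks"],
        is_implementable=lambda usage: usage <= 100,
        should_proceed=lambda: True,
        next_change=lambda d, usage: {"blocks": d["blocks"] - 20},
        generate_fpga_coding=lambda d: f"coding for {d['blocks']} blocks",
    )
    print(result)
```

Passing the steps in as callables keeps the sketch independent of any particular synthesis or estimation technique; automated and user-directed changes differ only in how `next_change` is supplied.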
In various embodiments of a method, a CAD tool, and a CAD system, various indications of targeting congestion, trade-offs, system determinations, user interaction with a CAD tool, the design and implementation of the design in an FPGA are presented to a user through a user interface. Specific aspects of such a user interface are readily devised and may be specific to an FPGA product line or product family, a manufacturer of FPGAs, a design flow, or a design of a CAD tool or CAD system.
Some portions of the detailed descriptions above are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
The present invention also relates to apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, and each coupled to a computer system bus.
The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will appear from the description below. In addition, the present invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the invention as described herein.
A machine-readable medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer). For example, a machine-readable medium includes read only memory (“ROM”); random access memory (“RAM”); magnetic disk storage media; optical storage media; flash memory devices; electrical, optical, acoustical or other form of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.); etc.
Whereas many alterations and modifications of the present invention will no doubt become apparent to a person of ordinary skill in the art after having read the foregoing description, it is to be understood that any particular embodiment shown and described by way of illustration is in no way intended to be considered limiting. Therefore, references to details of various embodiments are not intended to limit the scope of the claims which in themselves recite only those features regarded as essential to the invention.
The present application claims the benefit under 35 USC 119(e) of U.S. Provisional Patent Application No. 63/144,877, filed Feb. 2, 2021, entitled “DYNAMIC FPGA LOGIC CAPACITY BASED ON ACCURATE EARLY ROUTABILITY ESTIMATION”, which is incorporated by reference in its entirety.
Number | Date | Country
---|---|---
63144877 | Feb 2021 | US