In modern processor implementations, variations in process, voltage, and temperature (PVT) can cause malfunctions. PVT variations may become more pronounced as processors scale up. On-die temperature variation is one of many PVT-related issues that can cause problems in processors, systems on a chip, application specific integrated circuits, etc. On-die temperature variation refers to a temperature gradient across different parts of a chip. In some instances, temperature variation leads to “hotspots” within a chip, where certain areas exhibit significantly higher temperatures than nearby areas. In some instances, hotspots may result from clustering high-activity networks and connected devices in certain areas. Under many workloads, the clustered high-activity networks may cause hotspots on the chip. These hotspots can negatively affect chip performance, such as by causing electromigration and IR-drop (EM/IR) issues, negative and positive bias temperature instability (NBTI/PBTI) issues, and additional leakage power dissipation (e.g., leakage power may almost double with every 10 °C rise in temperature).
In modern sign-off processes, designers typically focus on timing, power sign-off, and some reliability issues. However, thermal issues may be left largely unaddressed.
The present embodiments may be better understood, and numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings.
The description that follows includes exemplary systems, methods, techniques, instruction sequences and computer program products that embody techniques of the present inventive subject matter. However, some embodiments may be practiced without these specific details. In other instances, well-known instruction instances, protocols, structures and techniques have not been shown in detail, for clarity of this description.
Some embodiments of the inventive subject matter enable chip designers to locate thermal hotspots on a chip using arithmetic techniques. Therefore, embodiments avoid using complex computations involved in thermal equations, such as Maxwell's relations. Because embodiments avoid complex thermal equations, they allow chip designers to more quickly reconfigure chip designs to avoid hotspots.
Some embodiments determine a temperature variable (ΔT) for a grid (i.e., a chip area) based on length of network connections between devices in the grid, network switching factors between devices in the grid, and the geometric area of the grid. Some embodiments use ΔT as a parameter into a derivative function from the standard Gaussian/Normal distribution function, where ΔT can be used to control height and width of the derivative function. Using the derivative function, embodiments can identify grids that exhibit unacceptably high temperatures. After identifying high temperature grids, embodiments can reconfigure networks in those grids to avoid high switching factors and other conditions that cause unacceptably high temperatures.
This section will provide additional details about how some embodiments identify hotspots on a microchip.
The equations are as follows:
where
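$$\mathrm{Area\_wt}(A,B) = \left[\,0.5\,\bigl(\mathrm{Area}(A) + \mathrm{Area}(B)\bigr) \cdot \mathrm{Area}(\mathrm{Grid})^{-1}\right]^{\beta_a} \cdot \left[\,k_d \cdot \mathrm{Distance}(A,B)\right]^{\beta_d}$$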
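The Area_wt expression above can be evaluated with plain arithmetic. The following is a minimal Python sketch, not a definitive implementation. The function name `area_wt`, the argument names, and the default values of 1.0 for kd, βa, and βd are assumptions made for illustration; this description does not fix them.

```python
def area_wt(area_a, area_b, grid_area, distance, k_d=1.0, beta_a=1.0, beta_d=1.0):
    """Evaluate Area_wt(A, B) for two connected devices A and B in one grid.

    k_d, beta_a and beta_d correspond to kd, βa and βd in the expression above.
    Their default values here are illustrative assumptions only.
    """
    if grid_area <= 0:
        raise ValueError("grid_area must be positive")
    # Mean area of the two devices, normalized by the grid area.
    area_term = (0.5 * (area_a + area_b)) / grid_area
    # Scaled length of the connection between the two devices.
    distance_term = k_d * distance
    return (area_term ** beta_a) * (distance_term ** beta_d)
```

For example, two devices with areas 2 and 4 in a grid of area 100, connected over a distance of 10, give a weight of about 0.3 when all three tuning parameters are left at 1.0.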
In some embodiments, the following apply to Equation 1:
Equation 1 can be extended to multiple fan-out networks and can also be approximated for networks that fall within multiple grids. Furthermore, although
After determining ΔT Grid, some embodiments create a bell-shaped plot (representing a network-level temperature sensitivity function) to determine grid-level temperature on a chip. That is, some embodiments utilize a Gaussian/Normal distribution plot, where the scalar value from Equation 1 can be used as a parameter to control the height and width (2-D circular spread) of the Gaussian/Normal function. For example, some embodiments use ΔT Grid as input for σ in the following Gaussian equation.
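The use of ΔT Grid as σ can be illustrated with a short Python sketch. It assumes the standard normalized, circularly symmetric 2-D Gaussian, which may differ in normalization from the equation of the embodiments. The names `gaussian_2d` and `sensitivity_plot` and the sampling parameters are hypothetical.

```python
import math


def gaussian_2d(x, y, mu_x, mu_y, sigma):
    """Normalized circular 2-D Gaussian centered at (mu_x, mu_y).

    Assumption: standard normalization 1 / (2 * pi * sigma**2).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    r_squared = (x - mu_x) ** 2 + (y - mu_y) ** 2
    return math.exp(-r_squared / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)


def sensitivity_plot(delta_t_grid, center, half_width, samples=5):
    """Sample the bell-shaped plot on a square around `center`.

    delta_t_grid is used directly as sigma. Returns a list of rows.
    """
    if samples < 2:
        raise ValueError("samples must be at least 2")
    cx, cy = center
    step = 2.0 * half_width / (samples - 1)
    offsets = [-half_width + i * step for i in range(samples)]
    return [
        [gaussian_2d(cx + dx, cy + dy, cx, cy, delta_t_grid) for dx in offsets]
        for dy in offsets
    ]
```

Under this normalization, a larger ΔT Grid lowers the peak and widens the spread, so a single scalar sets both the height and the width of the plot.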
Some embodiments can make the model more accurate. For example, instead of assigning 50% of each device's area, some embodiments can associate a factor of (1/total number of networks connected) with each device. This factor may be called an “effective area fraction.” In some embodiments, the Area_wt function formulation for multiple fan-out networks can be more computationally intensive. The distance function can be computed using the “centroid” of all connected devices and the root-mean-square or average of the distance of each device from the centroid location. In some embodiments, the Area function may be computed by simply extending it to all connected devices, each multiplied by an “effective area fraction.”
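The centroid-based distance and the per-device area factor described above can be sketched as follows. This is an illustrative Python sketch; the function names and the representation of device locations as (x, y) tuples are assumptions.

```python
import math


def centroid(points):
    """Centroid of device locations given as (x, y) tuples."""
    n = len(points)
    if n == 0:
        raise ValueError("at least one device location is required")
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def rms_distance(points):
    """Root-mean-square distance of the devices from their centroid."""
    cx, cy = centroid(points)
    total = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return math.sqrt(total / len(points))


def mean_distance(points):
    """Average distance of the devices from their centroid."""
    cx, cy = centroid(points)
    return sum(math.hypot(x - cx, y - cy) for x, y in points) / len(points)


def effective_area(device_area, num_connected_networks):
    """Device area scaled by 1 / (number of networks connected to the device)."""
    if num_connected_networks < 1:
        raise ValueError("a device must connect to at least one network")
    return device_area / num_connected_networks
```

For four devices at the corners of a 2-by-2 square, both distance measures equal the square root of 2. A device of area 6 that connects to three networks contributes an effective area of 2 to each.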
Some embodiments determine a thermal sensitivity function at the grid level of the chip, or at other levels of granularity (e.g., network level).
Embodiments can determine thermal sensitivity at different chip granularities. For example, embodiments can determine thermal sensitivity at the interconnect level (i.e., two devices and one interconnect), network level (i.e., multiple devices and multiple interconnects), grid level (i.e., defined number of columns on the chip), etc. For each level of granularity, embodiments assume heat is emanating from the center of the region (e.g., center of a network). For example, although heat may be emanating all along a network's interconnects and devices, embodiments assume the heat is emanating from the center of the network. Similarly, at the grid level, embodiments assume heat is emanating from the center of the grid. In the Gaussian equation shown above, μ refers to the x, y coordinates from which heat is emanating. According to some embodiments, the x, y coordinates coincide with the center of the region (e.g., grid, network, interconnect, etc.) whose thermal sensitivity is being determined. Embodiments can determine thermal sensitivity of a larger region (e.g., a network) by adding thermal sensitivity plots for a plurality of smaller regions (e.g., interconnects).
and
$$\mu_z = \mu_x + \mu_y$$
When combining plots for two or more regions that share a center, embodiments need not add the μ factors. Embodiments can determine σz (i.e., the combined value of the σ terms), and generate a composite thermal sensitivity plot by using the Gaussian equations shown above.
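Adding the thermal sensitivity plots of several smaller regions to obtain the plot of a larger region can also be sketched numerically. The Python sketch below sums the plots point by point. It does not implement the closed-form combination of the μ and σ terms. The normalized circular Gaussian and the names `composite_sensitivity` and `hottest_sample` are assumptions made for illustration.

```python
import math


def gaussian_2d(x, y, mu_x, mu_y, sigma):
    """Normalized circular 2-D Gaussian (assumed form of each region's plot)."""
    r_squared = (x - mu_x) ** 2 + (y - mu_y) ** 2
    return math.exp(-r_squared / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)


def composite_sensitivity(regions, x, y):
    """Sum the contributions of all regions at the point (x, y).

    regions: iterable of (mu_x, mu_y, sigma), one entry per smaller region,
    where (mu_x, mu_y) is the assumed heat center of that region.
    """
    return sum(gaussian_2d(x, y, mx, my, s) for mx, my, s in regions)


def hottest_sample(regions, xs, ys):
    """Return (value, (x, y)) for the sampled point with the largest sum."""
    return max((composite_sensitivity(regions, x, y), (x, y)) for x in xs for y in ys)


# Two regions share the center (0, 0); a third is centered at (5, 5).
example_regions = [(0.0, 0.0, 1.0), (0.0, 0.0, 2.0), (5.0, 5.0, 1.0)]
value, location = hottest_sample(example_regions, range(6), range(6))
```

In this example the largest sampled value falls at (0, 0), where the plots of the two co-centered regions reinforce each other.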
The system memory 806 includes a chip design unit 801, which includes a thermal unit 803. The chip design unit 801 can perform any of the operations and calculations described herein. For example, the chip design unit's thermal unit 803 can perform operations on the right side of the flow 600. Furthermore, the thermal unit 803 can determine ΔT Grid and use it as a parameter into the Gaussian function described herein. Any one of these functionalities may be partially (or entirely) implemented in hardware and/or on the processing unit 802. For example, the functionality may be implemented with an application specific integrated circuit, in logic implemented in the processing unit 802, in a co-processor on a peripheral device or card, etc. Further, realizations may include fewer or additional components not illustrated in
As will be appreciated by one skilled in the art, aspects of the present inventive subject matter may be embodied as a system, method or computer program product. Accordingly, aspects of the present inventive subject matter may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present inventive subject matter may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present inventive subject matter may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Aspects of the present inventive subject matter are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the inventive subject matter. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
While the embodiments are described with reference to various implementations and exploitations, it will be understood that these embodiments are illustrative and that the scope of the inventive subject matter is not limited to them. Many variations, modifications, additions, and improvements are possible.
Plural instances may be provided for components, operations or structures described herein as a single instance. Finally, boundaries between various components, operations and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the inventive subject matter. In general, structures and functionality presented as separate components in the exemplary configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements may fall within the scope of the inventive subject matter.