As technology advances in the semiconductor field, devices such as processors incorporate ever-increasing amounts of circuitry. Over time, processor designs have evolved from a collection of independent integrated circuits (ICs), to a single integrated circuit, to multicore processors that include multiple processor cores within a single IC package. As time goes on, ever greater numbers of cores and related circuitry are being incorporated into processors and other semiconductors.
Multicore processors are being extended to include additional functionality by incorporation of other functional units within the processor. Typically, a multicore processor has a common power budget and a common thermal budget. The power budget is set so that a specified power level, at least as averaged over time, is not exceeded. The thermal budget is set such that a thermal throttle point, which is a highest allowable temperature at which the processor can safely operate, is not exceeded. Although these common budgets exist, mechanisms to adaptively share the budgets across the wide variety of circuitry present in a processor do not exist.
In various embodiments, a processor having multiple independent domains can be controlled to prevent a temperature of any of the domains from exceeding a maximum junction temperature of the processor. As used herein, the term “maximum junction temperature” refers to the highest temperature at which a semiconductor product is specified to be fully operational. As an example, this temperature can be determined during device characterization, both during manufacture and during testing in a laboratory environment, and stored in a non-volatile storage of the device. In general, various circuitry of a processor or other product may be present to provide a throttling mechanism such that the processor is prevented from operating above this maximum junction temperature.
Examples described herein are in connection with a multicore processor including multiple processor cores and one or more other processing engines. For example, in a particular embodiment described herein, at least two independent domains may be present, namely a core domain that includes one or more cores and a graphics domain that includes one or more graphics engines. As used herein the term “domain” is used to mean a collection of hardware and/or logic that operates at the same voltage and frequency point. As an example, a multicore processor can further include other non-core processing engines such as fixed function units, graphics engines, and so forth. Such processor can include at least two independent domains, one associated with the cores (referred to herein as a core domain) and one associated with a graphics engine (referred to herein as a graphics domain). Although many implementations of a multi-domain processor can be formed on a single semiconductor die, other implementations can be realized by a multi-chip package in which different domains can be present on different semiconductor die of a single package.
In various embodiments, cross-domain effects with regard to temperature, such that an increased temperature of one domain has an effect on the temperature of another domain, can be considered and taken into account in controlling the temperature of each domain. More specifically, embodiments may provide a programmable temperature margin parameter, referred to herein as a “cross-domain margin,” that is used to determine a marginal temperature level above a throttle temperature on a first domain at which a power consumption level of a different domain is to be reduced in order to allow the first domain temperature to begin cooling. This cross-domain margin may apply even if the power consumption of the different domains does not change over time. As with the throttle points, the cross-domain margin can be stored in a non-volatile storage of the processor.
Embodiments may be performed in various locations. As one example, logic of a power control unit (PCU) of a processor can be used to perform the thermal analysis and temperature control in accordance with an embodiment of the present invention. Referring now to
Method 100 may begin by determining the appropriate domain under review. Thus at diamond 110 it can be determined whether the domain under review is a core domain. The embodiment of
At diamond 120 it can be determined whether the temperature of the core domain is greater than a throttle point. Although the scope of the present invention is not limited in this regard, in some embodiments this throttle point may correspond to a maximum junction temperature (or Tj) at which a given domain of the processor is specified to be fully operational without breakdown. Note that this Tj can be fixed, e.g., in non-volatile storage or fuse logic. However, this throttle point can be configurable, e.g., by software or firmware to a value lower than Tj. In this way, an original equipment manufacturer (OEM) can dial down the throttle point as a function of Tj (e.g., using basic input/output system (BIOS)). As an example and not for purposes of limitation, for a multicore processor the throttle point for a core domain can be set at between approximately 80 and 110 degrees Celsius (and may be set between approximately the same or a different range for a non-core domain). In some embodiments, this throttle point can be determined by testing, e.g., during fabrication, and stored in a non-volatile storage or other mechanism of the processor. However, in other embodiments the throttle point can be dynamically changed, e.g., based on a history of the semiconductor product such that the throttle point can be reduced as the product ages due to various degradation mechanisms of the semiconductor product.
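The relationship between the fixed Tj and a configurable throttle point can be sketched as follows. This is a minimal Python illustration under stated assumptions, not a firmware interface: the function name `effective_throttle_point`, its parameters, and the choice to cap a requested value at Tj are all hypothetical and introduced only for the example.

```python
from typing import Optional


def effective_throttle_point(tj_max_c: float,
                             requested_c: Optional[float] = None) -> float:
    """Return the throttle point in degrees Celsius.

    tj_max_c:    fixed maximum junction temperature (e.g., read from fuses).
    requested_c: hypothetical OEM/BIOS override; None means "not configured".
    """
    if requested_c is None:
        # No override: throttle at the maximum junction temperature itself.
        return tj_max_c
    # Assumption: an override may only lower the throttle point, so a
    # request above Tj is capped at Tj.
    return min(requested_c, tj_max_c)
```

With a Tj of 100 degrees Celsius, for instance, a requested value of 90 yields a throttle point of 90, while a requested value of 105 would be capped at 100 under this assumption.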
Note that this throttle point can be reached at different performance levels of a processor. For example, according to an operating system (OS)-based mechanism, namely the Advanced Configuration and Power Interface (ACPI) standard (e.g., Rev. 3.0b, published Oct. 10, 2006), a processor can operate at various performance states or levels, namely from P0 to PN. In general, the P1 performance state may correspond to the highest guaranteed performance state that can be requested by an OS. In addition to this P1 state, the OS can further request a higher performance state, namely a P0 state. This P0 state may thus be an opportunistic state in which, when power and/or thermal budget is available, processor hardware can configure the processor or at least portions thereof to operate at a higher than guaranteed frequency. In many implementations a processor can include multiple so-called bin frequencies above a guaranteed maximum frequency, also referred to as a P1 frequency. At any of these performance states, and more likely in a P0 or P1 state, the throttle point may be reached.
If it is determined that the core domain temperature is greater than this throttle point, control passes to block 140 where a frequency of the core domain can be reduced by a selected amount. This reduction in frequency in turn causes the power consumption of the domain also to decrease, leading to a decrease in the domain's temperature. As one example, the selected amount may correspond to a degradation of a predetermined amount of a frequency bin. For example, the core domain frequency can be reduced by 1/N of a bin frequency. As used herein, a “bin frequency” corresponds to a smallest multiple by which a domain frequency can be updated. In some embodiments this bin frequency can be an integer multiple of a bus clock frequency, although the scope of the present invention is not limited in this regard.
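The arithmetic of a 1/N-bin reduction can be illustrated with a short Python sketch. The function name `reduced_frequency`, the use of megahertz, and the sample values are assumptions chosen for illustration; the sketch models only the literal subtraction and says nothing about how hardware applies the change.

```python
from fractions import Fraction


def reduced_frequency(freq_mhz, bin_mhz, n: int) -> Fraction:
    """Lower freq_mhz by 1/n of one bin frequency (all values hypothetical).

    Exact rational arithmetic is used so that repeated fractional steps
    do not accumulate floating-point error.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    step = Fraction(bin_mhz) / n           # 1/N of a bin
    # Never report a negative frequency.
    return max(Fraction(freq_mhz) - step, Fraction(0))


# Example: a 100 MHz bin with N = 4 gives a 25 MHz step.
assert reduced_frequency(3000, 100, 4) == 2975
```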
Note that in many implementations, rather than performing frequency reductions by less than a bin frequency amount, embodiments allow for multiple iterations of the frequency control algorithm of
Note that the mechanism to reduce the core domain frequency can be performed in different manners. For example, frequency control logic of the PCU can receive an instruction to update the core domain frequency. In turn, the frequency control logic may select various instructions to be sent to cause the frequency to be reduced. For example, various control signals can be sent to one or more phase-locked loops (PLLs) or other frequency control mechanisms to cause the frequency to be reduced. After this frequency change, control passes to block 180, where a check for temperature of another domain can begin.
If instead at diamond 120 it is determined that the core domain temperature is not greater than the throttle point, control passes next to diamond 130, where it can be determined whether the graphics domain temperature is greater than a sum of the throttle point (for the graphics domain) and a cross-domain margin. Thus as discussed above, a temperature of one domain can affect the temperature of another domain. And accordingly, if the graphics domain temperature is greater than this sum of throttle point and cross-domain margin, the same adjustment to the core domain frequency can be performed, namely a reduction by a selected amount occurs at block 140. Of course, a different amount of reduction can be effected in this case, e.g., 1/X of a bin frequency. Note that both X and N are independent parameters that can be stored in non-volatile storage.
Otherwise, there is no potential thermal violation and accordingly the current frequency of the core domain can be maintained. Thus at this point, control passes to diamond 150, where an analysis with regard to the graphics domain temperature can be made. As seen, this determination can also be reached if the domain under review is the graphics domain. Thus at diamond 150 it can be determined whether the graphics domain temperature is greater than a throttle point. If so, control passes to block 170 where the graphics domain frequency can be reduced by a selected amount. Note that this reduction in graphics domain frequency can proceed as discussed above with regard to block 140. For example, the graphics frequency can be reduced by 1/N of a bin frequency. Note that the different domains can have different bin frequencies. Control thereafter passes back to block 180 discussed above.
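The decisions at diamonds 110, 120, 130 and 150 described above can be summarized in a short Python sketch. It is an illustrative reading of the text, not the PCU implementation: the names `ThermalConfig` and `evaluate`, the string labels for the domains, and the sample numbers are hypothetical. The description of what follows a negative result at diamond 150 is not covered above, so the sketch simply returns `None` in that case, and the hand-off to the next domain at block 180 is left to the caller.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Optional, Tuple


@dataclass(frozen=True)
class ThermalConfig:
    core_throttle_c: float        # throttle point of the core domain
    gfx_throttle_c: float         # throttle point of the graphics domain
    cross_domain_margin_c: float  # programmable cross-domain margin
    n: int                        # divisor for an own-domain reduction
    x: int                        # divisor for a cross-domain reduction


def evaluate(domain: str, core_temp_c: float, gfx_temp_c: float,
             cfg: ThermalConfig) -> Optional[Tuple[str, Fraction]]:
    """Return (domain to slow, fraction of its bin frequency), or None."""
    if domain == "core":                                   # diamond 110
        if core_temp_c > cfg.core_throttle_c:              # diamond 120
            return ("core", Fraction(1, cfg.n))            # block 140
        limit = cfg.gfx_throttle_c + cfg.cross_domain_margin_c
        if gfx_temp_c > limit:                             # diamond 130
            return ("core", Fraction(1, cfg.x))            # block 140
        # Otherwise the core frequency is maintained.
    if gfx_temp_c > cfg.gfx_throttle_c:                    # diamond 150
        return ("graphics", Fraction(1, cfg.n))            # block 170
    return None  # Assumption: no reduction is modeled beyond this point.


cfg = ThermalConfig(100.0, 100.0, 5.0, n=4, x=8)
# Graphics is 6 degrees over its throttle point, beyond the 5 degree margin,
# so the (cooler) core domain is slowed by 1/X of a bin.
assert evaluate("core", 95.0, 106.0, cfg) == ("core", Fraction(1, 8))
```

Returning the reduction as a fraction of a bin, rather than as an absolute frequency, keeps the sketch valid when the two domains have different bin frequencies.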
Still referring to
Thus as seen,
Note that the value of N can be a tunable or programmable parameter that determines the rate of cooling achieved by thermal throttling in accordance with an embodiment of the present invention. For a small value of N (e.g., less than approximately 2), the rate at which the frequency can be reduced is faster than for a larger value of N. However, such smaller N values may lead to oscillations in which a domain can be overcooled. In some embodiments, the value of N may be programmable, e.g., by a user using a user-level instruction. In other embodiments, the value of N can be configured via a configuration register or stored in a non-volatile storage, e.g., of the PCU.
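The effect of N on the rate of frequency reduction can be made concrete by counting control iterations. The following Python sketch assumes one 1/N-bin reduction per iteration; the function name `iterations_to_shed` and the sample values are hypothetical, and no thermal response (and hence no oscillation behavior) is modeled.

```python
import math
from fractions import Fraction


def iterations_to_shed(delta_mhz: int, bin_mhz: int, n: int) -> int:
    """Iterations needed to lower frequency by at least delta_mhz.

    Assumes each iteration removes exactly 1/n of one bin frequency.
    """
    if delta_mhz < 0 or bin_mhz <= 0 or n <= 0:
        raise ValueError("delta_mhz must be >= 0; bin_mhz and n must be > 0")
    step = Fraction(bin_mhz, n)            # reduction per iteration
    return math.ceil(Fraction(delta_mhz) / step)


# Shedding two 100 MHz bins takes 2 iterations with N = 1 but 8 with N = 4.
assert iterations_to_shed(200, 100, 1) == 2
assert iterations_to_shed(200, 100, 4) == 8
```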
Thus in a multi-domain processor, embodiments can enable thermal control such that not only does a domain that is operating above its thermal specification have its temperature reduced, but at least one other domain within the same die can have its temperature reduced. In this way, the contribution to thermal heating of the domain under consideration by another domain can be reduced or removed.
Referring now to
In various embodiments, power control unit 355 may include a thermal control logic 359, which may be a logic to control domain frequencies based on temperature not only on the domain to be controlled, but also other domains of the processor. In the embodiment of
With further reference to
Referring now to
In general, each core 410 may further include low level caches in addition to various execution units and additional processing elements. In turn, the various cores may be coupled to each other and to a shared cache memory formed of a plurality of units of a last level cache (LLC) 440₀-440ₙ. In various embodiments, LLC 440 may be shared amongst the cores and the graphics engine, as well as various media processing circuitry. As seen, a ring interconnect 430 thus couples the cores together, and provides interconnection between the cores, graphics domain 420 and system agent circuitry 450.
In the embodiment of
As further seen in
Embodiments may be implemented in many different system types. Referring now to
Still referring to
Furthermore, chipset 590 includes an interface 592 to couple chipset 590 with a high performance graphics engine 538, by a P-P interconnect 539. In turn, chipset 590 may be coupled to a first bus 516 via an interface 596. As shown in
Embodiments may be implemented in code and may be stored on a non-transitory storage medium having stored thereon instructions which can be used to program a system to perform the instructions. The storage medium may include, but is not limited to, any type of disk including floppy disks, optical disks, solid state drives (SSDs), compact disk read-only memories (CD-ROMs), compact disk rewritables (CD-RWs), and magneto-optical disks, semiconductor devices such as read-only memories (ROMs), random access memories (RAMs) such as dynamic random access memories (DRAMs), static random access memories (SRAMs), erasable programmable read-only memories (EPROMs), flash memories, electrically erasable programmable read-only memories (EEPROMs), magnetic or optical cards, or any other type of media suitable for storing electronic instructions.
While the present invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of this present invention.
Number | Date | Country |
---|---|---|
20130080804 A1 | Mar 2013 | US |