As technology advances in the semiconductor field, devices such as processors incorporate ever-increasing amounts of circuitry. Over time, processor designs have evolved from a collection of independent integrated circuits (ICs), to a single integrated circuit, to multicore processors that include multiple processor cores within a single IC package. As time goes on, ever greater numbers of cores and related circuitry are being incorporated into processors and other semiconductors.
Multicore processors are being extended to include additional functionality by incorporation of other functional units within the processor. These units share a single power budget. As a result, the higher the frequency at which a core portion of the processor operates, the higher the power consumed by the core portion, leaving less power for other portions of the processor. Suitable mechanisms to ensure that these different units have sufficient power do not presently exist.
In various embodiments, a non-core domain of a multi-domain processor can cause a core domain's frequency to be reduced. This frequency reduction can be by way of control of a maximum allowable frequency at which the core domain can run (commonly referred to as a maximum turbo frequency) as well as control of a guaranteed frequency at which the core domain can run. In this way, a non-core domain (or software, firmware, or logic of such domain) can override a setting or request of an operating system (OS) or other scheduler for higher core frequencies. Thus this non-core domain entity can dynamically alter the range of frequencies at which a core domain is enabled to run, and greater power and/or thermal budget can be allocated to the non-core domain. Although described herein with regard to a multi-domain processor including a core domain having one or more processor cores and a non-core domain that in an illustrative embodiment is a graphics engine, understand that the scope of the present invention is not limited in this regard and that in other embodiments more than two domains can be present, each of which can have dynamically alterable frequency ranges.
By providing this non-core domain control, a common budget for a processor, e.g., a shared power budget, can be more efficiently used. In general, an OS determines an operating frequency for a core domain based on core utilization only. Other domains, however, may not benefit from a selected operating frequency for the core domain. For example, a graphics domain may not benefit from a given core operating frequency. For a particular graphics workload, a graphics driver, which can execute within the core domain, can perform a so-called spin loop to check whether a parceled amount of work given to the graphics domain has completed. This spin loop consumes power as it iteratively checks whether the work has completed. By always running, this driver can thus increase core utilization and cause an OS to select a relatively high operating frequency at which to run the core domain. Assuming that this type of scenario causes the core domain to enter into a greater-than-guaranteed frequency such as a given turbo frequency, it is possible, due to the common power (and thermal) budget, that operation of the graphics domain, which is actually performing the major workload of the processor in this illustration, can be constrained. Instead, using embodiments of the present invention, the graphics domain can aid in controlling the operating frequency of the core domain. In this way, a power budget can be more efficiently shared between core and non-core domains.
Note that the operating frequency for a core domain selected by the OS can be in accordance with an OS-based power control implementation such as the Advanced Configuration and Power Interface (ACPI) standard (e.g., Rev. 3.0b, published Oct. 10, 2006). According to ACPI, a processor can operate at various performance states or levels, namely from P0 to PN. In general, the P1 performance state may correspond to the highest guaranteed performance state that can be requested by the OS. In addition to this P1 state, the OS can further request a higher performance state, namely a P0 state. This P0 state may thus be an opportunistic state in which, when power and/or thermal budget is available, processor hardware can configure the processor or at least portions thereof to operate at a higher than guaranteed frequency. In many implementations a processor can include multiple so-called bin frequencies above a guaranteed maximum frequency, also referred to as a P1 frequency. As used herein, a “bin frequency” corresponds to a smallest multiple by which a domain frequency can be updated. In some embodiments this bin frequency can be an integer multiple of a bus clock frequency, although the scope of the present invention is not limited in this regard.
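The relationship between bin frequencies, the bus clock, and the P1 and P0 states can be sketched as follows. This is a minimal illustration only, assuming a 100 MHz bus clock and one bin per bus clock multiple; the ratio values and the names `P1_RATIO`, `MAX_TURBO_RATIO`, `ratio_to_mhz`, and `turbo_bins` are hypothetical and do not describe any particular processor.

```python
BCLK_MHZ = 100  # assumed bus clock frequency; one bin equals one BCLK multiple

# Hypothetical ratios chosen only for illustration.
P1_RATIO = 24         # guaranteed (P1) frequency = 24 x BCLK
MAX_TURBO_RATIO = 30  # highest opportunistic (P0) frequency = 30 x BCLK


def ratio_to_mhz(ratio: int, bclk_mhz: int = BCLK_MHZ) -> int:
    """Convert a frequency ratio (a count of bins) to MHz."""
    return ratio * bclk_mhz


def turbo_bins(p1_ratio: int, max_turbo_ratio: int) -> list[int]:
    """List the bin frequencies (in MHz) that lie above the P1 frequency."""
    return [ratio_to_mhz(r) for r in range(p1_ratio + 1, max_turbo_ratio + 1)]


p1_mhz = ratio_to_mhz(P1_RATIO)
bins_above_p1 = turbo_bins(P1_RATIO, MAX_TURBO_RATIO)
```

Under these assumed values, the guaranteed frequency is 2400 MHz and six bin frequencies, from 2500 MHz to 3000 MHz, are available opportunistically in the P0 state.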
Referring now to
If the operating frequency of the non-core domain is lower than the requested frequency, control passes to block 120, where a request can be sent to the power controller to reduce a core domain frequency. That is, responsive to this determination, the driver can send a request to cause the core domain to operate at a lower frequency. Thus, in contrast to conventional systems, a non-OS, non-core domain entity can make a request to cause an OS-controlled core domain to operate at a lower frequency. Although the embodiment just discussed is with regard to graphics driver control of the operating frequency range, in other implementations other privileged level code, such as ring zero code, can perform this request. Note that in some embodiments, there can be limitations on allowing the graphics driver (and more generally a non-core domain) to control the operating frequency of the core domain. For example, with respect to a graphics engine, the graphics driver may only be allowed to control the core domain frequency for certain workloads, such as a 3D workload or a full-screen mode operation.
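The decision just described can be sketched in a few lines. The names `should_request_core_reduction` and `ALLOWED_WORKLOADS` are hypothetical, and the workload restriction shown is only the example limitation mentioned above, not a required policy.

```python
# Workloads for which the driver may control the core domain frequency.
# This set mirrors the example in the text; it is an assumed policy.
ALLOWED_WORKLOADS = {"3d", "full_screen"}


def should_request_core_reduction(noncore_operating_mhz: int,
                                  noncore_requested_mhz: int,
                                  workload: str) -> bool:
    """Return True when the driver should ask the power controller to
    reduce the core domain frequency (the block 120 action)."""
    if workload not in ALLOWED_WORKLOADS:
        # The driver is not permitted to override the OS for this workload.
        return False
    # The non-core domain is running below the frequency it requested.
    return noncore_operating_mhz < noncore_requested_mhz
```

For instance, a graphics engine that requested 1300 MHz but is running at 900 MHz during a 3D workload would trigger a request, while the same shortfall during a workload outside the allowed set would not.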
As further seen in
Different manners of effecting a request for reduced frequency of the core domain are possible. In one particular implementation, a non-core domain can have access to certain configuration registers, e.g., of a PCU, that can enable dynamic control of a frequency range at which a core domain can operate. Such dynamic control can thus act to limit the core domain's frequency to below its configured frequency levels, e.g., as set by fuses and/or configuration information for the particular processor as set during manufacture. In the context of an ACPI implementation, this lowering of the guaranteed operating frequency can thus violate the P1 guaranteed frequency at which the core domain is guaranteed to run.
Referring now to
If the frequency limit is due to a package power limitation, control passes to block 240 where the non-core entity can determine the core domain frequency. In one embodiment, this determination can be made by reference to a current operating frequency register of the PCU that stores a current operating frequency. Then at diamond 250, it can be determined whether this core domain frequency is at or greater than a guaranteed frequency level. As discussed above, the guaranteed frequency level may correspond to a P1 frequency. In this way, the driver can determine whether the core domain is operating at least at its guaranteed frequency point of P1. If not, control passes to block 260, where the driver can cause an update to a configuration register to cause the guaranteed frequency for the core domain to be reduced. As one example, a single configuration register may include multiple fields that expose dynamic frequency range control to a driver for a non-core domain. In this example, a first field of such configuration register can be used to cause a reduction in a guaranteed operating frequency of the core domain. As a result of this update, the PCU logic can, e.g., during a next iteration of its power control firmware, cause a reduction in the guaranteed operating frequency for the core domain. Thus if the core domain is currently operating at the guaranteed operating frequency, this frequency reduction can occur in that next iteration. If not, the information in the field can be used by the PCU logic during subsequent requests by the core domain for entry into the guaranteed performance mode.
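How the PCU logic might consume the guaranteed-frequency field on its next firmware iteration can be sketched as follows. This is an assumption-laden model, not the actual firmware: `PcuState`, `pcu_iteration`, and `request_guaranteed_mode` are hypothetical names, frequencies are modeled as ratios of an assumed 100 MHz bus clock, and the field is assumed to hold an offset subtracted from the fused guaranteed ratio.

```python
from dataclasses import dataclass

BCLK_MHZ = 100  # assumed bus clock frequency


@dataclass
class PcuState:
    fused_p1_ratio: int         # guaranteed ratio configured at manufacture
    guaranteed_offset: int = 0  # register field written by the driver
    core_ratio: int = 0         # ratio the core domain currently runs at
    applied_p1_ratio: int = 0   # guaranteed ratio applied in the last iteration

    def __post_init__(self) -> None:
        self.applied_p1_ratio = self.fused_p1_ratio
        if self.core_ratio == 0:
            # Default: the core domain starts at its guaranteed frequency.
            self.core_ratio = self.fused_p1_ratio


def pcu_iteration(state: PcuState) -> None:
    """One pass of the (hypothetical) power control firmware loop."""
    new_p1 = state.fused_p1_ratio - state.guaranteed_offset
    # If the core domain sits at the old guaranteed point, lower it now.
    if state.core_ratio == state.applied_p1_ratio:
        state.core_ratio = new_p1
    # Otherwise the new value only takes effect on later requests.
    state.applied_p1_ratio = new_p1


def request_guaranteed_mode(state: PcuState) -> int:
    """The core domain requests the guaranteed performance mode; the PCU
    grants the (possibly reduced) guaranteed ratio. Returns MHz."""
    state.core_ratio = state.applied_p1_ratio
    return state.core_ratio * BCLK_MHZ
```

In this model, a core domain already running at the guaranteed frequency drops on the next iteration, while a core domain running at some other frequency is unaffected until it next requests the guaranteed performance mode.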
Referring still to
As a result of these operations, the core domain will likely consume less power and accordingly a greater amount of a package power budget can be allocated to the non-core domain. Understand that although shown with only two domains in the embodiment of
Referring now to Table 1, shown is an example of a configuration register, which may be present within a PCU, to enable non-core domain control of a core domain operating frequency.
Table 1 is thus for a single configuration register example including multiple fields, one of which enables control of maximum turbo frequency and a second of which enables control of guaranteed frequency. Specifically in the embodiment of Table 1, the PST_OFFSET field provides, e.g., a graphics driver, a means of lowering the fused P1 frequency of the core domain. Assuming a bus clock (BCLK) frequency of 100 megahertz (MHz), an entry of 3 in the PST_OFFSET field corresponds to a request to lower the core domain guaranteed frequency by 3×100=300 MHz. In turn, a second field, namely PST_LIM, provides a means of capping the highest frequency at which the core domain can execute. Again assuming a BCLK frequency of 100 MHz, an entry of 20 in the PST_LIM field corresponds to a request to lower the maximum turbo frequency to 20×100=2000 MHz. Note that this implementation of Table 1 thus provides a ratioed value corresponding to the bus clock frequency. In other embodiments, rather than a ratio amount, a fixed value control can be provided, such that the values written to the fields of the configuration register can correspond to the maximum and minimum of the frequency range, respectively.
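The arithmetic for the two fields can be worked through in a short sketch. The field names PST_OFFSET and PST_LIM and the 100 MHz BCLK come from the description above; the bit positions and 8-bit widths, the helper names `pack_register` and `core_frequency_range`, and the fused P1 frequency of 1800 MHz used in the example are assumptions made only for illustration.

```python
BCLK_MHZ = 100  # bus clock frequency assumed in the text

# Hypothetical layout: the actual bit positions and widths are assumptions.
PST_OFFSET_SHIFT, PST_OFFSET_MASK = 0, 0xFF
PST_LIM_SHIFT, PST_LIM_MASK = 8, 0xFF


def pack_register(pst_offset: int, pst_lim: int) -> int:
    """Pack the two ratio fields into a single register value."""
    if not 0 <= pst_offset <= PST_OFFSET_MASK:
        raise ValueError("PST_OFFSET out of range")
    if not 0 <= pst_lim <= PST_LIM_MASK:
        raise ValueError("PST_LIM out of range")
    return (pst_offset << PST_OFFSET_SHIFT) | (pst_lim << PST_LIM_SHIFT)


def core_frequency_range(reg: int, fused_p1_mhz: int) -> tuple[int, int]:
    """Return (guaranteed_mhz, max_turbo_mhz) implied by the register."""
    pst_offset = (reg >> PST_OFFSET_SHIFT) & PST_OFFSET_MASK
    pst_lim = (reg >> PST_LIM_SHIFT) & PST_LIM_MASK
    guaranteed_mhz = fused_p1_mhz - pst_offset * BCLK_MHZ  # lowered P1
    max_turbo_mhz = pst_lim * BCLK_MHZ                     # turbo cap
    return guaranteed_mhz, max_turbo_mhz


reg = pack_register(pst_offset=3, pst_lim=20)
freq_range = core_frequency_range(reg, fused_p1_mhz=1800)
```

With the assumed fused P1 frequency of 1800 MHz, `freq_range` is `(1500, 2000)`: the guaranteed frequency is lowered by 300 MHz and the maximum turbo frequency is capped at 2000 MHz.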
The configuration register example of Table 1 can be used in connection with the methods described herein such that at run time, if the graphics driver detects a case where the graphics engine is not able to operate at its maximum turbo frequency, it can issue a request to the PCU to update the configuration register or registers accordingly to lower the frequency range at which the core domain is allowed to run. In this way, the driver can dynamically update a frequency range for the core domain such that additional power headroom for the graphics domain can be obtained.
Referring now to
Referring now to
In various embodiments, power control unit 355 may include a frequency control logic 359, which may be a logic to control frequency of different domains of the processor. In the embodiment of
With further reference to
Referring now to
In general, each core 410 may further include low level caches in addition to various execution units and additional processing elements. In turn, the various cores may be coupled to each other and to a shared cache memory formed of a plurality of units of a last level cache (LLC) 440_0-440_n. In various embodiments, LLC 440 may be shared amongst the cores and the graphics engine, as well as various media processing circuitry. As seen, a ring interconnect 430 thus couples the cores together, and provides interconnection between the cores, graphics domain 420 and system agent circuitry 450.
In the embodiment of
As further seen in
Embodiments may be implemented in many different system types. Referring now to
Still referring to
Furthermore, chipset 590 includes an interface 592 to couple chipset 590 with a high performance graphics engine 538, by a P-P interconnect 539. In turn, chipset 590 may be coupled to a first bus 516 via an interface 596. As shown in
Embodiments may be implemented in code and may be stored on a non-transitory storage medium having stored thereon instructions which can be used to program a system to perform the instructions. The storage medium may include, but is not limited to, any type of disk including floppy disks, optical disks, solid state drives (SSDs), compact disk read-only memories (CD-ROMs), compact disk rewritables (CD-RWs), and magneto-optical disks, semiconductor devices such as read-only memories (ROMs), random access memories (RAMs) such as dynamic random access memories (DRAMs), static random access memories (SRAMs), erasable programmable read-only memories (EPROMs), flash memories, electrically erasable programmable read-only memories (EEPROMs), magnetic or optical cards, or any other type of media suitable for storing electronic instructions.
While the present invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of the present invention.
This application is a continuation of U.S. patent application Ser. No. 13/782,578, filed Mar. 1, 2013, which is a continuation of U.S. patent application Ser. No. 13/282,947, filed Oct. 27, 2011, now U.S. Pat. No. 9,026,815, issued May 5, 2015, the content of which is hereby incorporated by reference.
Number | Date | Country |
---|---|---|
20160026229 A1 | Jan 2016 | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | 13782578 | Mar 2013 | US |
Child | 14875930 | | US |
Parent | 13282947 | Oct 2011 | US |
Child | 13782578 | | US |