Advances in semiconductor processing and logic design have permitted an increase in the amount of logic that may be present on integrated circuit devices. As a result, computer system configurations have evolved from a single or multiple integrated circuits in a system to multiple hardware threads, multiple cores, multiple devices, and/or complete systems on individual integrated circuits. Additionally, as the density of integrated circuits has grown, the power requirements for computing systems (from embedded systems to servers) have also escalated. Furthermore, software inefficiencies, and the demands that software places on hardware, have also caused an increase in computing device energy consumption. In fact, some studies indicate that computing devices consume a sizeable percentage of the entire electricity supply for a country, such as the United States of America. As a result, there is a vital need for energy efficiency and conservation associated with integrated circuits. These needs will increase as servers, desktop computers, notebooks, ultrabooks, tablets, mobile phones, processors, embedded systems, etc. become even more prevalent (from inclusion in the typical computer, automobiles, and televisions to biotechnology).
In many computing environments, it is an established fact that for much of the time, computing systems such as servers are operating well below their peak performance level. During these periods of low utilization the focus is on saving as much power as possible in order to reduce energy costs. Power management technologies can deliver significant power savings during periods of low utilization. However, any power management technology involves a power/performance tradeoff.
Due to increasing integration, many processors can include power management technologies which can control up to two-thirds of total platform power. In many cases these technologies are controlled by a power control unit (PCU) in the processor. Each power management feature is specifically tuned in design to achieve an optimal power/performance tradeoff. At the time of tuning, there is little knowledge of the actual workload and usage pattern for the system in the field. Given this lack of knowledge, the tuning process is conservative and is necessarily biased towards losing as little performance as possible. This approach prevents significant power savings for an end user who is willing to tolerate more performance loss in return for power savings.
Thus power management features are typically statically tuned to tolerate very little performance loss. This results in several downsides. First, at low utilizations where an end user can tolerate high performance loss, available power savings are not realized. Second, an end user typically has no choice regarding power/performance tradeoffs, other than default profiles provided by an operating system (OS). Given the complexity involved in tuning power management features, end users rarely venture into tuning individual features for their target usage, and thus the potential benefit of the features is often not realized.
Embodiments provide a so-called energy performance bias (EPB) as an architectural feature. Control of this parameter allows for a simple high level input from an end user to indicate the end user's power/performance tradeoff preference. This input can be used to provide multiple tuning levels with different points of power and performance tradeoff. By associating this energy performance bias with direct user input, embodiments enable the end user to directly control the power/performance tradeoff in a simple manner. As used herein the terms “end user” or “user” are understood to include computer users of varying degrees, including technical and non-technical users, information technology (IT) personnel, data center personnel and so forth.
Thus instead of providing complete tuning flexibility for each power management technology and allowing an end user to tune each feature, a single input can be provided by the user to control these different features. The EPB value may thus correspond to a single input value to control a plurality of power management features. Furthermore, understand that the EPB value can be provided by a variety of external entities including but not limited to an operating system (OS), a basic input/output system (BIOS), an external embedded controller of a platform such as a baseboard management controller (BMC), or data center central management software that communicates the value to a platform via a network and a node manager device or so forth, among others, either automatically or via a user. In some embodiments, the end user may be prevented from individually controlling each feature. As such, the inherent difficulty in exposing all of a large number of power management features to the end user can be avoided, particularly as most end users have little or no knowledge as to how to tune such individual features. In practice, 90% of server users never change the default power management configuration for the server, and a very similar dilemma exists in client usage models as well.
Table 1 below shows a list of power management features available in different processor models and which can be controlled globally using one or more EPB inputs in accordance with an embodiment of the present invention. It can be seen from Table 1 that the features are numerous and disparate. This list continues to grow with each generation of processor. As seen, power management features can be performed within a processor itself, a memory or memory interconnect, or other interconnect structures. Embodiments can provide for global control of a plurality of power management features of a processor including, in some embodiments, those shown in Table 1. However, understand that this listing is exemplary only, and other power management features can be controlled using an EPB value in accordance with an embodiment of the present invention.
A power/performance loadline is a well-accepted representation of power consumption of a system at different utilizations. This loadline represents the power consumed at each delivered performance level and takes into account the performance impact of a power management feature. Via an energy performance bias setting in accordance with an embodiment of the present invention, tuning of a power performance loadline can be realized. This setting or slider thus allows an end user to choose from a range of settings between performance-oriented tuning and power saving-oriented tuning. Each EPB value can be mapped to a corresponding level of tradeoff between power and performance.
Although the following embodiments are described with reference to energy conservation and energy efficiency in specific integrated circuits, such as in computing platforms or processors, other embodiments are applicable to other types of integrated circuits and logic devices. Similar techniques and teachings of embodiments described herein may be applied to other types of circuits or semiconductor devices that may also benefit from better energy efficiency and energy conservation. For example, the disclosed embodiments are not limited to any particular type of computer systems, and may be also used in other devices, such as handheld devices, systems on chip (SoCs), and embedded applications. Some examples of handheld devices include cellular phones, Internet protocol devices, digital cameras, personal digital assistants (PDAs), and handheld PCs. Embedded applications typically include a microcontroller, a digital signal processor (DSP), network computers (NetPC), set-top boxes, network hubs, wide area network (WAN) switches, or any other system that can perform the functions and operations taught below. Moreover, the apparatuses, methods, and systems described herein are not limited to physical computing devices, but may also relate to software optimizations for energy conservation and efficiency. As will become readily apparent in the description below, the embodiments of methods, apparatuses, and systems described herein (whether in reference to hardware, firmware, software, or a combination thereof) are vital to a ‘green technology’ future, such as for power conservation and energy efficiency in products that encompass a large portion of the US economy.
Referring now to
In one usage model, where an operating system supports multiple power/performance profiles, the OS configures the energy performance bias register to an appropriate value based on the power/performance profile chosen by an administrator, who is one example of an end user. To this end, embodiments may provide a user interface to request and receive user selection of a given EPB value. This user interface functionality can be accommodated within BIOS or an OS or other custom system level software. In an advanced usage model, a data center administrator or other information technology personnel can provide the input based on time of day policies. For example, during times of peak usage, the administrator can choose to configure a value of 0, while during non-peak hours the administrator can choose to configure the server to a value of 15 to save as much power as possible.
In another usage model, which provides advanced application level control and supports application monitoring of the level of performance that the application expects to receive under a level of service agreement, the application may tune, through an OS service, the level of performance loss that is acceptable to the application under its current level of operation.
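As an illustrative and non-limiting sketch of the time of day policy described above, the following Python fragment selects an EPB value of 0 during peak hours and 15 otherwise. The function name `select_epb` and the peak window of 08:00 to 18:00 are assumptions made purely for illustration; only the values 0 and 15 are taken from the example above.

```python
from datetime import time

# Values taken from the usage example: 0 favors performance, 15 favors power savings.
EPB_MAX_PERFORMANCE = 0
EPB_MAX_POWER_SAVINGS = 15


def select_epb(now, peak_start=time(8, 0), peak_end=time(18, 0)):
    """Return the EPB value a time-of-day policy would request at `now`.

    The peak window bounds are hypothetical defaults, not part of the source text.
    """
    in_peak = peak_start <= now < peak_end
    return EPB_MAX_PERFORMANCE if in_peak else EPB_MAX_POWER_SAVINGS


if __name__ == "__main__":
    print(select_epb(time(9, 30)))   # inside the assumed peak window
    print(select_epb(time(23, 0)))   # outside the assumed peak window
```

In such a sketch, management software could evaluate the policy periodically and hand the result to whichever entity (e.g., OS or BMC) provides the EPB value to the platform.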
As seen in
Still referring to
Still referring to
Referring now to
As further seen in table 150, each entry includes a plurality of fields. In the embodiment shown in
Based on the bin value, a respective column of the table can be accessed and all of the fields of the accessed column can be read out, e.g., as a single vector read or iteratively via a field of an entry per cycle. As will be discussed below, these values output from the table can be provided to a change detector 160. Although shown at this high level in the embodiment of
In some embodiments, in addition to the EPB input, a workload configuration input can be provided. To this end, the table can have 3 dimensions such that based on the workload configuration input, a different set of entries for the defined power management features can be accessed, as different values may be present in the table for different workload configurations. By this workload configuration input a vertical user having an understanding and control of the exact workloads running on their systems can benefit from well-tuned settings. As examples, a user can configure a workload input as non-uniform memory architecture (NUMA), uniform memory architecture (UMA), input/output (I/O) intensive, etc. This input allows for choosing tuning settings that favor a specific workload pattern. For example, if the workload is NUMA, aggressive settings can be applied to off-chip interconnects such as Intel® Quick Path Interconnect (QPI) links to save as much power as possible while causing very little performance impact, as off-chip accesses can be expected to be low. Thus for embodiments in which a workload configuration input is provided, it can be used as an additional input to access the table.
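The table access described above can be sketched, by way of example only, as a lookup indexed by workload configuration and bin. In the following Python fragment the 4-bit EPB range (0 to 15), the choice of four bins, the feature names, and all setting values are hypothetical placeholders chosen for illustration and are not values of any actual tuning table.

```python
NUM_BINS = 4        # assumption: the number of bins is not fixed by the text
EPB_RANGE = 16      # assumption: EPB values span 0..15


def epb_to_bin(epb):
    """Quantize an EPB value into one of NUM_BINS equally sized bins."""
    if not 0 <= epb < EPB_RANGE:
        raise ValueError("EPB value out of range")
    return epb * NUM_BINS // EPB_RANGE


# Hypothetical 3-dimensional table: workload -> bin -> feature -> setting.
# Settings are unitless illustrative levels (higher = more power saving).
TUNING_TABLE = {
    "UMA": [
        {"qpi_link_power_saving": 0, "c_state_depth": 0},
        {"qpi_link_power_saving": 1, "c_state_depth": 1},
        {"qpi_link_power_saving": 2, "c_state_depth": 2},
        {"qpi_link_power_saving": 3, "c_state_depth": 3},
    ],
    # For NUMA, off-chip link settings are more aggressive in every bin.
    "NUMA": [
        {"qpi_link_power_saving": 2, "c_state_depth": 0},
        {"qpi_link_power_saving": 3, "c_state_depth": 1},
        {"qpi_link_power_saving": 3, "c_state_depth": 2},
        {"qpi_link_power_saving": 3, "c_state_depth": 3},
    ],
}


def lookup_settings(epb, workload="UMA"):
    """Read out all feature settings for the bin selected by the EPB value."""
    return TUNING_TABLE[workload][epb_to_bin(epb)]
```

For instance, under these assumed values `lookup_settings(0, "NUMA")` already applies a higher link power saving level than `lookup_settings(0, "UMA")`, reflecting the expectation that off-chip accesses are low for a NUMA workload.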
Once a new set of configurations is determined from access to the tuning table, the target features can be updated. There are two classes of configurations. For features like power C-state auto demotion and turbo upside clipping, the configurations can be internal to power management code, such as present in firmware of the PCU. In this case the update can be effected by loading new values into an internal feature specific data structure such as one or more configuration registers. In a second class of configuration, the configuration value is specific to an entity that resides outside the PCU. For example, a clock related configuration can be implemented in the memory controller. In this case, the PCU can initiate a series of writes to the target configuration space to update the settings. In order to minimize the number of writes, each configuration value can be compared to the previous value, and a write is issued only if the actual configuration value changes.
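The two classes of update and the write-minimizing comparison described above may be sketched as follows. This Python fragment is illustrative only: the feature names, the `send_write` callback standing in for a write to an external agent, and the dictionary standing in for PCU configuration registers are all hypothetical.

```python
# Hypothetical set of features whose configuration is internal to the PCU.
PCU_INTERNAL_FEATURES = {"c_state_auto_demotion", "turbo_upside_clipping"}


def apply_settings(new_settings, previous, pcu_registers, send_write):
    """Apply only the settings that changed; return the number of external writes.

    new_settings:  feature -> value read from the tuning table
    previous:      feature -> last applied value (updated in place)
    pcu_registers: stand-in for PCU-internal configuration registers
    send_write:    callable(feature, value) modeling a write to an external agent
    """
    external_writes = 0
    for feature, value in new_settings.items():
        if feature in previous and previous[feature] == value:
            continue  # value unchanged since last read out: no update issued
        if feature in PCU_INTERNAL_FEATURES:
            pcu_registers[feature] = value   # first class: internal update
        else:
            send_write(feature, value)       # second class: write to destination
            external_writes += 1
        previous[feature] = value
    return external_writes
```

Applying the same settings twice in succession results in no writes on the second pass, which is the behavior the comparison against the previous value is intended to provide.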
Accordingly, with further reference back to
If instead the power management feature is for a non-PCU controlled feature, update messages can be sent to the destination, e.g., via one or more write messages including the information. Although shown with this particular implementation in the embodiment of
Referring now to
Still referring to
If instead a bin change has occurred, control passes to block 250 where a power-performance table can be accessed based on the bin. This table access thus can be used to read out for the given bin, and for each of multiple power management features, one or more settings or other values used to control the power management feature. This reading and the updating process can be performed iteratively for each feature. Thus it can be determined whether an additional power management feature is present in the table (diamond 255). If not, method 200 terminates for that evaluation interval. If another feature is present, it can be determined whether a change in the feature settings has occurred since the last read out from the table (diamond 260). If not, control passes back to diamond 255. If instead a feature change has occurred, control passes to diamond 270 where it can be determined whether the feature is a PCU-controlled feature. If so, one or more settings for the feature can be updated in the PCU (block 290). Such updating can be via configuration register updates or so forth. Otherwise, if the updates are for a power management feature controlled by an external agent, control passes to block 280 where one or more messages, e.g., write messages, can be sent to the destination agent to update the settings accordingly. For example, writes can be issued to the target using a given message channel (e.g., QPI, integrated memory controller, a Peripheral Component Interconnect Express (PCI Express™ (PCIe™)) link or so forth). Although shown with this particular implementation in the embodiment of
In one embodiment, a tuning methodology for generating the power-performance table may include the following. First, each individual power management feature is tuned separately while the other features are turned off. This tuning includes generating a power/performance profile across a range of workloads for each feature. As an example, various benchmark workloads can be executed to generate the profile. Second, multiple features can be enabled together to profile whether the power/performance tradeoff meets the tuning goals for the overall power/performance profile. Then individual feature tunings can be adjusted to meet the overall power/performance profile goals. The above steps can be repeated until the overall goals are reached.
By default the operating system may populate the EPB input with a profile chosen by the end user. The end user can further tune this value via an iterative process in which different EPB values are input and workloads are run with each setting. As the EPB input is tuned towards power savings, a user may notice a continuous degradation of response time (performance). Once a desired response time is met, the EPB value may correspond to a desired maximum power savings for the user's specific usage. This in essence is the goal of the tunable power performance loadline, as each end user can tune the power/performance loadline to their specific usage.
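One possible form of this iterative process is sketched below in Python, under the stated assumption that response time does not improve as the EPB value moves toward power savings. The function name, the `measure_response_time` callback (standing in for running a workload at a given setting), and the 0 to 15 sweep range are hypothetical and serve only to illustrate the approach.

```python
def find_max_saving_epb(measure_response_time, target, epb_values=range(16)):
    """Return the most power-saving EPB value that still meets the target.

    measure_response_time: callable(epb) -> measured response time for a
        workload run at that setting (hypothetical measurement hook)
    target: largest acceptable response time, in the same units
    Returns None if even the first (most performance-oriented) value misses
    the target. Assumes response time is non-decreasing in the EPB value.
    """
    best = None
    for epb in epb_values:
        if measure_response_time(epb) <= target:
            best = epb      # still acceptable: keep moving toward power savings
        else:
            break           # response time degraded past the target: stop
    return best


if __name__ == "__main__":
    # Synthetic stand-in for a real workload measurement.
    def fake_measure(epb):
        return 10 + 2 * epb

    print(find_max_saving_epb(fake_measure, target=25))
```

With the synthetic measurement shown, the sweep stops at the first setting whose response time exceeds the target and reports the last acceptable setting; a real deployment would substitute actual workload runs for `fake_measure`.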
Embodiments can be implemented in processors for various markets including server processors, desktop processors, mobile processors, and so forth. Referring now to
In various embodiments, power control unit 355 may include a policy tuning logic 359, which may be a logic to perform dynamic control of power management settings based on an input EPB value. As further seen, a tuning table 357 can be present to store power management feature settings. This tuning table, which can be generated, e.g., during design of the processor based on benchmark workload testing on the processor, may be used to determine appropriate settings for the different power management policies that map to EPB values or bins.
With further reference to
Referring now to
In general, each core 410 may further include low level caches in addition to various execution units and additional processing elements. In turn, the various cores may be coupled to each other and to a shared cache memory formed of a plurality of units of a last level cache (LLC) 440₀-440ₙ. In various embodiments, LLC 440 may be shared amongst the cores and the graphics engine, as well as various media processing circuitry. As seen, a ring interconnect 430 thus couples the cores together, and provides interconnection between the cores, graphics domain 420 and system agent circuitry 450. In one embodiment, interconnect 430 can be part of the core domain. However in other embodiments the ring interconnect can be of its own domain.
As further seen, system agent domain 450 may include a display controller 452 which may provide control of and an interface to an associated display. As further seen, system agent domain 450 may include a power control unit 455 which can include a policy tuning logic 459 in accordance with an embodiment of the present invention to dynamically control power management settings obtained from a tuning table 457 based on one or more EPB values. In various embodiments, this logic may execute the algorithm described above in
As further seen in
Embodiments may be implemented in many different system types. Referring now to
Still referring to
Furthermore, chipset 590 includes an interface 592 to couple chipset 590 with a high performance graphics engine 538, by a P-P interconnect 539. In turn, chipset 590 may be coupled to a first bus 516 via an interface 596. As shown in
Referring now to
Embodiments may be implemented in code and may be stored on a non-transitory storage medium having stored thereon instructions which can be used to program a system to perform the instructions. The storage medium may include, but is not limited to, any type of disk including floppy disks, optical disks, solid state drives (SSDs), compact disk read-only memories (CD-ROMs), compact disk rewritables (CD-RWs), and magneto-optical disks, semiconductor devices such as read-only memories (ROMs), random access memories (RAMs) such as dynamic random access memories (DRAMs), static random access memories (SRAMs), erasable programmable read-only memories (EPROMs), flash memories, electrically erasable programmable read-only memories (EEPROMs), magnetic or optical cards, or any other type of media suitable for storing electronic instructions.
While the present invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of this present invention.
This application is a continuation of U.S. patent application Ser. No. 14/855,553, filed Sep. 16, 2015, now U.S. Pat. No. 9,535,487, issued Jan. 3, 2017, which is a continuation of U.S. patent application Ser. No. 13/782,473, filed Mar. 1, 2013, now U.S. Pat. No. 9,170,624, issued Oct. 27, 2015, which is a continuation of U.S. patent application Ser. No. 13/326,586, filed Dec. 15, 2011, now U.S. Pat. No. 9,098,261, issued Aug. 4, 2015, the content of which is hereby incorporated by reference.
Number | Date | Country | |
---|---|---|---|
20170083076 A1 | Mar 2017 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 14855553 | Sep 2015 | US |
Child | 15367330 | US | |
Parent | 13782473 | Mar 2013 | US |
Child | 14855553 | US | |
Parent | 13326586 | Dec 2011 | US |
Child | 13782473 | US |