Advances in semiconductor processing and logic design have permitted an increase in the amount of logic that may be present on integrated circuit devices. As a result, computer system configurations have evolved from a single or multiple integrated circuits in a system to multiple hardware threads, multiple cores, multiple devices, and/or complete systems on individual integrated circuits. Additionally, as the density of integrated circuits has grown, the power requirements for computing systems (from embedded systems to servers) have also escalated. Furthermore, software inefficiencies, and their demands on hardware, have also caused an increase in computing device energy consumption. In fact, some studies indicate that computing devices consume a sizeable percentage of the entire electricity supply for a country, such as the United States of America. As a result, there is a vital need for energy efficiency and conservation associated with integrated circuits. These needs will increase as servers, desktop computers, notebooks, ultrabooks, tablets, mobile phones, processors, embedded systems, etc. become even more prevalent (from inclusion in the typical computer, automobiles, and televisions to biotechnology).
In many computing environments, systems such as servers operate well below their peak performance most of the time. During these periods of low utilization the focus is on saving as much power as possible in order to reduce energy costs. Power management technologies can deliver significant power savings during periods of low utilization. However, every power management technology involves a power/performance tradeoff, especially during periods of high activity. A user would ideally like to save as much power as possible at low utilization while realizing maximum performance at times of high utilization.
Users who cannot tolerate performance loss at high utilizations typically tune power management features for a performance policy. This implies that when the server is underutilized, it consumes more power than necessary; even at low server utilizations, where the end user can typically tolerate higher performance loss, available power savings are not realized. Users who instead wish to save power at low utilizations typically tune power management features for a power saver policy. This implies that when the server is highly utilized, the highest performance of the server may not be realized.
Embodiments provide a mechanism for detecting periods of low/medium utilization of a processor such as a multicore processor and, responsive to this detection, tuning power management features to save as much power as possible. At the same time, the mechanism can detect periods of high utilization of the processor and tune power management features to minimize performance loss. More specifically, a dynamic loadline tuning architecture is provided to enable this mechanism.
A power performance loadline is a representation of power consumption of a computer system such as a server at different utilizations. This loadline thus represents the power consumed at each delivered performance level and takes into account the performance impact of a power management feature.
In a conventional system, these different loadlines are statically realized by a static configuration of the system. As an example, an operating system (OS) can provide a static selection of one of these policies (e.g., performance, balanced, or power saver), which can be configured by an end user. Instead, embodiments provide a technique to dynamically and automatically switch between power biased settings and performance biased settings. In this way power can be saved at low utilizations while preserving performance at high utilizations. More specifically, embodiments provide a power performance loadline that follows power saving policy 20 at low utilization and dynamically moves to performance policy 10 at high utilization via a dynamic loadline switch 25.
In various embodiments, a dynamic loadline tuning algorithm can be used to detect a utilization point at which the maximum performance level achievable for the current power/performance tuning is reached and switch the policy towards a performance biased tuning. This maximum performance level can be configurable in various embodiments. As an example, depending on utilization, dynamic loadline tuning can save anywhere from approximately 8 watts (W) to 30 W at utilizations between 40% and 70%. It can also be seen that at utilizations nearing 100%, the performance level matches that of a performance policy. Embodiments thus maximize power savings at low utilization while maximizing performance at high utilization.
Although the following embodiments are described with reference to energy conservation and energy efficiency in specific integrated circuits, such as in computing platforms or processors, other embodiments are applicable to other types of integrated circuits and logic devices. Similar techniques and teachings of embodiments described herein may be applied to other types of circuits or semiconductor devices that may also benefit from better energy efficiency and energy conservation. For example, the disclosed embodiments are not limited to any particular type of computer system, and may be also used in other devices, such as handheld devices, systems on chip (SoCs), and embedded applications. Some examples of handheld devices include cellular phones, Internet protocol devices, digital cameras, personal digital assistants (PDAs), and handheld PCs. Embedded applications typically include a microcontroller, a digital signal processor (DSP), network computers (NetPC), set-top boxes, network hubs, wide area network (WAN) switches, or any other system that can perform the functions and operations taught below. Moreover, the apparatuses, methods, and systems described herein are not limited to physical computing devices, but may also relate to software optimizations for energy conservation and efficiency. As will become readily apparent in the description below, the embodiments of methods, apparatuses, and systems described herein (whether in reference to hardware, firmware, software, or a combination thereof) are vital to a ‘green technology’ future, such as for power conservation and energy efficiency in products that encompass a large portion of the US economy.
Referring now to
As further seen, method 100 continues by determining an amount of time that the cores of the processor spend in an active state during the evaluation interval (block 120). The amount of time spent in an active state is likely to be greater than the amount of time spent in a maximum performance state, since at least one of the cores is likely to be in an active state, but not a maximum performance state, for at least some portion of the evaluation interval.
Still referring to
Embodiments may leverage this ratio information to determine whether to dynamically switch between power management policies. Specifically as seen in
Still referring to
As demanded performance of a system increases, the operating system requests higher performance states (known as P-states). At a given performance state, if cores become idle, they enter into idle states (known as C-states). These states can be in accordance with an OS-based mechanism, namely the Advanced Configuration and Power Interface (ACPI) standard (e.g., Rev. 3.0b, published Oct. 10, 2006). According to ACPI, a processor can operate at various power and performance states. With regard to power states, ACPI specifies different power consumption states, generally referred to as so-called C1 to Cn states. When a core is active, it runs at a C0 state, and when the core is idle it may be placed in a core low power state, a so-called core non-zero C-state (e.g., C1-C6 states). In addition to these power states, a processor can further be configured to operate at one of multiple performance states, namely from P0 to PN. In general, the P1 performance state may correspond to the highest guaranteed performance state that can be requested by an OS. In addition to this P1 state, the OS can further request a higher performance state, namely a P0 state, which corresponds to a maximum performance state. This P0 state may thus be an opportunistic state in which, when power and/or thermal budget is available, processor hardware can configure the processor or at least portions thereof to operate at a higher than guaranteed frequency.
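Purely as an illustration of the state model described above (the type and function names here are hypothetical and are not part of the ACPI specification or of any particular implementation), a core's combined C-state/P-state might be tracked as follows:

/* Illustrative per-core state record for ACPI-style C-states and P-states. */
enum c_state { C0 = 0, C1 = 1, C2 = 2, C3 = 3, C6 = 6 };   /* C0 = active; C1-C6 = idle depths */
enum p_state { P0 = 0, P1 = 1, P2 = 2 /* ... through PN */ };  /* P0 = maximum (opportunistic) */

struct core_state {
    enum c_state cstate;   /* current idle/active state */
    enum p_state pstate;   /* currently granted performance state */
};

/* A core contributes to the maximum active time only when it is both
 * active (C0) and at the highest performance state (P0/Pmax). */
static int in_c0_pmax(const struct core_state *s)
{
    return s->cstate == C0 && s->pstate == P0;
}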
As the system utilization moves from medium to high levels, the average P-state increases and eventually reaches the maximum turbo state. At the same time the available idle periods reduce and the cores stay active for most of the time. In one embodiment, by detecting the point at which all cores are at the highest P-state for an entire evaluation interval, it can be concluded that the highest performance point for the current power/performance settings is reached.
For a given evaluation period T(Eval), the time spent by the ith core when it is in an active state (e.g., C0) at the highest P-state (Pmax) is represented as T(i,C0,Pmax), and can be referred to as a maximum active time.
The sum of this time across all cores (e.g., M cores) thus equals:
T(C0,Pmax)=ΣT(i,C0,Pmax).
The highest value of T(C0,Pmax) is equal to M*T(Eval). When T(C0,Pmax) reaches its highest value then the highest performance point for the current power/performance settings is reached. This is the principle used for a dynamic switching detector in accordance with an embodiment of the present invention.
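A minimal sketch of this computation (the function and variable names are hypothetical) is given below; it simply sums the per-core maximum active times and computes the M*T(Eval) bound against which that sum is compared:

/* T(C0,Pmax): sum of per-core maximum active times over one evaluation
 * interval; t_c0_pmax[i] holds T(i,C0,Pmax) for the ith core. */
static double total_c0_pmax(const double t_c0_pmax[], int num_cores)
{
    double sum = 0.0;
    for (int i = 0; i < num_cores; i++)
        sum += t_c0_pmax[i];
    return sum;
}

/* Upper bound M*T(Eval): all M cores active at Pmax for the entire
 * evaluation interval.  Reaching this bound indicates the highest
 * performance point for the current power/performance settings. */
static double max_c0_pmax(int num_cores, double t_eval)
{
    return (double)num_cores * t_eval;
}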
Two additional considerations may be accounted for with regard to the detector. First, the detector can be made immune to instantaneous changes in performance. To this end, an exponential moving average of the total maximum active time metric can be calculated. The average computed for the Nth period is given below, where α represents an averaging constant.
Average T(C0,Pmax,N)=α*Average T(C0,Pmax,N−1)+(1−α)*T(C0,Pmax).
A second consideration is the fact that the number of active cores changes from one workload to another and from one period to another. In order to ensure that the detector can accommodate such varying workloads, the following can be performed. In each period T(Eval), T(C0,i) represents the active time for the ith core. When summed across M cores this yields T(C0), and can be referred to as an active time. Now the average of T(C0) for the Nth period can be computed using the same method as the average for T(C0,Pmax):
Average T(C0,N)=α*Average T(C0,N−1)+(1−α)*T(C0).
Average T(C0,N) thus represents the average active time of all cores, and Average T(C0,Pmax,N) represents the average time for all cores when they are active and at the highest P-state.
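Both averages use the same exponential moving average update. A minimal sketch of that update (names hypothetical) is:

/* Exponential moving average: avg_N = alpha*avg_(N-1) + (1-alpha)*sample_N.
 * alpha is the averaging constant, 0 <= alpha < 1; larger values smooth
 * more heavily and respond more slowly to instantaneous changes. */
static double ema_update(double prev_avg, double sample, double alpha)
{
    return alpha * prev_avg + (1.0 - alpha) * sample;
}

/* Applied once per evaluation period to both metrics:
 *   avg_t_c0_pmax = ema_update(avg_t_c0_pmax, t_c0_pmax, alpha);
 *   avg_t_c0      = ema_update(avg_t_c0,      t_c0,      alpha);
 */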
At the highest performance point, both averages will become very close to each other. Thus the detector may determine an optimal switch point between performance and non-performance policies in part in accordance with the following:
Detector Ratio=Average T(C0,Pmax,N)/Average T(C0,N)
If the detector ratio is very close to 1 (e.g., between approximately 80% and 100%), then a switch to a performance policy can be performed. If instead the detector ratio is well below 1, then a switch to a non-performance policy can be made (either to a power saver or balanced policy, based on user configuration).
To ensure that the algorithm is stable, high and low thresholds, Threshold_High and Threshold_Low, can be defined. The detector thus determines the operating point, which can be used to dynamically set the appropriate policy. If the detector ratio is greater than Threshold_High, the policy is set to a performance policy. If the detector ratio is less than Threshold_Low, the policy is set to a non-performance policy, e.g., one of a power saver or balanced policy. Alternately, an OS or other software can also be given the flexibility of choosing the Threshold_High and Threshold_Low values.
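The threshold comparison with hysteresis can be expressed compactly. The following is a minimal sketch under the definitions above (the policy names, function name, and guard against a zero denominator are assumptions made for illustration):

enum policy { POLICY_PERFORMANCE, POLICY_NON_PERFORMANCE };

/* Returns the policy to apply for the next interval.  Between the two
 * thresholds the current policy is retained, which provides the hysteresis
 * that keeps the algorithm stable. */
static enum policy select_policy(double avg_t_c0_pmax, double avg_t_c0,
                                 double threshold_high, double threshold_low,
                                 enum policy current)
{
    if (avg_t_c0 <= 0.0)
        return current;                        /* no activity observed */

    double ratio = avg_t_c0_pmax / avg_t_c0;   /* detector ratio */

    if (ratio > threshold_high)
        return POLICY_PERFORMANCE;             /* near peak: bias for performance */
    if (ratio < threshold_low)
        return POLICY_NON_PERFORMANCE;         /* power saver or balanced, per user */
    return current;
}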
Sampler 230 provides a time duration in which the processor was in a maximum performance state during the evaluation interval to a moving average maximum performance processor 240. In addition, sampler 230 provides an active state value corresponding to a duration of time during the evaluation interval that the cores of the processor were in an active state to a moving average active state processor 250. As seen, these processors may further receive an alpha value, details of which will be discussed further below.
Based on this information, these processors can generate moving averages, which can be an average of the sampler outputs over a number of evaluation intervals. For example, although the scope of the present invention is not limited in this regard, in some embodiments the moving average can be formed over evaluation intervals of between approximately 5 milliseconds and 100 milliseconds. These average duration values can then be provided to a comparator and threshold detector 260. First, a ratio can be determined that relates the amount of time in a maximum performance state to the total time in an active state. This ratio can then be compared to one or more thresholds. Based on this comparison, a power management policy can be selected that can be the same as for the previous evaluation interval, or can be a dynamic switch, e.g., from a non-performance policy to a performance policy or vice-versa. Further details of the actual calculations performed will be described further below.
In various embodiments, maximum active state accumulator 210 generates the sum of total time spent in the (C0, Pmax) state across all cores. This accumulator can be implemented in an event handler that is triggered every time there is a change in the C-state and/or P-state of any core. The event handler maintains two variables: an entry time stamp per core for the (C0, Pmax) state, and a state mask that represents whether a particular core was in the (C0, Pmax) state the last time the event handler was called. When the event handler is called, it loops across all cores and computes a current state mask. It compares the current state mask with the last state mask and detects whether a particular core exited the (C0, Pmax) state, entered the (C0, Pmax) state, or remained in the same state as before. When an exit from the (C0, Pmax) state is detected, the entry time stamp is subtracted from the current time stamp and the resulting value is added into the T(C0, Pmax) accumulator. If an entry into (C0, Pmax) is detected, then the current time stamp is stored as the entry time stamp for that core. When these actions are completed across all cores, the accumulator contains the latest value of T(C0, Pmax).
In various embodiments, active state accumulator 220 generates the sum of total time spent in the (C0) state across all cores. This accumulator can similarly be implemented in an event handler that is triggered every time there is a change in the C-state and/or P-state of a core. The event handler maintains two variables: an entry time stamp per core for the (C0) state, and a state mask that represents whether a particular core was in the (C0) state the last time the event handler was called. When the event handler is called, it loops across all cores and computes a current state mask. It compares the current state mask with the last state mask and detects whether a particular core exited the (C0) state, entered the (C0) state, or remained in the same state as before. When an exit from the (C0) state is detected, the entry time stamp is subtracted from the current time stamp and the resulting value is added into the T(C0) accumulator. If an entry into (C0) is detected, then the current time stamp is stored as the entry time stamp for that core. When these actions are completed across all cores, the accumulator contains the latest value of T(C0).
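Both accumulators follow the same entry/exit bookkeeping, so the following single sketch (with hypothetical names, a caller-supplied time stamp, and a caller-supplied predicate for "core i is currently in the tracked state") illustrates how such an event handler might be structured:

#include <stdint.h>

#define MAX_CORES 64

struct accum {
    uint64_t total;                  /* accumulated residency for the interval */
    uint64_t entry_ts[MAX_CORES];    /* entry time stamp per core */
    uint64_t last_mask;              /* state mask from the previous invocation */
};

/* Invoked on every C-state and/or P-state change of any core. */
static void accum_event_handler(struct accum *a, int num_cores, uint64_t now,
                                int (*core_in_state)(int core))
{
    uint64_t cur_mask = 0;

    for (int i = 0; i < num_cores; i++) {
        uint64_t bit = 1ULL << i;
        int in_state = core_in_state(i);

        if (in_state)
            cur_mask |= bit;

        if (!in_state && (a->last_mask & bit)) {
            /* Exit detected: add residency since entry to the accumulator. */
            a->total += now - a->entry_ts[i];
        } else if (in_state && !(a->last_mask & bit)) {
            /* Entry detected: record the entry time stamp for this core. */
            a->entry_ts[i] = now;
        }
        /* A core that remained in the same state needs no update here. */
    }
    a->last_mask = cur_mask;
}

One instance of struct accum would track (C0, Pmax) residency and a second would track (C0) residency, with the sampler reading and zeroing the total field once per evaluation period.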
In one embodiment, accumulator sampler 230 can sample the accumulators once every 1 ms. This sampling rate may be adjusted based on a desired response time. Once the accumulators are sampled, they can be reset to zero to allow accumulation for the next evaluation period. The sampled values can be stored into a given storage area such as a pair of registers to store T (C0, Pmax, N) and T (C0, N) respectively.
In turn, processors 240 and 250 perform an exponential moving average calculation by updating the average values using the latest sampled inputs of T(C0,Pmax, N) and T(C0,N). Processors 240 and 250 can generate the average immediately after the accumulator sampler generates the sampled values at the same rate (e.g., once every 1 ms), and thus can generate average T (C0, Pmax) and average T (C0).
In turn, a comparator 260 can be executed responsive to receipt of the average values. In comparator 260 a ratio of average T(C0,Pmax) and Average T(C0) can be computed. This ratio can then be compared to high and low thresholds. From this comparison, as described above a final determination of the operative policy can be generated.
Although the analysis with regard to determining durations in maximum performance and active states is described with reference to the circuit of
At diamond 330 it can be determined whether a change in a given core's state has occurred. If so, control passes to diamond 335 where it can be determined whether this state change is an exit from the maximum performance and/or active states. If so, control passes to block 350 where the length of the core's residency in the maximum performance state and/or the active state can be determined. As will be discussed further, this determination can be based on time stamp information in some embodiments. This determined value then may be accumulated in the corresponding accumulator at block 360. Control then passes to diamond 365 where it can be determined whether additional cores are to be analyzed. If so, control passes to block 370 where the core number can be incremented and control passes back to diamond 330 discussed above. Note further that if no change in a given core's state is determined at diamond 330, the same incrementing of the core number can be done at block 333, with control also passing back to diamond 330.
Still referring to
Once the state of all of the cores has been determined and the various values updated accordingly, control passes to block 380 where the accumulators can be sampled to obtain the maximum performance residency value and the active state residency value. Control then passes to block 390 where, from these values, averages, e.g., moving averages, of the maximum performance residency and the active state residency can be generated. In one embodiment, these values can then be used to determine a ratio between them, from which a given power management policy selection can be made. Although shown with this specific implementation in the embodiment of
Referring now to Table 1, shown is pseudocode for an accumulator algorithm in accordance with one embodiment of the present invention.
In one embodiment, the dynamic loadline tuning algorithm has three tunable parameters: alpha, threshold high, and threshold low. In some embodiments, these values can be tuned in a system with real workloads, which can run at different utilizations based on these values, to ensure that switching happens at maximum performance.
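As an illustration, these tunables might be grouped into a single configuration record (names hypothetical; the values shown are placeholders, not tuned recommendations):

/* Tunable parameters of the dynamic loadline tuning algorithm. */
struct dll_tuning {
    double alpha;            /* EMA averaging constant, 0 <= alpha < 1 */
    double threshold_high;   /* switch to performance policy above this ratio */
    double threshold_low;    /* switch to non-performance policy below this ratio */
};

/* Placeholder values only; in practice these would be tuned against real
 * workloads as described above. */
static const struct dll_tuning example_tuning = {
    .alpha          = 0.9,
    .threshold_high = 0.95,
    .threshold_low  = 0.80,
};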
Using embodiments of the present invention, an end user can realize reduced power consumption at low utilizations and, more specifically, the end user can choose a preferred tuning policy at low/medium utilization, namely a tuning that maximizes the power savings for a target utilization. At high utilization, the processor can dynamically and automatically switch to a performance policy, thus preventing any performance loss. Accordingly, the processor can dynamically detect utilization and switch the power/performance policy based on that utilization.
In this way, a user who previously used a performance policy can realize power savings for a typical usage (e.g., low/medium utilization) by choosing a power saver or balanced mode without being concerned about losing peak performance.
Embodiments can be implemented in processors for various markets including server processors, desktop processors, mobile processors and so forth. Referring now to
In various embodiments, power control unit 455 may include a dynamic policy switching logic 459, which may be a logic to perform dynamic switching of a power management policy based on processor utilization. As further seen, various registers or other storages can be present and accessed by the logic. Specifically, state mask storage 456 can store masks associated with the active state and the maximum performance state, including a current and previous state mask for each state, each having an indicator for each core to indicate whether the core is in the corresponding state. In addition, residency counters (not shown in
With further reference to
Referring now to
In general, each core 510 may further include low level caches in addition to various execution units and additional processing elements. In turn, the various cores may be coupled to each other and to a shared cache memory formed of a plurality of units of a last level cache (LLC) 540₀-540ₙ. In various embodiments, LLC 540 may be shared amongst the cores and the graphics engine, as well as various media processing circuitry. As seen, a ring interconnect 530 thus couples the cores together, and provides interconnection between the cores, graphics domain 520 and system agent circuitry 550. In one embodiment, interconnect 530 can be part of the core domain. However, in other embodiments the ring interconnect can be of its own domain.
As further seen, system agent domain 550 may include display controller 552 which may provide control of and an interface to an associated display. As further seen, system agent domain 550 may include a power control unit 555 which can include a dynamic policy switching logic 559 in accordance with an embodiment of the present invention to dynamically control an active power management policy for the system based on processor utilization, e.g., using information obtained from a policy management storage 557. In various embodiments, this logic may execute the algorithms described above in
As further seen in
Embodiments may be implemented in many different system types. Referring now to
Still referring to
Furthermore, chipset 690 includes an interface 692 to couple chipset 690 with a high performance graphics engine 638, by a P-P interconnect 639. In turn, chipset 690 may be coupled to a first bus 616 via an interface 696. As shown in
Referring now to
Embodiments may be implemented in code and may be stored on a non-transitory storage medium having stored thereon instructions which can be used to program a system to perform the instructions. The storage medium may include, but is not limited to, any type of disk including floppy disks, optical disks, solid state drives (SSDs), compact disk read-only memories (CD-ROMs), compact disk rewritables (CD-RWs), and magneto-optical disks, semiconductor devices such as read-only memories (ROMs), random access memories (RAMs) such as dynamic random access memories (DRAMs), static random access memories (SRAMs), erasable programmable read-only memories (EPROMs), flash memories, electrically erasable programmable read-only memories (EEPROMs), magnetic or optical cards, or any other type of media suitable for storing electronic instructions.
While the present invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of this present invention.
This application is a continuation of U.S. patent application Ser. No. 13/326,605, filed Dec. 15, 2011, now U.S. Pat. No. 9,372,524, the content of which is hereby incorporated by reference.