Embodiments of the invention relate to power and performance management in a system with multiple processor cores.
Low-power computing has been growing in importance given the demands of current and emerging mobile devices. To deliver high performance, mobile devices such as smart phones, tablets, and other handheld devices often integrate increasingly advanced technologies at the expense of high power consumption. One approach for enhancing runtime performance is Dynamic Voltage and Frequency Scaling (DVFS), which is a technique that automatically adjusts the operating frequency and voltage of a processor at runtime to boost performance. When the operating frequency and voltage increase, power consumption also increases, as the power consumption of an integrated circuit is proportional to C×V²×F, where C is the transistor capacitance, V is the supply voltage and F is the frequency.
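As a rough illustration of the proportionality stated above, the following sketch compares relative dynamic power at a few operating points. The function name and all numeric values are illustrative assumptions and are not taken from any embodiment.

```python
def relative_dynamic_power(c: float, v: float, f: float) -> float:
    """Return a quantity proportional to dynamic power, C * V^2 * F."""
    return c * v ** 2 * f


# Hypothetical operating points (arbitrary capacitance, volts, hertz).
base = relative_dynamic_power(c=1.0, v=1.0, f=2.0e9)
same_voltage = relative_dynamic_power(c=1.0, v=1.0, f=2.2e9)
higher_voltage = relative_dynamic_power(c=1.0, v=1.1, f=2.2e9)

# Raising only the frequency scales power linearly with F.
print(same_voltage / base)
# Raising the voltage as well adds a further V^2 factor.
print(higher_voltage / base)
```

Under this model, a frequency increase at an unchanged supply voltage costs proportionally less power than the same increase accompanied by a voltage increase.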
Some modern computer systems have a built-in power management framework to manage the tradeoff between performance and power usage. For example, a computer system may include a power management unit that determines at runtime whether to increase or decrease operating frequency and voltage in order to satisfy system performance requirements or to save power. The power management unit sets the operating frequency and voltage within predetermined upper and lower limits that are typically determined by system developers or manufacturers based on experimental data.
Advanced computing systems have commonly adopted the multi-processor architecture to provide high performance. A multi-processor system includes multiple processors (also referred to as central processing units (CPUs), processor cores, or cores) organized as one or more clusters. The demand for high performance and low power in a multi-processor system heightens the need for enhanced power management.
In one embodiment, a method of a computing system is provided. The method comprises detecting a condition in which a total number of active processor cores within one or more clusters is less than a predetermined number, and an operating frequency of the active processor cores has risen to a specified highest frequency; and obtaining an ambient temperature measurement of the one or more clusters. The method further comprises, upon detecting the condition, increasing the operating frequency above the specified highest frequency based on the ambient temperature measurement while maintaining a same level of supply voltage to the active processor cores.
In another embodiment, a computing system is provided. The computing system comprises one or more clusters including a plurality of processor cores; a temperature sensor to obtain an ambient temperature measurement of the one or more clusters; and a management module coupled to the one or more clusters to detect a condition in which a total number of active processor cores within the one or more clusters is less than a predetermined number, and an operating frequency of the active processor cores has risen to a specified highest frequency. The computing system further comprises a frequency controller coupled to the one or more clusters, the temperature sensor and the management module, which, upon detection of the condition, is operative to increase the operating frequency above the specified highest frequency based on the ambient temperature measurement while maintaining a same level of supply voltage to the active processor cores.
In yet another embodiment, a method of a computing system is provided. The method comprises detecting a condition in which a total number of active processor cores within one or more clusters is less than a predetermined number, and an operating frequency of the active processor cores has risen to a specific frequency, which is any of one or more specified frequencies of a plurality of specified frequencies, wherein the plurality of specified frequencies are configured for all processor cores in the one or more clusters being active; and obtaining an ambient temperature measurement of the one or more clusters. The method further comprises, upon detecting the condition, increasing the operating frequency above the specified frequency based on the ambient temperature measurement while maintaining a same level of supply voltage to the active processor cores.
In yet another embodiment, a computing system is provided. The computing system comprises one or more clusters including a plurality of processor cores; a temperature sensor to obtain an ambient temperature measurement of the one or more clusters; and a management module coupled to the one or more clusters to detect a condition in which a total number of active processor cores within the one or more clusters is less than a predetermined number, and an operating frequency of the active processor cores has risen to a specific frequency, which is any of one or more specified frequencies of a plurality of specified frequencies, wherein the plurality of specified frequencies are configured for all processor cores in the one or more clusters being active. The computing system further comprises a frequency controller coupled to the one or more clusters, the temperature sensor and the management module, which, upon detection of the condition, is operative to increase the operating frequency above the specified frequency based on the ambient temperature measurement while maintaining a same level of supply voltage to the active processor cores.
In the embodiments described herein, the operating frequency of the processor cores is not constrained by the specified highest frequency and/or one or more specified frequencies, since these frequencies are generally designed to ensure safe operation when all of the processor cores are active. Moreover, the one or more specified frequencies can be surpassed without requiring the level of the supply voltage to the processor cores to be increased. In addition, the operating frequency of the processor cores can be adjusted dynamically based on real-time detection of ambient temperature, thus preventing overheating and other thermal issues. Furthermore, because the IR drop is greater for a lesser number of active cores, the operating frequency for fewer active processor cores can be adjusted to surpass the specified frequency to a greater extent. Consequently, the processor cores can operate to a full or increased extent of their capabilities and the power resource can be more efficiently utilized while the same level of the supply voltage is maintained.
In yet another embodiment, a method for operating a computing system is provided. The method comprises: detecting a condition in which a total number of active processor cores within one or more clusters is less than a predetermined number, and an operating frequency of the active processor cores has risen to a specific frequency, which is any of one or more specified frequencies of a plurality of specified frequencies, wherein the plurality of specified frequencies are configured for all processor cores in the one or more clusters being active; and obtaining an ambient temperature measurement of the one or more clusters. The method further comprises: when the condition is detected, decreasing a level of supply voltage to the active processor cores based on the ambient temperature measurement while maintaining an operating frequency at the specified frequency.
In yet another embodiment, a computing system is provided. The computing system comprises one or more clusters including a plurality of processor cores; a temperature sensor to obtain an ambient temperature measurement of the one or more clusters; a management module configured to detect a condition in which a total number of active processor cores within the one or more clusters is less than a predetermined number, and an operating frequency of the active processor cores has risen to a specific frequency, which is any of one or more specified frequencies of a plurality of specified frequencies, wherein the plurality of specified frequencies are configured for all processor cores in the one or more clusters being active; and a frequency controller configured to, when the condition is detected, decrease a level of supply voltage to the active processor cores based on the ambient temperature measurement while maintaining an operating frequency at the specified frequency.
The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like references indicate similar elements. It should be noted that different references to “an” or “one” embodiment in this disclosure are not necessarily to the same embodiment, and such references mean at least one. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
In the following description, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In other instances, well-known circuits, structures and techniques have not been shown in detail in order not to obscure the understanding of this description. It will be appreciated, however, by one skilled in the art, that the invention may be practiced without such specific details. Those of ordinary skill in the art, with the included descriptions, will be able to implement appropriate functionality without undue experimentation.
It should be noted that the term “multi-processor computing system” as used herein is equivalent to a “multi-core processor system,” which may be a multi-core system or a multi-processor system, depending upon the actual design. In other words, the proposed method may be employed by any of the multi-core system and the multi-processor system. For example, concerning the multi-core system, all of the processor cores may be disposed in one processor. For another example, concerning the multi-processor system, each of the processor cores may be disposed in one processor. Hence, each of the clusters may be implemented as a group of one or more processors.
Embodiments of the invention provide a system and method for managing power and performance in a multi-processor computing system. The system includes one or more clusters, and each cluster includes one or more processor cores. The processor cores may be CPUs or other types of processors or cores. The operating frequency of the one or more processor cores can be dynamically adjusted during operation in response to varying performance requirements and thermal constraints. Generally, the performance requirement can be quantified by the response time and throughput under a given system workload. In a scenario where system workload can be parallelized and distributed to multiple processor cores, an increase in the workload may cause more processor cores to be turned on to fulfill a performance requirement. However, in a scenario where the system workload cannot be easily parallelized, turning on more processor cores may not help improve performance. Increasing operating frequency, on the other hand, accelerates the processor speed and therefore increases performance with respect to response time and/or throughput.
In one embodiment, the operating frequency can be dynamically increased to above the highest frequency (i.e., the upper limit) specified in a specification. This specified highest frequency may be stored in an Operating Performance Point (OPP) table or other forms of data structure or storage or circuit implementation in the system. Surpassing the specified highest frequency is allowed when the number of active processor cores in one or more clusters of the system is less than a predetermined number, which, in turn, is less than the total number of processor cores in the one or more clusters. As the specified highest frequency is defined for the scenario in which all processor cores are active, the presence of inactive processor cores allows those active processor cores to consume more than their normal shares of power by operating in a frequency higher than the specified highest frequency. In one embodiment, the amount of frequency increase is based on the ambient temperature of the one or more clusters. The allowed amount of frequency increase is smaller under a higher ambient temperature, and is larger under a lower ambient temperature. In another embodiment, the amount of frequency increase is based on the number of active processor cores as well as the ambient temperature of the one or more clusters. The allowed amount of frequency increase is smaller under a higher number of active cores, and is larger under a lower number of active cores.
A target frequency to which the operating frequency is to be increased may be determined according to another or the same OPP table or other forms of data structure or storage in the system that can predetermine or record a plurality of operating frequencies corresponding to different temperatures and/or numbers of active cores. Alternatively or additionally, the increase of the operating frequency may be achieved by directly implementing a corresponding formula that calculates the target frequency in hardware and/or software forms. It is also noted that in one aspect, the operating frequency of active cores (when the number of the active cores is less than a given threshold) may be increased to surpass the specified (highest) frequency by setting different groups of specified (highest) frequencies for different numbers or number ranges of active cores, and/or for different temperatures or temperature ranges.
Additionally or alternatively, the operating frequency originally set to be any of one or more specified frequencies below the specified highest frequency during operation of the multi-processor computing system can be dynamically increased to above the one or more specified frequencies. Similarly, as these one or more specified frequencies are also defined for the scenario in which all processor cores are active, the presence of inactive processor cores during runtime allows those active processor cores to operate at a frequency higher than the specified frequencies. In other words, the one or more specified frequencies can be surpassed due to an IR drop for the clusters when the number of active processor cores is decreased. In one embodiment, the amount of frequency increase for each of the one or more specified frequencies is based on the ambient temperature of the one or more clusters. The allowed amount of frequency increase is smaller under a higher ambient temperature, and is larger under a lower ambient temperature. In another embodiment, the amount of frequency increase for each of the one or more specified frequencies is based on the number of active processor cores as well as the ambient temperature of the one or more clusters. The allowed amount of frequency increase is smaller under a higher number of active cores, and is larger under a lower number of active cores.
In the embodiments described herein, the supply voltage to the active processor cores is maintained at the same level before and after the adjustment to the operating frequency, i.e., the same voltage level regardless of the number of active cores. This voltage level, which may be referred to as a safety voltage, is designed for the scenarios where all processor cores in the system are active. The presence of one or more inactive processor cores in the system causes an IR drop, which is a form of unused power. Maintaining the supply voltage, rather than increasing or decreasing it with the operating frequency, not only stabilizes the circuit operation but also allows the unused power to be used by the active processor cores (which are fewer than all of the cores) by operating at higher frequencies. Thus, in an embodiment, the operating frequency can be increased such that the increased power consumption due to the increased operating frequency comes from the IR drop caused by having at least one inactive processor core in the one or more clusters. Further details of embodiments of the invention will be provided below.
In one embodiment, the presence of inactive processor cores allows the specified highest frequency to be maintained while the supply voltage to the active processor cores is decreased. As the supply voltage is defined for the scenario in which all processor cores are active, the presence of inactive processor cores allows those active processor cores to operate at the specified highest frequency but with a lower level of the supply voltage. In other words, the level of the supply voltage can be lowered due to an IR drop for the clusters when the number of active processor cores is decreased. In one embodiment, the amount of voltage decrease is based on the ambient temperature of the one or more clusters. The allowed amount of voltage decrease is smaller under a higher ambient temperature, and is larger under a lower ambient temperature. In another embodiment, the amount of voltage decrease is based on the number of active processor cores as well as the ambient temperature of the one or more clusters. The allowed amount of voltage decrease is smaller under a higher number of active cores, and is larger under a lower number of active cores.
According to the dynamic frequency scaling to be described herein, the management module 130 is operative to detect a predetermined condition which defines when to increase the operating frequency to surpass the specified highest frequency. The predetermined condition is preferably, but is not limited to, a condition in which the total number of active processor cores within the one or more clusters 105 is less than a predetermined number, and an operating frequency of the active processor cores has risen to a specified highest frequency. Upon detection of the condition, the frequency controller 110 is operative to increase the operating frequency above the specified highest frequency based on the ambient temperature measurement while the voltage supply 120 maintains the same level of supply voltage to the active processor cores.
In one embodiment, the management module 130 has access to a number of specified frequency values for operating the clusters. The terms “specified frequency,” “specified highest frequency,” “specified frequency values,” and the like, refer to frequency values predetermined for operating the clusters 105 in a scenario where all processor cores in the one or more clusters 105 are active. The specified highest frequency is the upper limit for the operating frequency when all processor cores in the one or more clusters 105 are active. The specified frequency values, including the specified highest frequency, may be stored in a table (e.g., an OPP table) or another form of structure accessible to the management module 130. When the performance demand increases, a higher frequency setting may be used as the operating frequency; conversely, when the performance demand decreases, a lower frequency may be used to reduce power consumption. According to the dynamic frequency scaling described herein, when there is a high performance demand, the management module 130 may increase the operating frequency above the specified highest frequency.
The processor cores in the cluster 105 may be of the same processor type, which means that the processor cores have substantially the same characteristics in terms of power consumption and performance. Alternatively, the processor cores in the cluster 105 may be of different processor types, which means that they have different characteristics in terms of power consumption and/or performance. In an embodiment where the system 100 has one or more clusters 105 and the processor cores in all of the clusters 105 have the same processor type, the system 100 is referred to as having a symmetric multiprocessing (SMP) structure. In an alternative embodiment where the processor cores within the same cluster 105 or across different clusters 105 have different processor types, the system 100 is referred to as having a heterogeneous multiprocessing (HMP) structure.
The following describes three exemplary implementations of the system 100 for dynamically scaling the operating frequency in the system according to some embodiments. In a first exemplary implementation, the system 100 has an SMP structure, and the operating frequencies of the clusters 105 are dependent on one another. In this implementation, frequency scaling is applied to all clusters 105 in the system 100 (one cluster if the system 100 has only one cluster) when the management module 130 detects a first condition. According to the dynamic frequency scaling described herein, the first condition occurs when both of the following are true: (a) the operating frequency of the active processor cores has risen to a specified highest frequency, and (b) the total number of active processor cores in the system 100 is less than a predetermined number. When the first condition is detected, the operating frequency of all clusters 105 in the system 100 can be increased above the specified highest frequency based on the ambient temperature measurement of the clusters 105. The predetermined number in (b) may be any number lower than the total number of processor cores in the system 100. In an alternative embodiment, the specified highest frequency in (a) may be replaced by a specified frequency lower than the specified highest frequency. Additionally, the dynamic frequency scaling may be applied to not only the specified highest frequency but also one or more specified frequencies lower than the specified highest frequency.
In a second exemplary implementation, the system 100 has an SMP structure and the operating frequencies of the clusters 105 are independent of one another. In this implementation, frequency scaling is applied to each cluster 105 independently when a second condition is detected. In a third exemplary implementation, the system 100 has an HMP structure in which operating frequencies of the clusters 105 are independent of one another. In this implementation, frequency scaling is also applied to each cluster 105 independently when the second condition is detected. That is, the second condition is used for both the second and the third exemplary implementations. The second and the third exemplary implementations may also be applied to the system 100 that has only one cluster. According to the dynamic frequency scaling described herein, the second condition occurs when both of the following are true: (a) the operating frequency of a given cluster in the system 100 has reached a specified highest frequency, and (b) the total number of active processor cores in the given cluster is less than a predetermined number. When the second condition is detected, the operating frequency of the given cluster can be increased above the specified highest frequency based on the ambient temperature measurement of the given cluster. The predetermined number in (b) may be any number lower than the total number of processor cores in the given cluster. In an alternative embodiment, the specified highest frequency in (a) may be replaced by a specified frequency lower than the specified highest frequency. Additionally, the dynamic frequency scaling may be applied to not only the specified highest frequency but also one or more specified frequencies lower than the specified highest frequency.
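The first and second conditions described above can be sketched as two simple predicates. This is a minimal illustration only; the Cluster record, its field names, and the numeric values are hypothetical and do not correspond to any particular implementation of the management module 130.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cluster:
    # Hypothetical per-cluster state used only for this sketch.
    total_cores: int
    active_cores: int
    operating_freq_mhz: int
    specified_highest_mhz: int


def second_condition(cluster: Cluster, predetermined_number: int) -> bool:
    """Per-cluster check: (a) the cluster has reached its specified highest
    frequency, and (b) it has fewer active cores than the predetermined number."""
    reached_highest = cluster.operating_freq_mhz >= cluster.specified_highest_mhz
    few_active = cluster.active_cores < predetermined_number
    return reached_highest and few_active


def first_condition(clusters: List[Cluster], predetermined_number: int) -> bool:
    """System-wide check: (a) the active cores have reached the specified
    highest frequency, and (b) the total number of active cores in the
    system is below the predetermined number."""
    active = [c for c in clusters if c.active_cores > 0]
    reached_highest = bool(active) and all(
        c.operating_freq_mhz >= c.specified_highest_mhz for c in active
    )
    total_active = sum(c.active_cores for c in clusters)
    return reached_highest and total_active < predetermined_number


cluster_a = Cluster(total_cores=4, active_cores=1,
                    operating_freq_mhz=2000, specified_highest_mhz=2000)
cluster_b = Cluster(total_cores=4, active_cores=4,
                    operating_freq_mhz=2000, specified_highest_mhz=2000)

print(second_condition(cluster_a, predetermined_number=3))
print(second_condition(cluster_b, predetermined_number=3))
print(first_condition([cluster_a, cluster_b], predetermined_number=6))
```

The only difference between the two checks in this sketch is the scope over which active cores are counted: the whole system for the first condition, a single cluster for the second.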
In one embodiment, when the first condition or the second condition is detected, the management module 130 determines a target frequency for increasing the operating frequency using a look-up table. The look-up table can be in any form of data/storage structure and/or implemented in hardware/software forms. Alternatively or additionally, the management module 130 can automatically increase the operating frequency by using a corresponding formula for calculating the target frequency that is implemented in associated hardware and/or software.
When a frequency is identified in the table 310, the operating frequency of the one or more clusters in the system 100 may be increased to the identified frequency in one step, or alternatively, in multiple steps. The frequency values for each of the multiple steps may be stored in the table 310 or in a separate data structure accessible to the management module 130. Alternatively, the frequency values for each of the multiple steps may be calculated by the management module 130 as a fixed incremental amount or a fixed percentage increase.
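A minimal sketch of a temperature-indexed look-up followed by a multi-step increase is given below. It assumes the table maps ambient-temperature ranges to target frequencies, with lower frequencies for higher temperatures; the threshold values, frequency values, and function names are hypothetical.

```python
from bisect import bisect_left

# Hypothetical temperature points T1 < T2 < T3 (degrees C) and
# frequencies F1 > F2 > F3 > F4 (MHz), one more frequency than thresholds.
TEMP_POINTS_C = [45, 60, 75]
TARGET_FREQS_MHZ = [2300, 2200, 2100, 2000]


def target_frequency(temp_c):
    """Pick the frequency for the range containing temp_c.

    T <= T1 selects F1, T1 < T <= T2 selects F2, and T > Tn selects F(n+1).
    """
    return TARGET_FREQS_MHZ[bisect_left(TEMP_POINTS_C, temp_c)]


def ramp_steps(current_mhz, target_mhz, step_mhz):
    """List the frequencies visited when rising from current to target
    in fixed increments, ending exactly at the target."""
    steps = []
    freq = current_mhz
    while freq + step_mhz < target_mhz:
        freq += step_mhz
        steps.append(freq)
    steps.append(target_mhz)
    return steps


target = target_frequency(40)
print(target)
print(ramp_steps(2000, target, 100))
```

Using a binary search over sorted thresholds is a design choice of this sketch; a serial chain of comparisons yields the same result.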
In some embodiments, instead of having the same number of columns in each row and the same number of rows in each column as in the table 410, the frequency values may be stored in a data structure that includes multiple tables, with each table for a temperature range. For example, the data structure may include a first table for temperature T<T1 and a second table for T1<T<T2, and each table defines the frequency values for different numbers of active processor cores. In some alternative embodiments, each of the multiple tables may be for a different number of active processor cores. For example, the data structure includes a first table for NP<2 and a second table for 2<NP≦4. Different tables may include the same or different numbers of entries.
When a frequency is identified in the table 410, the operating frequency of the one or more clusters in the system 100 may be increased to the identified frequency in one step, or alternatively, in multiple steps. The frequency values for each of the multiple steps may be stored in the table 410 or in a separate data structure accessible to the management module 130. Alternatively, the frequency values for each of the multiple steps may be calculated by the management module 130 as a fixed incremental amount or a fixed percentage increase.
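The multi-table arrangement described above, with one table per temperature range and each table keyed by the number of active processor cores, might be sketched as follows. All thresholds, frequencies, and names are hypothetical; inclusive upper bounds on each temperature range and a fallback to the specified highest frequency for missing entries are assumptions of this sketch.

```python
from bisect import bisect_left

TEMP_POINTS_C = [45, 60]          # hypothetical T1 and T2 (degrees C)
SPECIFIED_HIGHEST_MHZ = 2000      # hypothetical all-cores-active limit

# One table per temperature range; each maps an active-core count to MHz.
# Tables need not have the same number of entries.
TABLES = [
    {1: 2400, 2: 2300, 3: 2200},  # T <= T1
    {1: 2300, 2: 2200, 3: 2100},  # T1 < T <= T2
    {1: 2100, 2: 2050},           # T > T2
]


def lookup_frequency(temp_c, active_cores):
    """Select the table for the temperature range, then the entry for the
    active-core count; fall back to the specified highest frequency."""
    table = TABLES[bisect_left(TEMP_POINTS_C, temp_c)]
    return table.get(active_cores, SPECIFIED_HIGHEST_MHZ)


print(lookup_frequency(40, 1))
print(lookup_frequency(55, 3))
print(lookup_frequency(70, 3))
```

The example values follow the trend stated in the description: the allowed frequency is lower at higher temperatures and at higher active-core counts.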
In yet another embodiment, the management module 130 may calculate the amount of frequency increase by multiplying the current operating frequency by a percentage. For example, the operating frequency may have an 8% increase when T≦T1 and a 5% increase when T1<T≦T2. Alternatively, the amount of the percentage increase may be determined based on the number of active processor cores in addition to the ambient temperature measurement.
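The percentage-based calculation in the example above can be sketched as follows. The 8% and 5% figures come from the example; the function name, the kHz unit, the threshold values in the usage lines, and the choice of no increase above T2 are assumptions of this sketch.

```python
def boosted_frequency_khz(current_khz, temp_c, t1_c, t2_c):
    """Return the operating frequency after a temperature-dependent
    percentage increase, using integer arithmetic."""
    if temp_c <= t1_c:
        percent = 8
    elif temp_c <= t2_c:
        percent = 5
    else:
        percent = 0  # assumption: no increase above T2
    return current_khz * (100 + percent) // 100


# Hypothetical thresholds T1 = 45 C and T2 = 60 C, current frequency 2 GHz.
print(boosted_frequency_khz(2_000_000, 40, 45, 60))  # 2160000
print(boosted_frequency_khz(2_000_000, 50, 45, 60))  # 2100000
print(boosted_frequency_khz(2_000_000, 70, 45, 60))  # 2000000
```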
After a target frequency or an amount of frequency increase is determined and the operating frequency is increased accordingly, the management module 130 may repeatedly re-evaluate the condition of the system 100 during runtime to determine whether the first or second condition has changed, whether the temperature has changed, whether the number of active cores has changed, or the like. If the re-evaluation result indicates that the operating frequency should be re-adjusted, the management module 130 determines the operating frequency for the active processor cores by accessing stored data or by calculating the amount of frequency adjustment. In one embodiment, the operating frequency may be increased when the temperature is low to meet performance demands, and may be decreased when the temperature is high to reduce power consumption.
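The repeated re-evaluation described above might look like the following loop over successive measurements. This is only a sketch under stated assumptions: the sample format, the fallback to the specified highest frequency when too many cores are active, and the example policy function are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def reevaluate(
    samples: Iterable[Tuple[float, int]],
    predetermined_number: int,
    specified_highest_mhz: int,
    target_for: Callable[[float, int], int],
) -> List[int]:
    """Return the frequency chosen after each (temperature, active cores)
    sample, assuming performance demand stays high throughout."""
    chosen = []
    for temp_c, active_cores in samples:
        if active_cores < predetermined_number:
            # Condition still holds: use the temperature-dependent target.
            freq = max(specified_highest_mhz, target_for(temp_c, active_cores))
        else:
            # Assumption: revert to the specified highest frequency.
            freq = specified_highest_mhz
        chosen.append(freq)
    return chosen


def example_target(temp_c: float, active_cores: int) -> int:
    # Hypothetical policy: larger increase when cool, none when hot.
    if temp_c <= 45:
        return 2200
    if temp_c <= 60:
        return 2100
    return 2000


samples = [(40, 1), (55, 1), (70, 1), (40, 4)]
print(reevaluate(samples, 3, 2000, example_target))  # [2200, 2100, 2000, 2000]
```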
Proceeding to steps 520-540, the management module 130 compares the measured ambient temperature (T) with one or more temperature points to identify a temperature range to which T belongs. It is assumed that the temperatures T1<T2< . . . <Tn, and the frequencies F1>F2> . . . >Fn>F(n+1). The higher the temperature is, the lower the operating frequency. When a temperature range is found to include T, the operating frequency is adjusted to the corresponding frequency value. For example, if T≦T1, the operating frequency is adjusted to F1 at step 522; if T1<T≦T2, the operating frequency is adjusted to F2 at step 532; if T(n−1)<T≦Tn, the operating frequency is adjusted to Fn at step 542. If T is found to exceed Tn, the operating frequency is adjusted to F(n+1) at step 550. In one embodiment, the comparisons may be performed serially as shown in steps 520-540 of
The method 500 of
The methods 600 and 700 may be performed by hardware (e.g., circuitry, dedicated logic, programmable logic, microcode, etc.), software (e.g., instructions run on a processing device), or a combination thereof. In one embodiment, the methods 600 and 700 may be performed by the computing system 100 of
The operations of the flow diagrams of
Various functional components or blocks have been described herein. As will be appreciated by persons skilled in the art, the functional blocks will preferably be implemented through circuits (either dedicated circuits, or general purpose circuits, which operate under the control of one or more processors and coded instructions), which will typically comprise transistors that are configured in such a way as to control the operation of the circuitry in accordance with the functions and operations described herein. The specific structure or interconnections of the transistors may be determined by a compiler, such as a register transfer language (RTL) compiler. RTL compilers operate upon scripts that closely resemble assembly language code, to compile the script into a form that is used for the layout or fabrication of the ultimate circuitry. RTL is well known for its role and use in the facilitation of the design process of electronic and digital systems.
In the embodiments described herein, the operating frequency of the processor cores is not constrained by the specified highest frequency and/or one or more specified frequencies, since these frequencies are generally designed to ensure safe operation when all of the processor cores are active. Moreover, the one or more specified frequencies can be surpassed without requiring the level of the supply voltage to the processor cores to be increased. In addition, the operating frequency of the processor cores can be adjusted dynamically based on real-time detection of ambient temperature, thus preventing overheating and other thermal issues. Furthermore, because the IR drop is greater for a lesser number of active cores, the operating frequency for fewer active processor cores can be adjusted to surpass the specified frequency to a greater extent. Consequently, the processor cores can operate to a full or increased extent of their capabilities and the power resource can be more efficiently utilized while the same level of the supply voltage is maintained.
While the invention has been described in terms of several embodiments, those skilled in the art will recognize that the invention is not limited to the embodiments described, and can be practiced with modification and alteration within the spirit and scope of the appended claims. The description is thus to be regarded as illustrative instead of limiting.
This application claims the benefit of U.S. Provisional Application No. 62/051,327 filed on Sep. 17, 2014, the entirety of which is incorporated herein by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2015/089855 | 9/17/2015 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62051327 | Sep 2014 | US |