Portable computing devices (PCDs) are ubiquitous. These devices may include cellular telephones, portable digital assistants (PDAs), portable game consoles, palmtop computers, and other portable electronic devices. In addition to the primary function of these devices, many include peripheral functions. For example, a cellular telephone may include the primary function of making cellular telephone calls and the peripheral functions of a still camera, a video camera, global positioning system (GPS) navigation, web browsing, sending and receiving emails, sending and receiving text messages, push-to-talk capabilities, etc. As the functionality of such a device increases, the computing or processing power required to support such functionality also increases. Further, as the computing power increases, there exists a greater need to effectively manage the processor, or processors, that provide the computing power.
Accordingly, what is needed is an improved method of controlling power within a multicore CPU.
In the figures, like reference numerals refer to like parts throughout the various views unless otherwise indicated.
The word “exemplary” is used herein to mean “serving as an example, instance, or illustration.” Any aspect described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects.
In this description, the term “application” may also include files having executable content, such as: object code, scripts, byte code, markup language files, and patches. In addition, an “application” referred to herein may also include files that are not executable in nature, such as documents that may need to be opened or other data files that need to be accessed.
The term “content” may also include files having executable content, such as: object code, scripts, byte code, markup language files, and patches. In addition, “content” referred to herein may also include files that are not executable in nature, such as documents that may need to be opened or other data files that need to be accessed.
As used in this description, the terms “component,” “database,” “module,” “system,” and the like are intended to refer to a computer-related entity, either hardware, firmware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computing device and the computing device may be a component. One or more components may reside within a process and/or thread of execution, and a component may be localized on one computer and/or distributed between two or more computers. In addition, these components may execute from various computer readable media having various data structures stored thereon. The components may communicate by way of local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, distributed system, and/or across a network such as the Internet with other systems by way of the signal).
Referring initially to
In a particular aspect, as depicted in
Referring to
As illustrated in
As further illustrated in
As depicted in
In a particular aspect, one or more of the method steps described herein may be stored in the memory 344 as computer program instructions. These instructions may be executed by the multicore CPU 324 in order to perform the methods described herein. Further, the multicore CPU 324, the memory 344, or a combination thereof may serve as a means for executing one or more of the method steps described herein in order to dynamically control the power of each CPU, or core, within the multicore CPU 324.
Referring to
Moreover, as illustrated, the memory 404 may include an operating system 420 stored thereon. The operating system 420 may include a scheduler 422 and the scheduler 422 may include a first run queue 424, a second run queue 426, and an Nth run queue 428. The memory 404 may also include a first application 430, a second application 432, and an Nth application 434 stored thereon.
In a particular aspect, the applications 430, 432, 434 may send one or more tasks 436 to the operating system 420 to be processed at the cores 410, 412, 414 within the multicore CPU 402. The tasks 436 may be processed, or executed, as single tasks, threads, or a combination thereof. Further, the scheduler 422 may schedule the tasks, threads, or a combination thereof for execution within the multicore CPU 402. Additionally, the scheduler 422 may place the tasks, threads, or a combination thereof in the run queues 424, 426, 428. The cores 410, 412, 414 may retrieve the tasks, threads, or a combination thereof from the run queues 424, 426, 428 as instructed, e.g., by the operating system 420, for processing, or execution, of those tasks and threads at the cores 410, 412, 414.
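By way of illustration only, the relationship between applications, the scheduler, the run queues, and the cores may be sketched as follows. The class name `Scheduler`, the methods `submit` and `run_core`, and the round-robin placement policy are hypothetical; the description does not specify how the scheduler 422 selects a run queue.

```python
from collections import deque
from itertools import cycle


class Scheduler:
    """Toy scheduler holding one run queue per core (hypothetical model)."""

    def __init__(self, num_cores):
        self.run_queues = [deque() for _ in range(num_cores)]
        # Round-robin placement is an assumption; the text names no policy.
        self._next_queue = cycle(range(num_cores))

    def submit(self, task):
        """Place a task (any callable) in a run queue and return the queue index."""
        index = next(self._next_queue)
        self.run_queues[index].append(task)
        return index

    def run_core(self, core_index):
        """Let one core drain its own run queue and return the task results."""
        results = []
        queue = self.run_queues[core_index]
        while queue:
            task = queue.popleft()
            results.append(task())
        return results
```

In this sketch each core retrieves work only from its own run queue; an actual operating system may also migrate tasks between queues.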
Otherwise, the method 500 may proceed to block 506 and the controller may calculate the optimal frequency for the CPU. At block 508, the DCVS may guarantee a steady state CPU utilization. Further, at block 510, the DCVS may guarantee a steady state CPU utilization deadline. Thereafter, the method 500 may end.
Referring to
At block 604, a power controller, e.g., a dynamic clock and voltage scaling (DCVS) algorithm, may set a responsiveness to a least possible responsiveness value. At decision 606, the power controller may determine whether the responsiveness is less than the fastest possible responsiveness value. If not, the method 600 may end. Conversely, if the responsiveness is less than the fastest possible responsiveness, the method 600 may move to block 608. At block 608, the power controller may set a time variable equal to one. Thereafter, at decision 610, the power controller may determine whether the time is less than or equal to a CPU utilization deadline. If not, the method may move to block 612, and the power controller may increase the responsiveness. Then, the method 600 may return to decision 606 and the method 600 may proceed as described herein.
Returning to decision 610, if the time is less than or equal to the CPU utilization deadline, the method may proceed to block 614 and the power controller may determine a steady state CPU frequency (SteadyStateCPUFreq) based on a responsiveness value, a filter (IIR), and a CPU busy time (CPUBusy).
Then, at decision 616, the power controller may determine whether the SteadyStateCPUFreq is greater than or equal to a maximum CPU frequency (MaxCPUFreq). If the SteadyStateCPUFreq is not greater than or equal to the MaxCPUFreq, the method may move to block 618 and the power controller may increase the time variable by one integer (time=time+1). Thereafter, the method 600 may return to decision 610 and the method 600 may continue as described herein.
Returning to decision 616, if the SteadyStateCPUFreq is greater than or equal to the MaxCPUFreq, the method 600 may continue to block 620 and the power controller may set a steady state responsiveness variable (SteadyStateResp) equal to the responsiveness value. The method 600 may then end.
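By way of illustration only, the control flow of the method 600 may be sketched as the nested search below. The function `steady_state_cpu_freq` is a hypothetical callback standing in for the determination made at block 614, whose exact computation is not reproduced here, and the responsiveness step of one per iteration at block 612 is likewise an assumption.

```python
def find_steady_state_responsiveness(min_resp, max_resp, deadline,
                                     max_cpu_freq, steady_state_cpu_freq):
    """Return the first responsiveness that reaches max_cpu_freq in time.

    steady_state_cpu_freq(responsiveness, time) is a caller-supplied model
    (hypothetical) of the frequency the controller would request.
    Returns None if no tested responsiveness meets the deadline.
    """
    responsiveness = min_resp                        # block 604
    while responsiveness < max_resp:                 # decision 606
        time = 1                                     # block 608
        while time <= deadline:                      # decision 610
            freq = steady_state_cpu_freq(responsiveness, time)  # block 614
            if freq >= max_cpu_freq:                 # decision 616
                return responsiveness                # block 620
            time += 1                                # block 618
        responsiveness += 1                          # block 612 (assumed step)
    return None
```

Because the search starts from the least responsive setting, the value returned is the slowest responsiveness that still meets the deadline under the supplied model.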
Referring to
Moving to decision 710, the power controller may determine whether the Alpha is greater than zero. If not, the method 700 may end. Conversely, if the Alpha is greater than zero, the method 700 may move to block 712. At block 712, the power controller may set a time variable equal to one. Thereafter, at decision 714, the power controller may determine whether the time is less than or equal to a CPU utilization deadline. If not, the method may move to block 716, and the power controller may decrease the Alpha by one integer (Alpha=Alpha−1). Then, the method 700 may return to decision 710 and the method 700 may proceed as described herein.
Returning to decision 714, if the time is less than or equal to the CPU utilization deadline, the method may proceed to block 718 and the power controller may determine a steady state CPU frequency (SteadyStateCPUFreq) based on a variable (Alpha), a filter (IIR), and a CPU busy time (CPUBusy). Then, at decision 720, the power controller may determine whether the SteadyStateCPUFreq is greater than or equal to a maximum CPU frequency (MaxCPUFreq). If the SteadyStateCPUFreq is not greater than or equal to the MaxCPUFreq, the method may move to block 722 and the power controller may increase the time variable by one integer (time=time+1). Thereafter, the method 700 may return to decision 714 and the method 700 may continue as described herein.
Returning to decision 720, if the SteadyStateCPUFreq is greater than or equal to the MaxCPUFreq, the method 700 may continue to block 724 and the power controller may set a steady state alpha variable (SteadyStateAlpha) equal to Alpha. The method 700 may then end.
Moving to decision 810, the power controller may determine whether the Alpha is greater than zero. If not, the method 800 may proceed to block 826 and the controller may set a steady state alpha variable (SteadyStateAlpha) equal to a best alpha value. Also, the controller may set a steady state headroom variable to a best headroom value. Thereafter, the method 800 may end.
Returning to decision 810, if the Alpha is greater than zero, the method 800 may move to block 812. At block 812, the power controller may set a headroom percentage (HeadroomPCT) variable equal to one. Thereafter, at decision 814, the power controller may determine whether the headroom percentage is less than a CPU utilization. If not, the method may move to block 816, and the power controller may decrease the Alpha by one integer (Alpha=Alpha−1). Then, the method 800 may return to decision 810 and the method 800 may proceed as described herein.
Returning to decision 814, if the headroom percentage is less than the CPU utilization, the method may proceed to block 818 and the power controller may determine whether an effective CPU utilization is greater than a best effective CPU utilization. If not, the method 800 may move to block 820 and the power controller may increase the headroom percentage variable by one integer (HeadroomPCT=HeadroomPCT+1). Thereafter, the method 800 may return to decision 814 and the method 800 may continue as described herein.
Returning to decision 818, if the effective CPU utilization is greater than the best effective CPU utilization, the method 800 may continue to decision 822 and the controller may determine whether the filter is responding fast enough, e.g., using the method steps shown in
Referring now to
EffectiveCPUUtilization=(maxFreq*CPUUtilizationPct)/EffectiveFrequency
EffectiveFrequency=(((maxFreq+(minFreq>>alpha))/(CPUUtilizationPct−HeadroomPCT))*100)
where,
After the EffectiveCPUUtilization is determined at block 906, the method 900 may end.
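By way of illustration only, the two expressions evaluated at block 906 may be sketched as follows. The parentheses in the EffectiveFrequency expression are not balanced as given, so the grouping used here, (maxFreq + (minFreq >> alpha)) divided by (CPUUtilizationPct − HeadroomPCT), is an assumption, as are the function names and the guard against a non-positive denominator.

```python
def effective_frequency(max_freq, min_freq, alpha,
                        cpu_utilization_pct, headroom_pct):
    """Effective frequency under the ASSUMED grouping of the expression."""
    if cpu_utilization_pct <= headroom_pct:
        # Guard added for this sketch only; avoids a zero or negative divisor.
        raise ValueError("headroom must be less than the CPU utilization")
    numerator = max_freq + (min_freq >> alpha)
    return (numerator / (cpu_utilization_pct - headroom_pct)) * 100


def effective_cpu_utilization(max_freq, min_freq, alpha,
                              cpu_utilization_pct, headroom_pct):
    """EffectiveCPUUtilization = (maxFreq * CPUUtilizationPct) / EffectiveFrequency."""
    eff_freq = effective_frequency(max_freq, min_freq, alpha,
                                   cpu_utilization_pct, headroom_pct)
    return (max_freq * cpu_utilization_pct) / eff_freq
```

For example, with a hypothetical maxFreq of 1000, minFreq of 400, alpha of 2, CPUUtilizationPct of 90, and HeadroomPCT of 10, the assumed grouping gives an EffectiveFrequency of 1375.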
Moving to block 1008, a steady state filter, IIR, may be set to ((2^(IIR_Size−alpha))−1). At decision 1010, it may be determined whether IIR2Freq is greater than a maximum frequency, maxFreq. If not, the method 1000 may move to block 1012, and it may be indicated that the filter is responding within a predetermined time, e.g., it is responding fast enough. Thereafter, the method 1000 may end.
Returning to decision 1010, if IIR2Freq is less than the maximum frequency, the method 1000 may proceed to block 1014 and a steady state IIR value may be set to zero. Thereafter, it may be determined whether the busyMS is greater than zero and IIR2Freq is less than maxFreq. If not, the method 1000 may proceed to block 1012 and the method 1000 may continue as described herein. If so, the method 1000 may proceed to decision 1018 and it may be determined whether the idleMS is greater than zero. If so, the method 1000 may move to block 1020 and a busyPulse value may be set equal to ceiling(busyMS/idleMS), where ceiling means rounding up to the next highest integral value if (busyMS/idleMS) contains a non-zero fractional part. Also, an idlePulse value may be set equal to ceiling(idleMS/busyMS). Thereafter, at block 1022, an UpdateIIRBusy method may be executed in order to update the steady state IIR for the integral number of busy cycles previously calculated. For example, the UpdateIIRBusy method may be the UpdateIIRBusy method shown in
IIR=(IIR−(IIR>>alpha))+((1<<(IIR_Size−alpha))−1)
After IIR is determined at block 1206, the method 1200 may move to block 1208 and the duration may be reduced by one integer (duration=duration−1). The method 1200 may then return to decision 1204 and continue as described herein.
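By way of illustration only, the loop formed by decision 1204, block 1206, and block 1208 may be sketched as follows. The loop condition (duration greater than zero) is an assumption, since the test made at decision 1204 is not reproduced above, and the function name `update_iir_busy` is hypothetical.

```python
def update_iir_busy(iir, alpha, iir_size, duration):
    """Apply the busy-cycle IIR update once per unit of duration.

    Each pass decays the filter by IIR >> alpha and adds the busy
    contribution ((1 << (IIR_Size - alpha)) - 1), as at block 1206.
    """
    while duration > 0:                                   # decision 1204 (assumed test)
        iir = (iir - (iir >> alpha)) + ((1 << (iir_size - alpha)) - 1)  # block 1206
        duration -= 1                                     # block 1208
    return iir
```

For example, starting from an IIR of zero with an alpha of 2 and an IIR_Size of 8, three busy cycles give 63, then 111, then 147.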
It is to be understood that the method steps described herein need not necessarily be performed in the order as described. Further, words such as “thereafter,” “then,” “next,” etc. are not intended to limit the order of the steps. These words are simply used to guide the reader through the description of the method steps. Moreover, the methods described herein are described as executable on a portable computing device (PCD). The PCD may be a mobile telephone device, a portable digital assistant device, a smartbook computing device, a netbook computing device, a laptop computing device, a desktop computing device, or a combination thereof.
The system and methods described herein provide a way to prevent the DCVS from lagging too far behind the workload and causing tasks to fail. The system and methods utilize a steady state performance guarantee. The steady state performance guarantee may be a maximum amount of time, i.e., a deadline, that a CPU may exceed a specified CPU utilization, i.e., a busy percentage. Using the steady state performance guarantee, an ad-hoc analysis of the DCVS algorithm and related performance characteristics in order to meet QoS requirements may be eliminated.
The steady state performance component may be modeled as a filter and the filter parameters may be calculated such that the responsiveness of the filter is guaranteed to meet the steady state CPU utilization limit and the steady state CPU utilization limit deadline. For example, in a particular aspect, to meet a maximum of ninety percent (90%) CPU utilization requirement in a 1000 millisecond deadline, it may be possible to configure a simple IIR filter with a 1 millisecond granularity busy/idle input with an alpha of 26, depending on the performance levels. In a particular aspect, to determine the correct value for alpha, the filter may be set to its lowest value and then a busy/idle chain that matches the CPU utilization limit may be executed into the filter. Then, of the possible alpha values, the largest alpha that meets the CPU utilization deadline may be chosen.
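By way of illustration only, the alpha selection described above may be sketched as a simulation. Several elements are assumptions that do not come from the description: the idle update (decay only), the evenly spread busy/idle chain, the `threshold` filter value treated as "responding", and all function names. The numbers used with this sketch are illustrative and are not intended to reproduce the alpha quoted in the example above.

```python
def busy_step(iir, alpha, iir_size):
    """Busy-millisecond update, using the IIR expression given for block 1206."""
    return (iir - (iir >> alpha)) + ((1 << (iir_size - alpha)) - 1)


def idle_step(iir, alpha):
    """Idle-millisecond update (ASSUMED: decay with no busy contribution)."""
    return iir - (iir >> alpha)


def meets_deadline(alpha, iir_size, busy_pct, deadline_ms, threshold):
    """Feed a busy/idle chain at 1 ms granularity into a filter reset to zero.

    Returns True if the filter value reaches `threshold` (a hypothetical
    level at which the controller would request maximum frequency) within
    deadline_ms.
    """
    iir = 0  # filter set to its lowest value
    for ms in range(1, deadline_ms + 1):
        # Spread busy milliseconds evenly so the chain matches busy_pct.
        is_busy = (ms * busy_pct) // 100 > ((ms - 1) * busy_pct) // 100
        iir = busy_step(iir, alpha, iir_size) if is_busy else idle_step(iir, alpha)
        if iir >= threshold:
            return True
    return False


def largest_alpha(iir_size, max_alpha, busy_pct, deadline_ms, threshold):
    """Return the largest (slowest) alpha that still meets the deadline."""
    for alpha in range(max_alpha, 0, -1):
        if meets_deadline(alpha, iir_size, busy_pct, deadline_ms, threshold):
            return alpha
    return None
```

A larger alpha gives a slower, smoother filter, so searching downward from the largest candidate returns the least reactive setting that still honors the deadline under these assumptions.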
In one or more exemplary aspects, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer program product such as a machine readable medium, i.e., a computer-readable medium. Computer-readable media include both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that may be accessed by a computer. By way of example, and not limitation, such computer-readable media may comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that may be used to carry or store desired program code in the form of instructions or data structures and that may be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Although selected aspects have been illustrated and described in detail, it will be understood that various substitutions and alterations may be made therein without departing from the spirit and scope of the present invention, as defined by the following claims.
The present application claims priority to U.S. Provisional Patent Application Ser. No. 61/286,999, entitled SYSTEM AND METHOD OF DYNAMICALLY CONTROLLING A CENTRAL PROCESSING UNIT, filed on Dec. 16, 2009, the contents of which are fully incorporated by reference. The present application is related to, and incorporates by reference, U.S. patent application Ser. No. ______, entitled SYSTEM AND METHOD FOR CONTROLLING CENTRAL PROCESSING UNIT POWER BASED ON INFERRED WORKLOAD PARALLELISM, by Rychlik et al., filed concurrently (Attorney Docket Number 100328U1). The present application is related to, and incorporates by reference, U.S. patent application Ser. No. ______, entitled SYSTEM AND METHOD FOR CONTROLLING CENTRAL PROCESSING UNIT POWER IN A VIRTUALIZED SYSTEM, by Rychlik et al., filed concurrently (Attorney Docket Number 100329U1). The present application is related to, and incorporates by reference, U.S. patent application Ser. No. ______, entitled SYSTEM AND METHOD FOR ASYNCHRONOUSLY AND INDEPENDENTLY CONTROLLING CORE CLOCKS IN A MULTICORE CENTRAL PROCESSING UNIT, by Rychlik et al., filed concurrently (Attorney Docket Number 100330U1). The present application is related to, and incorporates by reference, U.S. patent application Ser. No. ______, entitled SYSTEM AND METHOD FOR CONTROLLING CENTRAL PROCESSING UNIT POWER WITH REDUCED FREQUENCY OSCILLATIONS, by Thomson et al., filed concurrently (Attorney Docket Number 100339U1). The present application is related to, and incorporates by reference, U.S. patent application Ser. No. ______, entitled SYSTEM AND METHOD FOR CONTROLLING CENTRAL PROCESSING UNIT POWER WITH GUARANTEED TRANSIENT DEADLINES, by Thomson et al., filed concurrently (Attorney Docket Number 100340U1). The present application is related to, and incorporates by reference, U.S. patent application Ser. No. ______, entitled SYSTEM AND METHOD FOR DYNAMICALLY CONTROLLING A PLURALITY OF CORES IN A MULTICORE CENTRAL PROCESSING UNIT BASED ON TEMPERATURE, by Sur et al., filed concurrently (Attorney Docket Number 100344U1).
Number | Date | Country
---|---|---
61286999 | Dec 2009 | US