This invention relates to processor performance control apparatus and methods.
The Advanced Configuration and Power Interface (ACPI) specification provides a standard for operating system-centric device configuration and power management. The ACPI specification defines various “states” as levels of power usage and/or feature availability. ACPI states include: global states (e.g., G0-G3), device states (e.g., D0-D3), processor states (e.g., C0-C3), and performance states (e.g., P0-Pn). The operating system and/or a user may select a desired processor state and a desired performance state. P-States are generally associated with fixed processor core frequency and voltage values. In a multi-core processor, the fixed frequency and voltage values are selected assuming that all processor cores are operating with 100% load. Such an arrangement does not maximize the performance of active processor cores when some processor cores are idle.
A central processing unit (processor) having multiple cores and a method for controlling the performance of the processor are presented. The processor includes a first storage location configured to store a first threshold associated with a first boost performance state (P-State). The processor also includes logic circuitry configured to increase performance of active processor cores when an inactive processor core count meets or exceeds the first threshold. The processor may also include a second storage location configured to store a second threshold associated with a second boost P-State. The storage locations may be programmable. The logic circuitry may be configured to compare the inactive processor core count to the first and second thresholds, select one of the first and second boost P-States and increase performance of active processor cores based on the selected boost P-State.
The processor may also include a core performance manager configured to increase performance of active processor cores by adjusting processor core frequency or core voltage. A third storage location may be provided to store the inactive processor core count. The logic circuitry may be configured to detect inactive processor cores and update the inactive processor core count. The logic circuitry may be configured to receive a boost processor state (C-State), wherein the first boost P-State is associated with the boost C-State.
The processor may include a plurality of storage locations configured to store thresholds for a plurality of boost P-States configured in priority order. The logic circuitry may be configured to select one of the plurality of boost P-States based on the inactive processor core count. The processor may have a number of processor cores, wherein a maximum number of boost P-States is less than the number of processor cores. The plurality of boost P-State thresholds may be processed in descending order. The plurality of boost P-State thresholds may be configured such that at least one boost P-State threshold is associated with a range of inactive processor core counts.
While operating in the C0 state, a given processor core may also be associated with one of several performance states or “P-States” (P0-Pn). P0 is typically the highest-performance state. P1-Pn are successively lower-performance states. Typically n is no greater than 16. Each P-State is associated with a processor core operating frequency and core voltage (e.g., Vcore). It should be understood that the actual power dissipation of a given processor, single or multi-core, when operating with a fixed frequency and core voltage, will vary with load. Multi-core processor packages are limited by the amount of Electrical Design Current (EDC) that the voltage regulator may supply. The operating frequency and core voltage for the P0 state are selected assuming 100% loading on all processor cores. For example, with all processor cores operating in a P0 state and 100% load, a given processor will use approximately the maximum allowable EDC. The same processor operating in a P0 state with only a single core at 100% load, while the other cores are idle, cannot take advantage of the remaining EDC headroom. This may result in inefficient use of available EDC headroom when one or more processor cores are idle.
In order to leverage EDC headroom, a new boost state may be defined. In this state, active processor cores may utilize the EDC headroom left available by idle processor cores to provide higher performance. This may result in higher overall system performance under less than full load.
The power management control logic 34 also includes boost logic configured to manage operation of the processor in the Boost C-State. The power management control logic 34 may access one or more storage locations 40 configured to store the Boost P-State information.
Storage locations 40 may be programmable, for example a set of m−1 registers, where m is the number of processor cores. In this example, with eight processor cores, a maximum of seven registers may be used. Each register is configured with a threshold number of inactive cores. Table 1 shows a sample configuration:
In the example above, Boost P-State-0 is associated with a threshold of 7 and is available when 7 processor cores are inactive. Boost P-State-1 is associated with a threshold of 6 and is available when 6 processor cores are inactive. Boost P-State-2 is associated with a threshold of 4 and is available when 4-5 processor cores are inactive. Boost P-State-3 is associated with a threshold of 1 and is available when 1-3 processor cores are inactive. The remaining Boost P-States are reserved for future use. It should be understood that Boost P-State thresholds may be selected in a variety of configurations and that fewer or additional Boost P-States may be defined.
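The relationship between the thresholds and the ranges of inactive processor core counts described above may be illustrated with a short sketch. The function name threshold_ranges and the list representation of the registers are hypothetical and are used only for illustration. The sketch assumes that the thresholds are listed in priority order starting with Boost P-State-0, and that at most m−1 cores can be inactive while at least one core remains active.

```python
def threshold_ranges(thresholds, num_cores):
    """Map each Boost P-State index to the inclusive range of inactive
    core counts it covers, given thresholds listed in priority order."""
    ranges = {}
    # Assumption: at least one core stays active, so at most
    # num_cores - 1 cores can be inactive.
    upper = num_cores - 1
    for index, threshold in enumerate(thresholds):
        ranges[index] = (threshold, upper)
        # The next (lower-priority) state covers counts below this threshold.
        upper = threshold - 1
    return ranges


# Sample thresholds from the eight-core example in the text.
SAMPLE_THRESHOLDS = [7, 6, 4, 1]
print(threshold_ranges(SAMPLE_THRESHOLDS, num_cores=8))
# {0: (7, 7), 1: (6, 6), 2: (4, 5), 3: (1, 3)}
```

The printed ranges match the example: Boost P-State-2 covers 4-5 inactive cores and Boost P-State-3 covers 1-3 inactive cores.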
In general, the Boost Logic 36 is configured to track the number of inactive processor cores. Boost Logic 36 may access storage location 42 for storage of a boost count, (e.g., inactive processor core count). Boost Logic 36 is also configured to select the appropriate Boost P-State based on the boost count.
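The tracking of the boost count may be modeled in software as follows. This is a minimal sketch and not the circuit itself: the class BoostCounter, its method set_inactive, and the use of a Python attribute in place of storage location 42 are all assumptions made for illustration.

```python
class BoostCounter:
    """Tracks which cores are inactive and keeps a stored boost count in sync."""

    def __init__(self, num_cores):
        self.inactive = [False] * num_cores
        # Stands in for the value held in storage location 42 (assumption).
        self.boost_count = 0

    def set_inactive(self, core_id, is_inactive):
        """Record a core's state; change the count only on a real transition."""
        if self.inactive[core_id] != is_inactive:
            self.inactive[core_id] = is_inactive
            self.boost_count += 1 if is_inactive else -1
        return self.boost_count


counter = BoostCounter(num_cores=8)
for core_id in (1, 2, 3, 4, 5):
    counter.set_inactive(core_id, True)
counter.set_inactive(5, False)  # core 5 becomes active again
print(counter.boost_count)  # 4
```

Updating the count only on a state transition keeps repeated notifications for the same core from skewing the stored value.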
Boost P-State thresholds are enforced in a priority order favoring the highest possible Boost P-State. The Boost P-State and processor performance will generally move up or down based on the boost count. Boost P-State processing begins with block A. Processing will commence at this block only when the processor is operating in the Boost C-State. It should be understood that the operations shown in
If the boost count is less than the Boost P-State-0 threshold, then the threshold for the next Boost P-State is selected (e.g., Boost P-State-1), as shown by block 110. The boost count is compared to the Boost P-State-1 threshold as shown by block 106. If the boost count is greater than or equal to the Boost P-State-1 threshold, then Boost P-State-1 will remain selected and processing may continue as shown by block B. This process is continued until the last Boost P-State is selected. Once a new Boost P-State is selected, the boost logic 36 is configured to change the core frequency and/or voltage in the active processor cores if the new P-State is different from the current one.
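The comparison sequence described above may be sketched in software as follows. The names select_boost_pstate, update_boost_pstate, and apply_pstate are hypothetical. Returning None when the boost count meets no threshold is an assumption of this sketch, since the handling of that case is not detailed here. The callback stands in for the change of core frequency and/or voltage.

```python
def select_boost_pstate(boost_count, thresholds):
    """Compare the boost count against each threshold in priority order,
    starting with Boost P-State-0, and return the first index that qualifies."""
    for index, threshold in enumerate(thresholds):
        if boost_count >= threshold:
            return index
    # Assumption: no threshold met means no Boost P-State is selected.
    return None


def update_boost_pstate(current, boost_count, thresholds, apply_pstate):
    """Select a Boost P-State and invoke apply_pstate only when it changes."""
    selected = select_boost_pstate(boost_count, thresholds)
    if selected != current:
        # Stand-in for adjusting frequency and/or voltage (assumption).
        apply_pstate(selected)
    return selected


applied = []
state = None
for count in (7, 7, 5, 2, 0):
    state = update_boost_pstate(state, count, [7, 6, 4, 1], applied.append)
print(applied)  # [0, 2, 3, None]
```

In the usage shown, the second boost count of 7 triggers no change because Boost P-State-0 is already selected.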
Although features and elements are described above in particular combinations, each feature or element may be used alone without the other features and elements or in various combinations with or without other features and elements. The apparatus described herein may be manufactured by using a computer program, software, or firmware incorporated in a computer-readable storage medium for execution by a general purpose computer or a processor. Examples of computer-readable storage media include a read only memory (ROM), a random access memory (RAM), a register, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROM disks and digital versatile disks (DVDs).
Embodiments of the present invention may be represented as instructions and data stored in a computer-readable storage medium. For example, aspects of the present invention may be implemented using Verilog, which is a hardware description language (HDL). When processed, Verilog data instructions may generate other intermediary data (e.g., netlists, GDS data, or the like) that may be used to perform a manufacturing process implemented in a semiconductor fabrication facility. The manufacturing process may be adapted to manufacture semiconductor devices (e.g., processors) that embody various aspects of the present invention.
Suitable processors include, by way of example, a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, a graphics processing unit (GPU), a DSP core, a controller, a microcontroller, application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), any other type of integrated circuit (IC), and/or a state machine, or combinations thereof.