In computing systems, a primary power source has a limited peak power delivery capability. Whether the primary power source is an AC power supply or a battery, exceeding the power output capability may cause the output voltage to droop, which can cause system failure. In addition, battery-powered systems have an output power capability that may vary as a function of the charge level of the battery: as the battery charge level drops, peak power delivery capability also drops. The peak output power capability is also a function of the battery configuration.
In existing platforms, these peak power delivery constraints can directly impact performance capability. Because peak power delivery capability violations can lead to system failure in a few tens of microseconds (μs) or faster, these constraints are managed proactively, which means the system is designed and operated to prevent power consumption from ever exceeding the power delivery limit. Historically, a peak power delivery constraint has been managed by limiting maximum frequencies of high power-consuming circuitry within a processor to guarantee that peak power consumption stays below platform peak power delivery capability. The nature of a proactive, predictively managed constraint is that it is conservative, meaning that performance is constrained prior to reaching the physical limits of the platform.
In various embodiments, a processor is configured to provide higher performance capability for a given peak power delivery capability. This arrangement is realized by a protection mechanism that allows peak power delivery capability to be managed reactively, rather than proactively. Effectively, this means that performance is only constrained when instantaneous power consumption reaches a peak power delivery limit, rather than reducing operating frequency limits proactively. In embodiments, this reactive power control occurs based at least in part on detection of power consumption at a threshold, combined with the ability of individual high power-consuming circuits to reduce power consumption back below that threshold nearly instantaneously, e.g., within a few microseconds.
With embodiments, a programmable boost increase may be added to an existing (legacy) maximum power delivery capability of a given system. As a result, at least some circuitry may be allowed to operate at higher power consumption levels. This is possible because such circuitry may be provided with control circuitry to ensure that any boosted power consumption drops below a legacy power consumption level nearly instantaneously (e.g., within approximately 1-10 μs) after being signaled that power consumption is to be reduced.
With this arrangement, certain high power-consuming circuitry can operate at operating frequencies above a limit defined by the peak power delivery capability of the power supply network, hence delivering higher performance without violating power delivery requirements. The result is an increase in the maximum possible power consumed by the load, and hence better performance.
Embodiments may realize this reduction by immediately forcing at least some (or all) high load IP circuits to undergo power throttling, which substantially reduces IP power consumption. While different techniques are possible, in some cases, circuits can throttle power consumption by immediately reducing clock frequency of the IP circuit. Another approach is to invoke a micro-architectural mechanism to immediately constrain the throughput of the IP circuit, for example by reducing the number of instructions that a core can execute or retire in any given clock cycle. Yet other approaches may include clock duty cycling, for example, by chopping out clock edges, or performing another technique to reduce power consumption.
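The throttling approaches described above can be compared with a toy model. The following is a minimal sketch only: it assumes that power scales linearly with clock frequency, per-cycle instruction width, and the fraction of clock edges delivered, which is a simplification. The names `IpState`, `relative_power`, and the three `throttle_*` helpers, as well as all numeric values, are hypothetical and are not taken from any embodiment.

```python
from dataclasses import dataclass


@dataclass
class IpState:
    """Toy operating point for one IP circuit (all fields are hypothetical)."""
    freq_mhz: float      # clock frequency
    issue_width: int     # instructions allowed per clock cycle
    duty: float = 1.0    # fraction of clock edges delivered (1.0 = none removed)


def relative_power(state: IpState, base: IpState) -> float:
    """Assumed model: power is proportional to frequency, width and duty."""
    return (
        (state.freq_mhz / base.freq_mhz)
        * (state.issue_width / base.issue_width)
        * state.duty
    )


def throttle_frequency(state: IpState, factor: float) -> IpState:
    """Reduce the clock frequency of the IP circuit by a factor."""
    return IpState(state.freq_mhz * factor, state.issue_width, state.duty)


def throttle_throughput(state: IpState, max_width: int) -> IpState:
    """Cap the number of instructions executed or retired per cycle."""
    return IpState(state.freq_mhz, min(state.issue_width, max_width), state.duty)


def throttle_duty_cycle(state: IpState, keep: int, out_of: int) -> IpState:
    """Chop out clock edges, keeping `keep` of every `out_of` edges."""
    return IpState(state.freq_mhz, state.issue_width, state.duty * keep / out_of)


base = IpState(freq_mhz=4000.0, issue_width=6)
for throttled in (
    throttle_frequency(base, 0.5),
    throttle_throughput(base, 3),
    throttle_duty_cycle(base, 1, 2),
):
    print(throttled, relative_power(throttled, base))
```

Under this assumed linear model, halving the frequency, halving the per-cycle width, or removing every other clock edge each halves the modeled power. Real circuits would differ, for example because voltage and leakage are not modeled here.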
Referring now to
With respect to SoC 110, included are a plurality of cores. In the particular embodiment shown, two different core types are present, namely first cores 112₀-112ₙ (so-called efficiency cores (E-cores)) and second cores 114₀-114ₙ (so-called performance cores (P-cores)). Understand that the various cores may include many different types of execution units, intellectual property (IP) circuits (one of which is illustrated as IP circuit 115₀ within P-core 114₀), and other circuitry. Although only a single IP circuit is shown, understand that SoC 110 may include many different IP circuits, present in one or more of cores 112, 114, GPU 120, and/or present as standalone processing circuits. As further shown, SoC 110 includes a graphics processing unit (GPU) 120 including a plurality of execution units (EUs) 122₀-122ₙ. In one or more embodiments, first cores 112 and second cores 114 and/or GPU 120 may be implemented on separate dies.
These various computing elements couple to additional components of SoC 110, including a shared cache memory 125, which in an embodiment may be a last level cache (LLC) having a distributed architecture. In addition, a memory controller 130 is present along with a power controller 140, which may be implemented as a hardware control circuit that may be a dedicated microcontroller to execute instructions, e.g., stored on a non-transitory storage medium (e.g., firmware instructions). In other cases, power controller 140 may have different portions that are distributed across one or more of the available cores.
Still with reference to
Power controller 140 further includes a dynamic voltage and frequency scaling (DVFS) circuit 144. DVFS circuit 144 may be configured to dynamically adjust voltage and/or frequency of cores 112, 114, and/or GPU 120, depending upon environmental conditions including temperature, power budget and so forth. Note that DVFS circuit 144 may perform such power management operations on a slower time scale (e.g., with responsiveness on the order of a millisecond) than the nearly instantaneous (within approximately 1-10 μs) power throttling that individual IP circuits may perform independently and autonomously, responsive to receipt of an input voltage violation signal.
As further shown in
Voltage regulator controller 155 monitors the system voltage, e.g., via an included voltage comparator that compares the system voltage to a reference voltage, e.g., a so-called critical threshold (Vsys Critical). When this system voltage falls below the predetermined reference voltage, e.g., due to a loss of battery charge, problem with AC power or so forth, voltage regulator controller 155 issues an input voltage violation signal. As illustrated in
As shown, this input voltage violation signal is sent directly to power controller 140. Voltage regulator controller 155 is configured to also direct this input voltage violation signal directly to certain high power-consuming IP circuits. As an illustrative example, these IP circuits may be cores 114 (and/or individual IP circuits therein). This signal may also be directly routed to other high power-consuming circuits within SoC 110 such as coprocessors, accelerators and so forth that may be present (not shown for ease of illustration in the high level view of
Thus hardwired paths are used to directly and instantaneously inform particular IP circuits of the input voltage violation. Responsive to receipt of this signal, these IP circuits are configured to independently and autonomously perform appropriate power throttling operations to instantaneously bring their power consumption level to a legacy maximum power level or lower, to avoid system failure, as described further herein.
In embodiments, power controller 140 guarantees that the legacy maximum power capability plus the additional boost value is not exceeded at any time. Power controller 140 further guarantees that if a boost credit is applied, when the reaction mechanism (PROCHOT#) is invoked, peak power drops to a value less than or equal to the legacy maximum power capability within a few microseconds. These guarantees matter because violating them can lead to system failure.
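The detection and direct signaling described above can be sketched in software as a comparator that notifies every connected receiver when the system voltage crosses the critical threshold. This is an illustrative model only: in the embodiments the paths are hardwired signals, not callbacks, and the class name `VoltageRegulatorController`, its methods, and the voltage values are assumptions made for this sketch.

```python
from typing import Callable, List


class VoltageRegulatorController:
    """Toy model: compare Vsys to a critical threshold and fan the result out."""

    def __init__(self, vsys_critical: float) -> None:
        self.vsys_critical = vsys_critical  # hypothetical reference voltage
        self.asserted = False
        self._listeners: List[Callable[[bool], None]] = []

    def connect(self, listener: Callable[[bool], None]) -> None:
        """Model one direct path to a receiver of the violation signal."""
        self._listeners.append(listener)

    def sample(self, vsys: float) -> bool:
        """Assert the violation signal while Vsys is below the threshold."""
        asserted = vsys < self.vsys_critical
        if asserted != self.asserted:
            self.asserted = asserted
            # Every receiver is informed directly, not through the power controller.
            for listener in self._listeners:
                listener(asserted)
        return asserted


events = []
vrc = VoltageRegulatorController(vsys_critical=9.0)
vrc.connect(lambda a: events.append(("power_controller", a)))
vrc.connect(lambda a: events.append(("p_core", a)))

vrc.sample(12.0)  # healthy: no change, nothing is signaled
vrc.sample(8.5)   # below threshold: signal asserted to both receivers
vrc.sample(12.0)  # recovered: signal deasserted to both receivers
print(events)
```

The point of the sketch is that the high power-consuming circuit receives the signal at the same time as the power controller, so its reaction does not wait on a slower software or firmware path.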
To this end, power controller 140 may thus identify only those IP circuits capable of meeting these guarantees in allocating boosted power budgets. In addition, since during system operation, there is a delay in detecting an undervoltage condition and reacting to it, this detection and reaction delay is accounted for when determining the allowable boost value.
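One way to picture this selection is as a filter over the throttling latency of each IP circuit. The sketch below is an assumption-laden illustration: the source does not give a formula, so the rule used here (detection delay plus throttling latency must fit within a reaction deadline) is a hypothetical stand-in, as are the names `IpCapability` and `boost_eligible` and all latency values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IpCapability:
    """Hypothetical description of how quickly an IP circuit can throttle."""
    name: str
    throttle_latency_us: float  # time to fall back to legacy power once signaled


def boost_eligible(
    circuits: List[IpCapability],
    detection_delay_us: float,
    deadline_us: float,
) -> List[str]:
    """Return the names of circuits that can react within the deadline.

    Assumed rule: detection delay plus throttle latency must not exceed
    the deadline. Circuits that fail the rule keep only a legacy budget.
    """
    return [
        c.name
        for c in circuits
        if detection_delay_us + c.throttle_latency_us <= deadline_us
    ]


circuits = [
    IpCapability("p_core", throttle_latency_us=2.0),
    IpCapability("gpu", throttle_latency_us=50.0),
    IpCapability("accelerator", throttle_latency_us=8.0),
]
print(boost_eligible(circuits, detection_delay_us=1.0, deadline_us=10.0))
```

With these made-up numbers, the circuit with a 50 μs throttling latency is excluded from boosting, while the other two qualify.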
As illustrated, NVM 170 may store an operating system (OS) 172, additional system software 174 (which may include a basic input/output system (BIOS) and other firmware), and various applications, drivers and other software (generally identified at 176). In one or more embodiments, system software 174 may be configured to calculate a boost value, based at least in part on certain power supply characteristics. This determined boost value may be stored in a configuration storage 162, so that it may be accessed for use in determining a boosted maximum power level for SoC 110. In an embodiment, this boosted maximum power level may be equal to a sum of the boost value and a legacy maximum power level.
Understand while shown at this high level in the embodiment of
Referring now to
As further illustrated in
As further illustrated, a GPU 220 may include a media processor 222 and a plurality of EUs 224. GPU 220 may be configured for efficiently performing graphics or other operations that can be broken apart for execution on parallel processing units such as EUs 224.
Still referring to
As further shown, SoC 200 also includes a memory 260 that may provide memory controller functionality for interfacing with a system memory such as DRAM. Understand while shown at this high level in the embodiment of
With embodiments herein, SoC 200 may be configured to operate at a boosted maximum power level as described herein. Then, in response to an input voltage violation signal, particular IP circuits autonomously perform power throttling to instantaneously reduce power consumption to within legacy levels, to avoid a system failure.
Referring now to
As shown in
Next at block 320, the boost value may be stored in a configuration storage. As an example, the boost value can be stored within a system memory and may also be stored in a power controller (e.g., within a configuration register). Using this boost value, the power controller may determine power budgets for given IP circuits. Note that the operations at blocks 310 and 320 may be used to configure a processor to be used in a given platform, based on characteristics of the power supply. Of course, in many implementations these operations may also occur dynamically as changes occur to a power source, particularly in situations in which the power source is a battery, as a function of battery charge level.
Thus the power controller receives this boost value that is based at least in part on one or more characteristics of a voltage regulator. Based at least in part on this boost value and a legacy maximum power level for the processor, the power controller determines a boosted maximum power level for the processor, and also determines a boosted maximum power budget for one or more IP circuits based at least in part on the boosted maximum power level.
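The budget determination can be illustrated with a short sketch. The boosted maximum power level as a sum of the legacy level and the boost value follows the description; the way the boost is divided among IP circuits is not specified, so the proportional split used here is purely an assumption, and the function names and wattages are hypothetical.

```python
from typing import Dict, Set


def boosted_max_power(legacy_max_w: float, boost_w: float) -> float:
    """Boosted maximum power level: legacy maximum plus the boost value."""
    return legacy_max_w + boost_w


def allocate_budgets(
    legacy_budgets_w: Dict[str, float],
    boost_w: float,
    eligible: Set[str],
) -> Dict[str, float]:
    """Assumed policy: split the boost among eligible IP circuits in
    proportion to their legacy budgets; others keep their legacy budget."""
    eligible_total = sum(
        w for name, w in legacy_budgets_w.items() if name in eligible
    )
    budgets = {}
    for name, legacy_w in legacy_budgets_w.items():
        if name in eligible and eligible_total > 0:
            budgets[name] = legacy_w + boost_w * legacy_w / eligible_total
        else:
            budgets[name] = legacy_w
    return budgets


legacy = {"p_cores": 30.0, "e_cores": 10.0, "gpu": 20.0}
budgets = allocate_budgets(legacy, boost_w=10.0, eligible={"p_cores", "gpu"})
print(budgets)
print(sum(budgets.values()), boosted_max_power(sum(legacy.values()), 10.0))
```

In this example the per-circuit budgets sum to the boosted maximum power level, and a circuit that is not boost-eligible never receives more than its legacy budget.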
Still referring to
Note that these IP circuits are configured to control their own power consumption to remain within the boosted power budget (or, when an input voltage violation is identified, the lower legacy power budget). At this point the processor may enter into normal operation, where these high power-consuming IP circuits and other IP circuits within cores and other components execute instructions.
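The self-limiting behavior of an IP circuit can be sketched as a choice between two stored budgets, selected by the state of the input voltage violation signal. This is a minimal software illustration of what the embodiments describe as autonomous hardware behavior; the class `IpCircuit`, its method names, and the wattages are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IpCircuit:
    """Toy IP circuit holding both budgets provided by the power controller."""
    boosted_budget_w: float
    legacy_budget_w: float
    violation: bool = False  # current state of the violation signal

    def on_violation_signal(self, asserted: bool) -> None:
        """Record the signal; no power controller involvement is needed."""
        self.violation = asserted

    def active_budget_w(self) -> float:
        """Legacy budget while the signal is asserted, else the boosted budget."""
        return self.legacy_budget_w if self.violation else self.boosted_budget_w

    def clamp(self, requested_w: float) -> float:
        """Limit requested power consumption to the active budget."""
        return min(requested_w, self.active_budget_w())


core = IpCircuit(boosted_budget_w=36.0, legacy_budget_w=30.0)
normal = core.clamp(40.0)       # limited by the boosted budget
core.on_violation_signal(True)
throttled = core.clamp(40.0)    # limited by the legacy budget
core.on_violation_signal(False)
restored = core.clamp(40.0)     # boosted budget applies again
print(normal, throttled, restored)
```

Because both budgets are held locally, the switch to the lower budget depends only on the arrival of the signal, which is what allows the reaction to occur within the microsecond-scale window described herein.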
Still referring to
Such operation at throttled power levels may proceed until the PROCHOT# signal is no longer asserted (as determined with control passing back to diamond 340). Although shown at this high level in the embodiment of
With the arrangement in
Embodiments thus enable a load to use operating points with instantaneous power consumption up to the legacy maximum plus the additional boost, providing higher performance capability without having to increase the size or cost of the power supply and/or battery.
Referring now to
Note that the gap between levels 410 and 420, shown with arrows in
Processors 570 and 580 are shown including integrated memory controller (IMC) circuitry 572 and 582, respectively. Processor 570 also includes interface circuits 576 and 578; similarly, second processor 580 includes interface circuits 586 and 588. Processors 570, 580 may exchange information via the interface 550 using interface circuits 578, 588. IMCs 572 and 582 couple the processors 570, 580 to respective memories, namely a memory 532 and a memory 534, which may be portions of main memory locally attached to the respective processors.
Processors 570, 580 may each exchange information with a network interface (NW I/F) 590 via individual interfaces 552, 554 using interface circuits 576, 594, 586, 598. The network interface 590 (e.g., one or more of an interconnect, bus, and/or fabric, and in some examples a chipset) may optionally exchange information with a coprocessor 538 via an interface circuit 592. In some examples, the coprocessor 538 is a special-purpose processor, such as, for example, a high-throughput processor, a network or communication processor, compression engine, graphics processor, general purpose graphics processing unit (GPGPU), neural-network processing unit (NPU), embedded processor, or the like.
A shared cache (not shown) may be included in either processor 570, 580 or outside of both processors, yet connected with the processors via an interface such as P-P interconnect, such that either or both processors' local cache information may be stored in the shared cache if a processor is placed into a low power mode.
Network interface 590 may be coupled to a first interface 516 via interface circuit 596. In some examples, first interface 516 may be an interface such as a Peripheral Component Interconnect (PCI) interconnect, a PCI Express interconnect or another I/O interconnect. In some examples, first interface 516 is coupled to a power control unit (PCU) 517, which may include circuitry, software, and/or firmware to perform power management operations with regard to the processors 570, 580 and/or co-processor 538. PCU 517 provides control information to a voltage regulator (not shown) to cause the voltage regulator to generate the appropriate regulated voltage. PCU 517 also provides control information to control the operating voltage generated. In various examples, PCU 517 may include a variety of power management logic units (circuitry) to perform hardware-based power management. Such power management may be wholly processor controlled (e.g., by various processor hardware, and which may be triggered by workload and/or power, thermal or other processor constraints) and/or the power management may be performed responsive to external sources (such as a platform or power management source or system software).
PCU 517 is illustrated as being present as logic separate from the processor 570 and/or processor 580. In other cases, PCU 517 may execute on a given one or more of cores (not shown) of processor 570 or 580. In some cases, PCU 517 may be implemented as a microcontroller (dedicated or general-purpose) or other control logic configured to execute its own dedicated power management code, sometimes referred to as P-code. In yet other examples, power management operations to be performed by PCU 517 may be implemented externally to a processor, such as by way of a separate power management integrated circuit (PMIC) or another component external to the processor. In yet other examples, power management operations to be performed by PCU 517 may be implemented within BIOS or other system software. PCU 517 may be configured to determine a boosted maximum power level based at least in part on a boost value that in turn is based on characteristics of a power supply configuration of system 500, as described herein. PCU 517 may, based at least in part on the boosted maximum power level, determine boosted power budgets for individual IP circuits, as described herein.
Various I/O devices 514 may be coupled to first interface 516, along with a bus bridge 518 which couples first interface 516 to a second interface 520. In some examples, one or more additional processor(s) 515, such as coprocessors, high throughput many integrated core (MIC) processors, GPGPUs, accelerators (such as graphics accelerators or digital signal processing (DSP) units), field programmable gate arrays (FPGAs), or any other processor, are coupled to first interface 516. In some examples, second interface 520 may be a low pin count (LPC) interface. Various devices may be coupled to second interface 520 including, for example, a keyboard and/or mouse 522, communication devices 527 and storage circuitry 528. Storage circuitry 528 may be one or more non-transitory machine-readable storage media as described below, such as a disk drive or other mass storage device which may include instructions/code and data 530. Further, an audio I/O 524 may be coupled to second interface 520. Note that other architectures than the point-to-point architecture described above are possible. For example, instead of the point-to-point architecture, a system such as multiprocessor system 500 may implement a multi-drop interface or other such architecture.
Processor cores may be implemented in different ways, for different purposes, and in different processors. For instance, implementations of such cores may include: 1) a general purpose in-order core intended for general-purpose computing; 2) a high-performance general purpose out-of-order core intended for general-purpose computing; 3) a special purpose core intended primarily for graphics and/or scientific (throughput) computing. Implementations of different processors may include: 1) a CPU including one or more general purpose in-order cores intended for general-purpose computing and/or one or more general purpose out-of-order cores intended for general-purpose computing; and 2) a coprocessor including one or more special purpose cores intended primarily for graphics and/or scientific (throughput) computing. Such different processors lead to different computer system architectures, which may include: 1) the coprocessor on a separate chip from the CPU; 2) the coprocessor on a separate die in the same package as a CPU; 3) the coprocessor on the same die as a CPU (in which case, such a coprocessor is sometimes referred to as special purpose logic, such as integrated graphics and/or scientific (throughput) logic, or as special purpose cores); and 4) a system on a chip (SoC) that may be included on the same die as the described CPU (sometimes referred to as the application core(s) or application processor(s)), the above described coprocessor, and additional functionality. Example core architectures are described next, followed by descriptions of example processors and computer architectures.
Thus, different implementations of the processor 600 may include: 1) a CPU with the special purpose logic 608 being integrated graphics and/or scientific (throughput) logic (which may include one or more cores, not shown), and the cores 602(A)-(N) being one or more general purpose cores (e.g., general purpose in-order cores, general purpose out-of-order cores, or a combination of the two); 2) a coprocessor with the cores 602(A)-(N) being a large number of special purpose cores intended primarily for graphics and/or scientific (throughput) computing; and 3) a coprocessor with the cores 602(A)-(N) being a large number of general purpose in-order cores. Thus, the processor 600 may be a general-purpose processor, coprocessor or special-purpose processor, such as, for example, a network or communication processor, compression engine, graphics processor, GPGPU (general purpose graphics processing unit), a high throughput many integrated core (MIC) coprocessor (including 30 or more cores), embedded processor, or the like. The processor may be implemented on one or more chips. The processor 600 may be a part of and/or may be implemented on one or more substrates using any of a number of process technologies, such as, for example, complementary metal oxide semiconductor (CMOS), bipolar CMOS (BiCMOS), P-type metal oxide semiconductor (PMOS), or N-type metal oxide semiconductor (NMOS).
A memory hierarchy includes one or more levels of cache unit(s) circuitry 604(A)-(N) within the cores 602(A)-(N), a set of one or more shared cache unit(s) circuitry 606, and external memory (not shown) coupled to the set of integrated memory controller unit(s) circuitry 614. The set of one or more shared cache unit(s) circuitry 606 may include one or more mid-level caches, such as level 2 (L2), level 3 (L3), level 4 (L4), or other levels of cache, such as a last level cache (LLC), and/or combinations thereof. While in some examples interface network circuitry 612 (e.g., a ring interconnect) interfaces the special purpose logic 608 (e.g., integrated graphics logic), the set of shared cache unit(s) circuitry 606, and the system agent unit circuitry 610, alternative examples use any number of well-known techniques for interfacing such units. In some examples, coherency is maintained between one or more of the shared cache unit(s) circuitry 606 and cores 602(A)-(N). In some examples, interface controller units circuitry 616 couple the cores 602 to one or more other devices 618 such as one or more I/O devices, storage, one or more communication devices (e.g., wireless networking, wired networking, etc.), etc.
In some examples, one or more of the cores 602(A)-(N) are capable of multi-threading. The system agent unit circuitry 610 includes those components coordinating and operating cores 602(A)-(N). The system agent unit circuitry 610 may include, for example, power control unit (PCU) circuitry and/or display unit circuitry (not shown). The PCU may be or may include logic and components needed for regulating the power state of the cores 602(A)-(N) and/or the special purpose logic 608 (e.g., integrated graphics logic), including determining and communicating boosted maximum power budgets to various IP circuits, as described herein. The display unit circuitry is for driving one or more externally connected displays.
The cores 602(A)-(N) may be homogenous in terms of instruction set architecture (ISA). Alternatively, the cores 602(A)-(N) may be heterogeneous in terms of ISA; that is, a subset of the cores 602(A)-(N) may be capable of executing an ISA, while other cores may be capable of executing only a subset of that ISA or another ISA.
The following examples pertain to further embodiments.
In one example, an apparatus comprises: a first IP circuit to execute operations on data; and a power controller coupled to the first IP circuit. The power controller is to: receive a boost value, the boost value based at least in part on one or more characteristics of a power supply coupled to the apparatus; determine a boosted maximum power level for the apparatus based at least in part on the boost value and a legacy maximum power level for the apparatus, the boosted maximum power level greater than the legacy maximum power level; and provide a boosted maximum power budget for the first IP circuit based at least in part on the boosted maximum power level, the boosted maximum power budget for the first IP circuit greater than a legacy maximum power budget for the first IP circuit. The first IP circuit, in response to receipt of an input voltage violation signal, is to reactively reduce power consumption equal to or below the legacy maximum power budget for the first IP circuit.
In an example, the first IP circuit is to receive the input voltage violation signal directly from a controller coupled to the power supply.
In an example, the apparatus further comprises a hardwire path coupled to the first IP circuit to provide the input voltage violation signal directly from the controller to the first IP circuit.
In an example, the power controller is to receive the boost value from a system software.
In an example, the apparatus further comprises a configuration storage to store the boost value.
In an example, the power controller is to access the boost value from the configuration storage, and determine the boosted maximum power level comprising a sum of the boost value and the legacy maximum power level.
In an example, the first IP circuit, within less than 10 microseconds from receipt of the input voltage violation signal, is to reactively reduce the power consumption equal to or below the legacy maximum power budget for the first IP circuit.
In an example, the apparatus further comprises at least one storage to store the boosted maximum power budget for the first IP circuit and the legacy maximum power budget for the first IP circuit.
In an example, the first IP circuit is to autonomously reactively reduce the power consumption equal to or below the legacy maximum power budget for the first IP circuit, in response to receipt of the input voltage violation signal.
In an example, the apparatus further comprises a second IP circuit to execute operations on second data, where the power controller is to provide a second legacy maximum power budget for the second IP circuit, the second IP circuit to not exceed the second legacy maximum power budget.
In another example, a method comprises: receiving, in at least one IP circuit of a SoC, an indication that an input voltage to a voltage regulator coupled to the SoC has fallen below a threshold level; and in response to the indication, autonomously, in the at least one IP circuit, reducing power consumption of the at least one IP circuit to a level below a legacy maximum IP power budget, the legacy maximum IP power budget set by a power controller of the SoC.
In an example, the method further comprises receiving the indication in the at least one IP circuit via a hardwire connection to a voltage regulator controller coupled to the SoC.
In an example, the method further comprises reducing the power consumption of the at least one IP circuit to the level below the legacy maximum IP power budget within 10 microseconds from receipt of the indication.
In an example, the method further comprises: determining, in the power controller, a boosted maximum power budget for the SoC based at least in part on a boost value and a legacy maximum power budget for the SoC, the boost value based on the voltage regulator, the boosted maximum power budget greater than the legacy maximum power budget; determining a boosted maximum IP power budget for the at least one IP circuit based at least in part on the boosted maximum power budget; and sending the boosted maximum IP power budget to the at least one IP circuit.
In an example, the method further comprises: operating the at least one IP circuit with the power consumption greater than the legacy maximum IP power budget based on the boosted maximum IP power budget; receiving the indication in the at least one IP circuit; and thereafter operating the at least one IP circuit with the power consumption less than the legacy maximum IP power budget.
In an example, the method further comprises maintaining a power consumption of a second IP circuit at a level below a legacy maximum second IP power budget, the legacy maximum second IP power budget set by the power controller.
In another example, a computer readable medium including instructions is to perform the method of any of the above examples.
In a further example, a computer readable medium including data is to be used by at least one machine to fabricate at least one integrated circuit to perform the method of any one of the above examples.
In a still further example, an apparatus comprises means for performing the method of any one of the above examples.
In still another example, a system comprises: a voltage regulator to receive a system voltage and output an operating voltage; a voltage regulator controller to identify when the system voltage falls below a threshold voltage; a SoC coupled to the voltage regulator; and a non-volatile memory coupled to the SoC. The SoC is to receive the operating voltage and comprises: at least one first IP circuit to execute operations on first data; and at least one second IP circuit to execute operations on second data. The SoC also includes a power controller to: receive a boost value, the boost value based at least in part on one or more characteristics of the voltage regulator; determine a boosted maximum power level for the SoC based at least in part on the boost value and a legacy maximum power level for the SoC, the boosted maximum power level greater than the legacy maximum power level; and determine a boosted maximum power budget for the at least one first IP circuit based at least in part on the boosted maximum power level, the boosted maximum power budget for the at least one first IP circuit greater than a legacy maximum power budget for the at least one first IP circuit. The at least one first IP circuit is to reduce power consumption equal to or below the legacy maximum power budget for the at least one first IP circuit, in reaction to the input voltage falling below the threshold voltage.
In an example, the voltage regulator controller is to send a voltage violation signal to the SoC when the system voltage falls below the threshold voltage.
In an example, the system further comprises a hardwire path coupled between the voltage regulator controller and the at least one first IP circuit, the voltage regulator controller to send the voltage violation signal to the at least one first IP circuit via the hardwire path.
In an example, the at least one first IP circuit is to autonomously reduce the power consumption equal to or below the legacy maximum power budget for the at least one first IP circuit based on the voltage violation signal.
Understand that various combinations of the above examples are possible.
Note that the terms “circuit” and “circuitry” are used interchangeably herein. As used herein, these terms and the term “logic” are used to refer to, alone or in any combination, analog circuitry, digital circuitry, hard wired circuitry, programmable circuitry, processor circuitry, microcontroller circuitry, hardware logic circuitry, state machine circuitry and/or any other type of physical hardware component. Embodiments may be used in many different types of systems. For example, in one embodiment a communication device can be arranged to perform the various methods and techniques described herein. Of course, the scope of the present invention is not limited to a communication device, and instead other embodiments can be directed to other types of apparatus for processing instructions, or one or more machine readable media including instructions that in response to being executed on a computing device, cause the device to carry out one or more of the methods and techniques described herein.
Embodiments may be implemented in code and may be stored on a non-transitory storage medium having stored thereon instructions which can be used to program a system to perform the instructions. Embodiments also may be implemented in data and may be stored on a non-transitory storage medium, which if used by at least one machine, causes the at least one machine to fabricate at least one integrated circuit to perform one or more operations. Still further embodiments may be implemented in a computer readable storage medium including information that, when manufactured into a SOC or other processor, is to configure the SOC or other processor to perform one or more operations. The storage medium may include, but is not limited to, any type of disk including floppy disks, optical disks, solid state drives (SSDs), compact disk read-only memories (CD-ROMs), compact disk rewritables (CD-RWs), and magneto-optical disks, semiconductor devices such as read-only memories (ROMs), random access memories (RAMs) such as dynamic random access memories (DRAMs), static random access memories (SRAMs), erasable programmable read-only memories (EPROMs), flash memories, electrically erasable programmable read-only memories (EEPROMs), magnetic or optical cards, or any other type of media suitable for storing electronic instructions.
While the present disclosure has been described with respect to a limited number of implementations, those skilled in the art, having the benefit of this disclosure, will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations.
This application claims the benefit of U.S. Provisional Application No. 63/519,628, filed on Aug. 15, 2023, and entitled “SYSTEM, METHOD AND APPARATUS FOR REACTIVE POWER CONTROL IN A PROCESSOR.”
Number | Date | Country
---|---|---
63/519,628 | Aug 2023 | US