An embodiment of the invention relates to power management in a computer system, and, in particular, to controlling the power consumption of an electronic device such as a processor. Other embodiments are also described.
Power consumption in computer systems tends to increase with every generation. It is becoming increasingly important to properly manage the power consumption of individual electronic devices of a computer system. This is especially true with advanced high performance processors, also known as central processing units or CPUs, which are becoming larger and have greater transistor density, making it difficult to dissipate the heat that they produce while running at elevated clock frequencies. A processor may have several functional units such as a cache, a bus interface, a register file, an arithmetic logic unit, a floating point unit, a single instruction multiple data execution unit, and a multiple instruction multiple data execution unit. Each of these units consumes power both during active operation and while idle.
Several methods have been employed to manage and therefore limit the power consumption of a processor to meet a given power envelope. For example, since power consumption is proportional to the frequency of the clock that sequences operation of the processor, some power management techniques concentrate on reducing the processor clock speed during periods of inactivity or when the operations performed by the processor do not require speedy execution. Such methods predict, during execution of a program, when the functional units will be idling, and then reduce the clock frequency or supply voltage to an appropriate level. This may require that the functional units be monitored by the processor during program execution.
Other methods simply shut down large portions of the system in response to a keyboard idle timer expiring, indicating that the system is likely not being used as heavily, therefore justifying a partial or complete shutdown of certain functional units.
Yet another method is referred to as compiler assisted power management. That technique recognizes that the electronic instructions executed by the functional units of a computer system are derived from computer programs, such as software applications, operating systems, etc., by a compiler. The compiler translates the high level operations described in a computer program and organizes the translated operations into a sequence of low level instructions. These instructions are then packaged sequentially into an executable file that can be loaded into computer memory, and executed by the functional units of a processor. Compiler assisted power management capitalizes on the compiler's awareness of the processor's internal architecture, and uses that knowledge to generate hints or suggestions in the form of power-control instructions that are embedded in the resulting, translated sequence of instructions. These instructions can be used to power up functional units so that they are ready to execute when necessary. The instructions may also be used to reduce or turn off power consumption in certain functional units that are not in use, or that are idling. The placement of these instructions is based upon an analysis of the computer program and the resulting instructions, at the translation stage, relieving the processor and other electronic devices of the need to make decisions about when to power down certain functional units. Of course, to take advantage of these power controlling instructions, the processor needs to have the appropriate internal abilities, including hardware and/or microcode capability, to recognize and implement the power down or power up requests that it encounters while executing a sequence of instructions.
The embodiments of the invention are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings in which like references indicate similar elements. It should be noted that references to “an” embodiment of the invention in this disclosure are not necessarily to the same embodiment, and they mean at least one.
A method and apparatus for compiler-assisted power management is described here that uses special instructions. Beginning with
The device 102 has a number of functional units, such as those shown in
In accordance with an embodiment of the invention, the electronic device shown in
In addition to this power management capability, an embodiment of the invention modifies the instruction decode (ID) unit 114 of a processor, so that it can detect special instructions that have been inserted into the sequence of processor instructions that constitute the program or translated code being executed. The special instruction may be one that does not affect the result of any computation in the generated instructions. In other words, the computation results (from executing the surrounding instructions) would be the same, whether or not the special instruction were present. An example is to modify the data structure for a conventional no-operation (NOP) instruction, to also indicate a power control operation for a particular functional unit of the processor. The modified data structure should still be recognizable as a NOP instruction.
For example, in the case of an IA-32 ISA compliant processor, in addition to detecting that an opcode of an instruction refers to a conventional, ISA NOP instruction, the ID unit 114 would also be able to detect that an operand of that instruction is indicating a request to either power up or power down a selected one of the functional units of the processor.
In
One or more special NOPs 208 are added to the generated, processor instructions 206. A special NOP may indicate a power down operation to reduce power consumption by its corresponding functional unit. Such special NOPs 208 are also compatible with another processor, processor B, that is not capable of the power down operation. Processor B may be a previous generation of processor A, compatible with the same ISA. In other words, the processor instructions 206, with the added special NOPs 208, can be executed by two kinds of processors, namely one that has power management capability associated with the special NOPs, and one that does not. An instruction is said to be “compatible” with the processor if it is not an invalid or illegal instruction. Note that in this case, the addition of the special instructions yields the same computation results, due to “no operation” being added, though perhaps with somewhat different delays.
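The backward compatibility described above can be illustrated with a minimal sketch. It assumes a toy instruction format (tuples of an opcode followed by its arguments), a hypothetical `run` function standing in for either kind of processor, and the operand values 0xF (power down) and 0x1 (power up) borrowed from the nop.f example given later in this description. It is not a model of any real instruction decoder.

```python
def run(program, power_aware):
    """Execute a toy instruction list; return (registers, fp_unit_powered)."""
    regs = {}
    fp_powered = True
    for op, *args in program:
        if op == "nop.f":
            # A processor without the capability treats this as a plain NOP.
            if power_aware and args[0] == 0xF:
                fp_powered = False      # power down request
            elif power_aware and args[0] == 0x1:
                fp_powered = True       # power up request
        elif op == "mov":
            regs[args[0]] = args[1]
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
    return regs, fp_powered

program = [
    ("mov", "r1", 2),
    ("nop.f", 0xF),          # special NOP: power down the FP unit
    ("mov", "r2", 3),
    ("add", "r3", "r1", "r2"),
]

regs_a, fp_a = run(program, power_aware=True)    # stands in for processor A
regs_b, fp_b = run(program, power_aware=False)   # stands in for processor B
```

In this sketch both runs leave identical register contents; only the power state of the floating point unit differs between the two.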
The analysis of the program to determine whether a particular functional unit is used may be completely automated, for example, by the translator repeatedly scanning the entire generated code for the presence of instructions that access each functional unit. However, a provision may be made to allow the translator to accept instructions from the user of the translator, to “manually” add the special instructions to certain parts of the code. For example, this may be a compiler directive, such as a pragma statement, that is placed by the user either at a high level or at a low level version of the program, and that instructs the compiler to insert the selected special instruction.
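The automated scan described above might be sketched as follows. The opcode names, the `UNIT_OF` table that maps each opcode to the functional unit it accesses, and the unit labels are all hypothetical; a real translator would take this information from its description of the target processor.

```python
# Hypothetical mapping from opcode to the functional unit it accesses.
UNIT_OF = {
    "fadd": "FP", "fmul": "FP",
    "add": "ALU", "sub": "ALU",
    "ld": "MEM", "st": "MEM",
}

def units_used(code):
    """Return the set of functional units accessed by a list of opcodes."""
    return {UNIT_OF[op] for op in code if op in UNIT_OF}

def unused_units(code, all_units=("FP", "ALU", "MEM")):
    """Return the units that the scanned code never accesses."""
    used = units_used(code)
    return [unit for unit in all_units if unit not in used]

# A code portion with no floating point instructions.
portion = ["ld", "add", "sub", "st"]
candidates = unused_units(portion)   # units that could be powered down
```

Units reported by `unused_units` are candidates for a power down special instruction around the scanned portion.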
Turning now to
If the compiler detects that floating point type instructions will not be used for a considerable period of time, by a certain portion of the code to be executed, it may insert a power down NOP immediately after the last instance of an instruction that uses the FP unit. A power up NOP may also be inserted, to “wake up” the FP unit (early enough so that the FP unit is ready to execute the next instance of a floating point instruction).
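One way such an insertion pass could look is sketched below, under several assumptions: the code is straight-line (no branches), instructions are plain strings, `FP_OPS` is a hypothetical list of floating point opcodes, and `min_gap` and `wake_lead` are illustrative parameters counted in instructions rather than processor cycles. The sketch only handles gaps between two uses of the FP unit and requires `wake_lead` to be no larger than `min_gap`.

```python
FP_OPS = {"fadd", "fmul"}     # hypothetical floating point opcodes
POWER_DOWN = "nop.f 0xF"      # operand values follow the nop.f example in the text
POWER_UP = "nop.f 0x1"

def insert_fp_power_nops(code, min_gap=4, wake_lead=2):
    """Insert power down/up NOPs around long gaps between FP instructions."""
    fp_idx = [i for i, ins in enumerate(code) if ins.split()[0] in FP_OPS]
    down_after = set()   # indices after which a power down NOP is placed
    up_before = set()    # indices before which a power up NOP is placed
    for prev, nxt in zip(fp_idx, fp_idx[1:]):
        if nxt - prev - 1 >= min_gap:
            down_after.add(prev)
            up_before.add(nxt - wake_lead)   # wake early enough to be ready
    out = []
    for i, ins in enumerate(code):
        if i in up_before:
            out.append(POWER_UP)
        out.append(ins)
        if i in down_after:
            out.append(POWER_DOWN)
    return out

code = ["fadd", "add", "add", "add", "add", "add", "fmul"]
result = insert_fp_power_nops(code)
```

With the defaults, the power down NOP lands immediately after the first `fadd`, and the power up NOP lands two instructions ahead of the `fmul`; a gap shorter than `min_gap` is left untouched.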
As mentioned above, the portion 306 could be a program loop, but alternatively, it may be the entire code for a particular high level function or routine. As another alternative, the portion 306 may be a non-loop region, inside a routine. For better overall efficiency, if a particular functional unit requires a relatively long period of time (e.g., measured in terms of processor cycles) to resume full power operation, then it may be more efficient to insert the corresponding NOPs around only the larger chunks of code (or those that are executed many times, in the case of a loop). That is because, for smaller sections of code, such as only a handful of instructions that are not executed repeatedly as part of a hot loop, the delay associated with putting to sleep and/or waking up one or more functional units may reduce overall performance, while gaining little in terms of a reduction in power consumption.
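The trade-off described above can be expressed as a simple heuristic. The sketch below is an assumption for illustration only: the function name, the cycle counts, and the `min_ratio` threshold are hypothetical, and a real compiler would likely rely on measured latencies and a more detailed power model.

```python
def should_insert_power_nops(body_cycles, iterations,
                             sleep_latency, wake_latency, min_ratio=10):
    """Decide whether a code portion is large enough to justify power NOPs.

    body_cycles   -- estimated cycles for one pass through the portion
    iterations    -- how many times the portion runs (1 for non-loop code)
    sleep_latency -- cycles needed to put the functional unit to sleep
    wake_latency  -- cycles needed for the unit to resume full power
    min_ratio     -- hypothetical threshold: idle time must exceed the
                     transition overhead by at least this factor
    """
    idle_cycles = body_cycles * iterations
    overhead = sleep_latency + wake_latency
    return idle_cycles >= min_ratio * overhead

# A handful of instructions executed once: not worth the transition delay.
small = should_insert_power_nops(5, 1, sleep_latency=50, wake_latency=50)

# A hot loop executed many times: the unit stays idle long enough.
hot_loop = should_insert_power_nops(20, 1000, sleep_latency=50, wake_latency=50)
```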
Turning now to
Modifying the operand field to obtain the special instruction is a flexible technique and lends itself to change and upgrades. The operand 408 may be used to differentiate between many different types of functional units and their corresponding power down and power up operations. In addition, because of the relatively large number of bits in the operand field of a NOP instruction (e.g., 21 bits for that of the ITANIUM ISA), many more levels of “sleep” states may be added into future generations of the processor.
As an example, “nop.f 0XF” may instruct the processor to “put floating point unit to sleep”, while “nop.f 0X1” may mean “wake up floating point unit”. Note that there may also be different levels of sleep states for a given functional unit. For example, the operand 0XF may signal the processor to place its floating point unit in “light sleep”, while 0XFF may signal “medium sleep”, and 0XFFF may signal “deep sleep”. These different levels of sleep states may refer to one or more combinations of power saving operations such as reduction in frequency or even shutting off of a sequencing clock, and reduction or even shutting off a supply voltage. According to an embodiment of the invention, a compiler may be written to have this knowledge of the power down and power up capabilities that have been built into the processor, for certain individual functional units. Overall power consumption may therefore be better controlled, using the compiler which has a wider view of the code being executed, than a purely hardware or low level decision mechanism that sees only smaller chunks of code at a time. This technique can also supplement existing hardware techniques for power savings.
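The operand assignments in the example above can be captured in a small lookup table, as in the following sketch. The operand values and the 21-bit field width come from the text; the table name, the state labels, and the helper functions are hypothetical, and the sketch writes hexadecimal values with a lowercase "0x" prefix.

```python
IMM_BITS = 21   # operand field width cited in the text for the ITANIUM ISA NOP

# Operand values from the nop.f example; the state labels are illustrative.
FP_OPERANDS = {
    "wake": 0x1,
    "light_sleep": 0xF,
    "medium_sleep": 0xFF,
    "deep_sleep": 0xFFF,
}

def encode_fp_nop(state):
    """Return the special NOP text for a requested FP unit power state."""
    value = FP_OPERANDS[state]
    if value >= (1 << IMM_BITS):
        raise ValueError("operand does not fit in the NOP operand field")
    return f"nop.f 0x{value:X}"

def decode_fp_operand(value):
    """Return the power state named by an operand, or None for a plain NOP."""
    for state, operand in FP_OPERANDS.items():
        if operand == value:
            return state
    return None

deep = encode_fp_nop("deep_sleep")
state = decode_fp_operand(0xFF)
```

Because all four values occupy only the low 12 bits, most of the 21-bit field remains free in this sketch for additional units or sleep levels.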
An embodiment of the invention may be a machine readable medium having stored thereon instructions which program a computer system to perform some of the operations described above, e.g. scanning generated instructions to determine whether a selected one of the functional units of the processor is accessed. In other embodiments, some of these operations might be performed by specific hardware components that contain hardwired logic. Those operations might alternatively be performed by any combination of programmed computer components and custom hardware components.
A machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer), including but not limited to Compact Disc Read-Only Memory (CD-ROM), Read-Only Memory (ROM), Random Access Memory (RAM), Erasable Programmable Read-Only Memory (EPROM), and a transmission over the Internet.
The invention is not limited to the specific embodiments described above. An example special instruction was described above as a modified version of a conventional NOP instruction. However, any other instruction that remains backward compatible (for example, with earlier generation processors), and does not alter the results of the program's computations, despite being modified to indicate a power up or power down operation, may be used. The power control operation could be encoded into the operand, and not the opcode (assuming, of course, that such a modified instruction would be recognized by previous generation processors, or by processors that do not have the power control capability, because of the familiar opcode). Accordingly, other embodiments are within the scope of the claims.