Instructions in microprocessors are often re-dispatched for execution one or more times due to pipeline errors or data hazards. For example, an instruction may need to be re-dispatched when it refers to a result that has not yet been calculated or retrieved. A miss resulting from the unavailable information may cause the microprocessor to stall. Because it is not known whether other unpredicted stalls will arise due to other misses during resolution of that miss, the microprocessor may perform a runahead operation configured to detect other misses while the initial miss is being resolved.
In modern microprocessors, architectural-level instructions are often executed in a pipeline. Such instructions may be issued individually or as bundles of micro-operations to various execution mechanisms in the pipeline. Regardless of the form that an instruction takes when issued for execution, when the instruction is issued, it is not known whether execution of the instruction will complete or not. Put another way, it is not known at dispatch whether a miss or an exception will arise during execution of the instruction.
A common pipeline execution stall that may arise during execution of an instruction is a load operation that results in a cache miss. Such cache misses may trigger an entrance into a runahead mode of operation (hereafter referred to as “runahead”) that is configured to detect, for example, other cache misses, instruction translation lookaside buffer misses, or branch mispredicts while the initial load miss is being resolved. As used herein, runahead describes any suitable speculative execution scheme resulting from a long-latency event, such as a cache miss where the resulting load event pulls the missing instruction or data from a slower access memory location. Once the initial load miss is resolved, the microprocessor exits runahead and the instruction is re-executed. Because other misses may arise, it is possible that an instruction may be re-executed several times prior to completion of the instruction.
Once the runahead-triggering event is detected, the state of the microprocessor (e.g., the registers and other suitable states) is checkpointed so that the microprocessor may return to that state after runahead. The microprocessor then continues executing in a working state during runahead. In some settings, the microprocessor may enter runahead immediately, and optionally may reissue the instruction that caused the microprocessor to enter runahead for execution. Because reissuing the instruction may take some time, the effective time that the microprocessor is able to detect new potential long latency events while in runahead may be reduced. In some other settings, such as a load miss, the microprocessor may delay entry into runahead until it can be determined whether a load miss in one cache can be satisfied by a hit in another cache in the memory hierarchy. For example, in a scenario where an instruction causes an L1 cache miss, the microprocessor may delay reissuing the instruction so that, once reissued, the instruction will line up with a hit from the L2 cache if it arrives. Put another way, in such a scenario the microprocessor stalls briefly and then reissues the instruction, rather than entering runahead immediately. Because the instruction may be reissued before it is known whether there will be a hit in the L2 cache, the microprocessor may still enter runahead if the L2 cache misses.
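As a non-limiting illustration, the delayed-entry scenario described above may be modeled as a small decision function. The sketch below is a software model only, not a description of any particular hardware; the names `Action` and `resolve_load` are hypothetical and are introduced here purely for illustration.

```python
from enum import Enum, auto


class Action(Enum):
    """Hypothetical pipeline actions used only in this sketch."""
    COMPLETE = auto()        # the load completes normally
    REISSUE = auto()         # the load is reissued, timed to meet possible L2 data
    ENTER_RUNAHEAD = auto()  # the miss is long-latency, so runahead begins


def resolve_load(l1_hit: bool, l2_hit: bool) -> list:
    """Return the ordered actions taken for one load under delayed entry."""
    if l1_hit:
        return [Action.COMPLETE]
    # L1 miss: stall briefly and reissue so the load lines up with an L2 hit,
    # if one arrives. Runahead is not entered yet.
    actions = [Action.REISSUE]
    if l2_hit:
        actions.append(Action.COMPLETE)
    else:
        # The L2 cache also missed, so runahead is entered after all.
        actions.append(Action.ENTER_RUNAHEAD)
    return actions
```

In this model, an L1 miss that is satisfied by the L2 cache never enters runahead, while an L1 miss followed by an L2 miss enters runahead only after the reissue.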
However, in each of the scenarios contemplated above, it is possible that an instruction may be launched without knowing whether a runahead-triggering event will result. Because some instructions may be treated differently in runahead mode than in normal mode, and because some of such differences may be applied at issuance, it can be difficult to enter runahead without reissuing the instruction that caused entry into runahead. For example, some microprocessor actions may adversely affect the microprocessor state if performed during runahead because those actions may lead to cache pollution and/or make return to the normal operating mode difficult.
Accordingly, the embodiments described herein relate to methods and hardware operative, in the event that execution of an instruction produces a runahead-triggering event, to cause a microprocessor to enter into and operate in a runahead mode without reissuing the instruction. In some examples, the embodiments described herein may carry out one or more runahead policies that govern operation of the microprocessor while it is in runahead and cause it to operate differently than when it is not in runahead. Put another way, the microprocessor may take different actions for some instructions depending on the runahead state.
For example, it will be appreciated that some actions may be prioritized differently during runahead relative to non-runahead operations, and/or that some actions may be viewed as being optional during runahead. Thus, in some embodiments, some actions may be categorized as being permissive, while other actions may be categorized as being absolute.
A permissive action may be optional or reprioritized relative to another action. For example, a permissive action may be performed by the microprocessor to save power and/or enhance performance during runahead. Such alternative treatment may save processing time during runahead, as detection of an additional stall condition during runahead may be a more relevant result than a runahead calculation result, which may be invalid. In some embodiments, a permissive action may be applied to one or more instructions included in a permissive instruction category encountered during runahead though not necessarily to every instruction so-categorized that is encountered during runahead. Further, a permissive action may not be applied to an instruction included in a permissive instruction category issued prior to the detection of a runahead-triggering event.
In contrast, an absolute action may represent an action that enables proper runahead operation. Put another way, omitting or deprioritizing an absolute action may threaten proper runahead operation or return to normal operation after runahead. For example, an absolute action may include an action that preserves microprocessor correctness. As used herein, microprocessor correctness generally refers to the functional validity of the microprocessor's architectural state, so that an action that maintains the functional validity of the microprocessor's architecture maintains the correctness of the microprocessor. In some embodiments, an absolute action may be applied to every instruction included in an absolute instruction category encountered during runahead. Further, in some embodiments, an absolute action may be applied to an instruction included in an absolute instruction category issued prior to the detection of a runahead-triggering event. Applying absolute actions as described herein may preserve and protect the microprocessor's correctness.
In some settings, actions that affect microprocessor correctness may irretrievably alter the ability of the microprocessor to restart after runahead. As an example, in some embodiments, some registers of microprocessors may have a checkpointed copy from which the state that was present upon runahead entry can be recovered when restarting after a runahead episode. Since a checkpointed copy exists, writing to these registers during runahead may not interfere with restarting after runahead. However, some registers may not have a checkpointed copy. To preserve microprocessor functional correctness, writing to such registers during runahead should be avoided. Similar care may be applied to cache writes in the absence of cache protection mechanisms.
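As a non-limiting illustration of the register example above, the following sketch models a register file in which only some registers have a checkpointed copy. It is a simplified software model under stated assumptions: the class `RegisterFile`, its method names, and the register names used with it are hypothetical, and dropping a write is only one possible way of avoiding it.

```python
class RegisterFile:
    """Toy model: writes to non-checkpointed registers are dropped in runahead."""

    def __init__(self, checkpointed, non_checkpointed):
        self.values = {name: 0 for name in (*checkpointed, *non_checkpointed)}
        self.checkpointed = set(checkpointed)
        self.in_runahead = False
        self._snapshot = {}

    def enter_runahead(self):
        # Only registers with a checkpointed copy are saved.
        self._snapshot = {r: self.values[r] for r in self.checkpointed}
        self.in_runahead = True

    def write(self, reg, value):
        """Return True if the write took effect, False if it was suppressed."""
        if self.in_runahead and reg not in self.checkpointed:
            return False  # no checkpoint exists, so the write is avoided
        self.values[reg] = value
        return True

    def exit_runahead(self):
        # Restore the state that was present upon runahead entry.
        self.values.update(self._snapshot)
        self.in_runahead = False
```

In this model a runahead write to a checkpointed register is harmless because it is undone on exit, whereas a write to a register without a checkpoint is never performed, so the pre-runahead value survives in both cases.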
As another example, in some embodiments a microprocessor may include control registers that change and/or control the microprocessor's operation. In some of these settings, a change to a control register (e.g., via a write to that control register) may alter the microprocessor's behavior in a manner that is difficult to unwind at a later time. For example, a change to a control register made during runahead operation may introduce an operational change to the microprocessor that is difficult to undo, potentially causing post-runahead operation to proceed differently than would be expected had runahead not occurred. In some of such embodiments, the alteration of control registers may be prevented during runahead.
The absolute and permissive actions described above may be performed in response to respective runahead policies implemented at suitable stages of a multi-stage microprocessor pipeline so that runahead operation may commence without reissuing the runahead-triggering instruction. For example, a permissive runahead policy may be applied earlier in a multi-stage pipeline than is an absolute runahead policy on entry into runahead. In turn, optional actions may be applied to subsequently-issued instructions earlier in the pipeline, such as before entry to execution logic, as a result of permissive policy implementation on entry to runahead. Because these actions are optional, it may be acceptable during runahead that they are not performed for instructions that were already in the execution logic and were not reissued on entry to runahead. Mandatory actions resulting from implementation of absolute runahead policies may be applied to all instructions in the execution logic at a later point in the pipeline. For example, absolute runahead policies may be applied at the exit from execution logic, or at subsequent commitment or writeback logic, so that all instructions potentially affected by runahead are subjected to a suitable absolute runahead policy, which may avoid adverse alterations to microprocessor correctness.
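As a non-limiting illustration of applying the two kinds of policy at different pipeline stages, the sketch below models an issue stage and a writeback stage. It is a simplified software model; the names `Instr`, `issue`, and `writeback`, and the choice to record applied policies in a list, are assumptions made for this example only.

```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    name: str
    permissive: bool = False  # eligible for an optional runahead action
    absolute: bool = False    # subject to a mandatory runahead action
    applied: list = field(default_factory=list)


def issue(instr, in_runahead):
    """Early stage: optional actions are applied only if runahead is known."""
    if in_runahead and instr.permissive:
        instr.applied.append("permissive")
    return instr


def writeback(instr, in_runahead):
    """Late stage: mandatory actions catch every instruction leaving execution.

    Returns True if the result may be committed, False if it is suppressed.
    """
    if in_runahead and instr.absolute:
        instr.applied.append("absolute")
        return False
    return True
```

An instruction issued before the runahead-triggering event is detected passes through `issue` with `in_runahead=False` and so misses the optional action, but it still reaches `writeback` after runahead has begun and is subjected to the mandatory action there, without being reissued.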
In some examples, the disclosed embodiments may detect one or more instruction categories associated with instructions issued during runahead. In turn, one or more runahead policies related to a respective instruction category may be applied during runahead. Some embodiments may detect whether an instruction issued and/or executed during runahead is associated with an absolute instruction category and/or a permissive instruction category. In one scenario, an absolute runahead policy associated with an absolute instruction category may be applied before the instruction is committed. For example, potential corruption of the checkpointed state of the microprocessor from an improper writeback event during runahead may be prevented in a setting where microprocessor correctness may be affected by commitment of the instruction. In another scenario, a permissive runahead policy associated with a permissive instruction category may be applied before the instruction is issued and/or executed. If applied, a power/performance benefit may be realized by the microprocessor immediately upon issuance to the execution logic.
A memory controller 110G may be used to handle the protocol and provide the signal interface required of main memory 110D and to schedule memory accesses. The memory controller can be implemented on the processor die or on a separate die. It is to be understood that the memory hierarchy provided above is non-limiting and other memory hierarchies may be used without departing from the scope of this disclosure.
Microprocessor 100 also includes a pipeline, illustrated in simplified form in
As shown in
In some embodiments, scheduling logic 124 may be configured to schedule instructions for execution in the form of instruction set architecture (ISA) instructions. Additionally or alternatively, in some embodiments, scheduling logic 124 may be configured to schedule bundles of micro-operations for execution, where each micro-operation corresponds to one or more ISA instructions or parts of ISA instructions. It will be appreciated that any suitable arrangement for scheduling instructions in bundles of micro-operations may be employed without departing from the scope of the present disclosure. For example, in some embodiments, a single instruction may be scheduled in a plurality of bundles of micro-operations, while in some embodiments a single instruction may be scheduled as a bundle of micro-operations. In yet other embodiments, a plurality of instructions may be scheduled as a bundle of micro-operations. In still other embodiments, scheduling logic 124 may schedule individual instructions or micro-operations, e.g., instructions or micro-operations that do not comprise bundles at all.
As shown in the embodiment depicted in
The detected category may be used to determine one or more runahead policies governing how the microprocessor is to be operated while executing the associated instruction in runahead, as explained in more detail below. It will be appreciated that detection logic 126 may detect instruction categories during any suitable portion of microprocessor operations. For example, in some embodiments, detection logic 126 may detect instruction categories without regard to whether microprocessor 100 is operating in runahead mode. In such embodiments, microprocessor 100 may be able to apply appropriate runahead policies to instructions even after those instructions have been issued for execution. In some other embodiments, detection logic 126 may be configured to detect instruction categories during runahead mode alone.
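As a non-limiting illustration of category detection, the sketch below maps opcodes to instruction categories and then selects the runahead policies that would govern them. It models the variant in which categories are detected without regard to runahead mode. The opcode names, the lookup table, and the function names are hypothetical and are not taken from any particular instruction set.

```python
from enum import Flag


class Category(Flag):
    """Instruction categories; an instruction may carry several at once."""
    POWER = 1        # microprocessor power management category
    PERFORMANCE = 2  # microprocessor performance management category
    CORRECTNESS = 4  # microprocessor correctness category


PERMISSIVE = Category.POWER | Category.PERFORMANCE
ABSOLUTE = Category.CORRECTNESS

# Hypothetical opcode table, for illustration only.
OPCODE_CATEGORIES = {
    "fadd": Category.POWER,
    "ctrl_write": Category.CORRECTNESS,
    "fp_ctrl_write": Category.POWER | Category.CORRECTNESS,
}


def detect(opcode):
    """Detect the category of an opcode, regardless of runahead mode."""
    return OPCODE_CATEGORIES.get(opcode, Category(0))


def policies_for(opcode, in_runahead):
    """List the runahead policies that would govern this opcode."""
    if not in_runahead:
        return []
    category = detect(opcode)
    selected = []
    if category & PERMISSIVE:
        selected.append("permissive")
    if category & ABSOLUTE:
        selected.append("absolute")
    return selected
```

Because `detect` does not depend on the runahead state in this sketch, a category is available for an instruction that was issued before runahead began, and a policy can be selected for it later.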
While
As shown in
The embodiment of microprocessor 100 shown in
In some embodiments, permissive logic 131 and absolute logic 132 may communicate with pipeline 102 and runahead control logic 130 so that respective runahead policies may be implemented at different stages of pipeline 102. In turn, a permissive runahead policy may be applied earlier in pipeline 102 than an absolute runahead policy on entry into runahead. For example, permissive logic 131 may instruct scheduling logic 124 to apply a permissive runahead policy to an instruction prior to issuance during runahead. In turn, execution of that instruction may be enhanced in runahead as a result of power and/or performance management actions taken by microprocessor 100 when executing the instruction. As another example, absolute logic 132 may instruct writeback logic 134, configured to commit the results of execution operations to an appropriate location (e.g., register 109), to prevent one or more writeback actions during runahead. In turn, writeback logic 134 may prevent cache corruption that may result from alteration during runahead, as described below.
In some embodiments, runahead control logic 130 may also control memory operations related to entry and exit from runahead. For example, on entry to runahead, portions of microprocessor 100 may be checkpointed to preserve the state of microprocessor 100 while a non-checkpointed working state version of microprocessor 100 speculatively executes instructions during runahead. Non-limiting examples of portions of microprocessor 100 that may be checkpointed during runahead include buffers (not shown), registers 109, and states for execution logic 128. In some of such embodiments, runahead control logic 130 may restore microprocessor 100 to the checkpointed state on exit from runahead.
It will be understood that the above stages shown in pipeline 102 are illustrative of a typical RISC implementation, and are not meant to be limiting. For example, in some embodiments, the fetch logic and the scheduling logic functionality may be provided upstream of a pipeline, such as compiling VLIW instructions or code-morphing. In some other embodiments, the scheduling logic may be included in the fetch logic and/or the decode logic of the microprocessor. More generally, a microprocessor may include fetch, decode, and execution logic, each of which may comprise one or more stages, with memory and writeback functionality being carried out by the execution logic. The present disclosure is equally applicable to these and other microprocessor implementations, including hybrid implementations that may use VLIW instructions and/or other logic instructions.
In the described examples, instructions may be fetched and executed one at a time, possibly requiring multiple clock cycles. During this time, significant parts of the data path may be unused. In addition to or instead of single instruction fetching, pre-fetch methods may be used to enhance performance and avoid latency bottlenecks associated with read and store operations (e.g., the reading of instructions and loading such instructions into processor registers and/or execution queues). Accordingly, it will be appreciated that virtually any suitable manner of fetching, scheduling, and dispatching instructions may be used without departing from the scope of the present disclosure.
Continuing with
As an illustrative example of how method 200 may be performed,
For example, because it was not known that Instruction A would trigger entry into runahead prior to issuance, runahead policies are not applied at issuance to Instruction A and all of the instructions issued subsequent to Instruction A, as indicated by an uncertainty window shown in
Continuing with
For example, runahead policies may cause a microprocessor to treat some instructions differently and take alternative actions regarding those instructions than would otherwise be taken outside of runahead. Moreover, various runahead policies associated with respective instructions may cause the microprocessor to treat the respective instructions differently from one another during runahead. Such differences in treatment may be based on differences among the respective instructions and/or potential consequences to the microprocessor.
In the embodiment shown in
As introduced above, some actions may be viewed as having differing relative priorities during runahead, so that some actions may be categorized as being permissive, while other actions may be categorized as being absolute. Accordingly, in some embodiments, determining whether an instruction falls into a first category may include identifying whether the instruction is associated with a permissive instruction category. Non-limiting examples of permissive instruction categories include a microprocessor power management instruction category and a microprocessor performance management category. Further, in some embodiments, determining whether an instruction falls into a first category may include identifying whether the instruction is associated with an absolute instruction category. One non-limiting example of an absolute instruction category includes a microprocessor correctness instruction category.
Because the operational stability of the microprocessor may be affected by potential runahead actions, operating the microprocessor according to a runahead policy during runahead at 214 includes, at 216, controlling operation of the microprocessor in accordance with the first instruction category. For example, in some embodiments, scheduling, executing, or retiring the instruction associated with the first instruction category may be controlled according to the first instruction category. Additionally or alternatively, in some embodiments, scheduling, executing, or retiring a different instruction may be controlled according to the first instruction category.
In some embodiments, controlling operation of the microprocessor in accordance with the first instruction category at 216 may include applying a permissive runahead policy to the microprocessor. For example, if the first instruction category is associated with a permissive action, a permissive runahead policy may be applied to the microprocessor.
Application of a permissive runahead policy may enhance microprocessor operation in runahead for some instructions by improving the efficiency with which those instructions may be executed in the pipeline. In the example shown in
While this example is related to a policy that is performed prior to issuance, it will be appreciated that suitable permissive logic may communicate with the pipeline and/or the execution logic at one or more suitable locations. For example, permissive logic that includes logic related to power and performance management runahead policies may communicate with one or more early stages of the execution logic. Providing additional communication between early stages of the execution logic may permit application of permissive runahead policies to instructions already in the execution logic after runahead is triggered (e.g., within the uncertainty window), potentially providing additional runahead operational efficiency.
As introduced above, permissive runahead policies may lead to more efficient operation of the microprocessor during runahead. In some embodiments, application of a permissive runahead policy may cause the microprocessor to convert a selected instruction from a first type to a second type. Such embodiments may be examples of actions associated with a microprocessor power management instruction category.
For example, application of a permissive runahead policy may cause a floating point operation instruction to be converted to a non-operational instruction. Conversion of a floating point operation instruction to a non-operational instruction may save power and/or time during runahead, as floating point operation instructions typically are not used to compute an address or resolve a branch or otherwise uncover potential stalls and misses during runahead. In some embodiments, application of a permissive runahead policy may cause the microprocessor to poison a destination for a selected instruction. For example, if a floating point operation instruction is converted to a non-operational instruction, an integer instruction that is seeded with floating point data (e.g., an instruction that uses floating point data as input) from the converted instruction will likely yield an invalid result. Poisoning the destination register for the floating point data-seeded instruction (the integer instruction in this example) may reduce potential cache pollution.
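As a non-limiting illustration of the conversion and poisoning described above, the sketch below walks a short instruction sequence as it might be treated during runahead. It is a simplified model under stated assumptions: the opcode names are hypothetical, and the sketch also marks the destination of the converted floating point instruction as poisoned so that dependent instructions can be identified, which is an assumption of this example.

```python
FP_OPS = {"fadd", "fmul"}  # hypothetical floating point opcodes


def apply_permissive_policy(program):
    """Model runahead treatment of a list of (opcode, dest, sources) tuples.

    Returns the instructions as issued and the set of poisoned registers.
    """
    poisoned = set()
    issued = []
    for opcode, dest, sources in program:
        if opcode in FP_OPS:
            # Convert the floating point operation to a non-operational
            # instruction; its destination never receives a valid value.
            issued.append(("nop", None, ()))
            poisoned.add(dest)
        elif any(src in poisoned for src in sources):
            # Seeded with invalid data: poison the destination register.
            issued.append((opcode, dest, sources))
            poisoned.add(dest)
        else:
            # Valid inputs produce a valid result, clearing any old poison.
            issued.append((opcode, dest, sources))
            poisoned.discard(dest)
    return issued, poisoned
```

For a sequence in which an integer conversion consumes the result of a converted floating point add, the model poisons the conversion's destination and leaves an unrelated integer add untouched.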
In some embodiments, application of a permissive runahead policy may cause the microprocessor to suppress trap or fault conditions for instructions having poisoned source registers. Such embodiments may be examples of actions associated with a microprocessor performance management instruction category. Because traps and faults typically halt microprocessor operation, encountering a trap or fault may shorten time in runahead. Suppressing trap/fault conditions during runahead may enhance microprocessor performance by providing additional opportunities for branches to be resolved and misses to be exposed.
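As a non-limiting illustration of trap and fault suppression, the following sketch decides whether a detected fault condition is raised. The function name and its parameters are hypothetical and serve only to make the condition explicit.

```python
def should_raise(fault_detected, sources, poisoned, in_runahead):
    """Return True if a detected trap or fault condition should be raised."""
    if not fault_detected:
        return False
    if in_runahead and any(src in poisoned for src in sources):
        # The inputs are known to be invalid, so the fault is not meaningful.
        # Suppressing it lets runahead continue to expose further misses.
        return False
    return True
```

In this model the same fault condition is raised outside of runahead, or during runahead when none of the source registers is poisoned.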
While a microprocessor may take some actions in runahead to enhance operation, in some settings a microprocessor may be required to perform some actions to preserve and protect the functional stability and correctness of the microprocessor. In some embodiments, controlling operation of the microprocessor in accordance with the first instruction category at 216 may include applying an absolute runahead policy to the microprocessor. For example, if the first instruction category is associated with an absolute action, an absolute runahead policy may be applied to the microprocessor to prevent an action that may affect microprocessor correctness.
Because the actions that preserve microprocessor correctness are typically associated with commit, writeback, or other memory operations, such operations often occur near the end of the execution logic. For example, an input/output operation is typically performed late in the execution logic, as are operations that may update or otherwise affect the architectural state of the microprocessor. Thus, the runahead-triggering event is often an instruction that has not reached such operations. In turn, the instructions issued to the execution logic after that instruction are also unlikely to have reached those operations. Accordingly, on detection of runahead, absolute runahead policies may be applied to any instruction that emerges from the execution logic, or to any instruction arriving at an operation that may affect microprocessor correctness, after runahead is detected.
In the example shown in
In some embodiments, application of an absolute runahead policy may cause the microprocessor to prevent alterations to a committed state of the microprocessor during runahead. For example, an absolute runahead policy may prevent updates to a non-checkpointed state of the microprocessor during runahead, potentially facilitating a trusted reversion to the original state after runahead. As another example, an absolute runahead policy may prevent memory operations that may have architectural effects other than those described in the example above from occurring during runahead, such as input/output operations, writeback operations, and the like. In some settings, an absolute runahead policy may prevent alterations to a memory system of the microprocessor that affect the microprocessor architectural state.
It will be appreciated that an instruction may fall into more than one instruction category, so that a plurality of runahead policies may be applied to the instruction as the instruction is executed. For example, permissive and absolute runahead policies may be applied to the instruction during runahead. Thus, in some embodiments, operating the microprocessor according to a runahead policy during runahead at 214 may include, at 218, determining whether that instruction falls into a second category, and, at 220, controlling execution of that instruction in accordance with the second instruction category. For example, a second suitable runahead policy may be applied to the instruction according to the second instruction category.
Once the condition that caused the microprocessor to enter runahead is resolved, the microprocessor may exit runahead. Thus, method 200 includes causing the microprocessor to exit runahead at 222. Typically, the microprocessor re-enters normal operation by returning to the checkpointed state and reissuing the instruction that triggered runahead.
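As a non-limiting illustration of a complete runahead episode, the sketch below checkpoints a register state, executes later instructions speculatively, and then returns to the checkpointed state and reissues the triggering instruction. The names `toy_execute` and `run_episode`, and the tuple layout of an instruction, are hypothetical choices made for this example.

```python
import copy


def toy_execute(regs, instr):
    """Execute a (dest, value, misses) tuple; return True if it misses."""
    dest, value, misses = instr
    regs[dest] = value
    return misses


def run_episode(regs, trigger, younger, execute=toy_execute):
    """Model one runahead episode; return the misses it uncovered."""
    checkpoint = copy.deepcopy(regs)  # state present upon runahead entry
    discovered = []
    for instr in younger:
        # Speculative execution in the working state.
        if execute(regs, instr):
            discovered.append(instr)
    # Exit runahead: discard the working state and restore the checkpoint.
    regs.clear()
    regs.update(checkpoint)
    # Reissue the instruction that triggered runahead.
    execute(regs, trigger)
    return discovered
```

In this model the speculative results are discarded on exit, and the lasting product of the episode is the list of additional misses that were uncovered while the original miss was being resolved.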
It will be appreciated that methods described herein are provided for illustrative purposes only and are not intended to be limiting. Accordingly, it will be appreciated that in some embodiments the methods described herein may include additional or alternative processes, while in some embodiments, the methods described herein may include some processes that may be reordered or omitted without departing from the scope of the present disclosure. Further, it will be appreciated that the methods described herein may be performed using any suitable hardware including the hardware described herein.
This written description uses examples to disclose the invention, including the best mode, and also to enable a person of ordinary skill in the relevant art to practice the invention, including making and using any devices or systems and performing any incorporated methods. The patentable scope of the invention is defined by the claims, and may include other examples as understood by those of ordinary skill in the art. Such other examples are intended to be within the scope of the claims.