Some micro-processing systems support the use of native translations of non-native ISA instructions. Typically these native translations cover several non-native instructions, perhaps even hundreds or thousands of non-native instructions. The native translations may employ various optimizations or other techniques to provide a performance benefit relative to that available through non-translated execution of the corresponding non-native ISA instructions. The performance benefit of an optimized native translation is proportional to the number of times the non-translated code would have been executed absent the translation. Since there is a performance overhead for creating the translation, it is desirable to target frequently-executed code so that the overhead can be amortized.
The present disclosure provides systems and methods that may be used to support creation of translations of portions of non-native ISA code. The example micro-processing systems herein may use a processing pipeline having an on-core hardware decoder (HWD) that receives and decodes non-native instructions into native instructions for execution. When the HWD is used in this manner, the disclosure will refer to this as the “hardware decoder mode” of execution.
The examples herein also may employ a “translation mode” of execution. In this mode, native translations are retrieved and executed without use of the HWD, for example by scheduling and dispatching the translations to one or more execution units. A native translation may cover and provide substantially equivalent functionality for any number of portions of corresponding non-native ISA code. The corresponding native translation is typically optimized to some extent relative to what would be achieved if the corresponding non-native code were to be executed using the HWD. A variety of optimizations and levels of optimization may be employed.
When the system is operating in the hardware decoder mode, the system may dynamically change and update a code portion profile in response to use of the HWD to execute portions of non-native ISA code. In certain embodiments, the code portion profile is stored in an on-core micro-architectural hardware structure, to enable rapid and lightweight profiling of code being processed with the HWD. The code portion profile may then be used in various ways to assist the process of dynamically forming new native translations.
In some examples, the code portion profile includes a plurality of records that are each associated with a portion of non-native ISA code that has been executed using the HWD. Records may be dynamically added as the code portions are processed by the HWD. From time to time, these records may be sampled and processed, for example by using software referred to herein as a “summarizer.” The result is a summarized representation of code portion control flow involving the HWD, which may be used to guide formation of new native translations. In some examples, the summarized representation is reflected in a control flow graph. In any case, when appropriately employed, the systems and methods herein improve the process of identifying code portions that should be covered in new translations. In other words, any quantum of non-native ISA code may include portions that are poor candidates for translation, and portions that are good candidates for translation. The examples herein improve the process of identifying whether a code portion is a good candidate for translation, relative to other code portions that might be included in a translation.
The microprocessor further includes a processing pipeline which typically includes one or more of fetch logic 128, HWD 130, execution logic 132, mem logic 134, and writeback logic 136. Fetch logic 128 retrieves instructions from one or more of locations 110 (but typically from either unified or dedicated L1 caches backed by L2-L3 caches and main memory).
When the system is in the above-referenced hardware decoder mode, HWD 130 decodes non-native ISA instructions, for example, by parsing opcodes, operands, and addressing modes. The outputs of the HWD are native instructions that are then executed by the execution logic. In the translation mode, native translations are retrieved and executed without needing to use the HWD. The native instructions output by the HWD will in some cases be referred to as non-translated instructions, to distinguish them from the native translations that are executed in the translation mode without use of the HWD. Native translations may be generated in a variety of ways. In some examples, a dynamic binary translator is employed to dynamically generate translations, though the present disclosure is applicable to other translation methods.
It should be understood that the above five stages are somewhat specific to, and included in, a typical RISC implementation. More generally, a microprocessor may include fetch, decode, and execution logic, with mem and writeback functionality being carried out by the execution logic. The present disclosure is equally applicable to these and other microprocessor implementations.
System 200 includes an on-core processing pipeline 202 including an HWD 204 and execution logic 206. In hardware decoder mode, non-native ISA instructions 208 are decoded by the HWD which in turn outputs non-translated native instructions 210 for execution by the execution logic. In translation mode, native translations 212 are retrieved from instruction memory and executed without using the HWD.
System 200 includes a branch count table 220 and a branch history table 222, both of which typically are implemented as micro-architectural hardware structures on a processor core or die (e.g., on the same core as HWD 204 and execution logic 206). The contents of the branch count table and the branch history table change as non-native code portions are processed by the HWD. Among other things, the branch history table may include a code portion profile 224 having information that changes dynamically as the HWD processes portions of non-native instructions. This code portion profile is used to form new native translations.
The branch count table and the branch history table each include a plurality of records (i.e., records 226 and 228). In both cases, the records contain information about non-native code portions encountered by HWD 204 as branch instructions are processed. In general, the branch count table tracks the number of times a branch target address is encountered, while the branch history table records information about the taken branch when a branch target address is encountered.
System 200 typically will include micro-architectural logic for adding and updating records in the branch count table and branch history table. This logic may be a distinct component or distributed within various of the example pipestages shown in
As shown in
It should be understood that the record depicted in
The functions of the branch count table and branch history table are to collect information about the targets of taken branches where the HWD is somehow involved in the processing of the branch instruction. Accordingly, in some examples, records will not be recorded for target addresses that have or are part of a corresponding native translation, since execution in that circumstance typically will not involve the HWD, and there is thus no need, or less of a need, to profile execution since a translation already exists. For example, if the system had a native translation for a non-translated portion of code starting at a given branch target address, then the system could be configured so that the branch target address does not have an associated record in the branch count table or the branch history table.
In some examples, the existence of a translation may be determined using an on-core hardware redirector 282, also known as a THASH. The hardware redirector is a micro-architectural structure that includes address information sufficient to allow the processing pipeline to retrieve and execute a translation associated with a non-native portion of code via address mapping. Specifically, when the processing pipeline branches to a target address of a non-native portion of ISA code, the address is looked up in the hardware redirector. The address provided to the hardware redirector may be generated, as indicated, via a calculation performed by execution logic 206. In the event of a hit, the lookup returns the address of an associated translation, which is then fetched and executed in translation mode without the HWD. The THASH lookup may therefore act as a screen on whether to add/update records in the branch count table and branch history table. In particular, a THASH hit means that there is already a translation for the non-native target IP, and there is thus no need to profile hardware decoder execution of that portion of target code.
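The screening role of the hardware redirector can be illustrated with a minimal software sketch. This is a toy model only, not the hardware structure itself: the class `HardwareRedirector`, the function `dispatch_branch`, and the use of a dictionary for the address mapping are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


class HardwareRedirector:
    """Toy stand-in for the THASH: maps a non-native branch target
    address to the address of an existing native translation."""

    def __init__(self) -> None:
        self._map: Dict[int, int] = {}

    def install(self, target_ip: int, translation_addr: int) -> None:
        # Hypothetical helper: register a translation for a target address.
        self._map[target_ip] = translation_addr

    def lookup(self, target_ip: int) -> Optional[int]:
        # A hit returns the translation address; a miss returns None.
        return self._map.get(target_ip)


def dispatch_branch(thash: HardwareRedirector, target_ip: int,
                    profiled: List[int]) -> Tuple[str, int]:
    """Choose the execution mode for a branch target.

    On a hit, the translation is used and no profiling record is made.
    On a miss, the target is noted for profiling and handled by the HWD.
    """
    translation = thash.lookup(target_ip)
    if translation is not None:
        return ("translation", translation)
    profiled.append(target_ip)
    return ("hwd", target_ip)
```

In this sketch a miss is the only case that produces a profiling record, which mirrors the idea that a target with an existing translation does not need to be profiled.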
In any case, when the HWD first encounters a branch target address in the execution of a code portion, a record for the branch target address is added to the branch count table and an initial value is inserted into the count 226b. Alternatively, if a record already exists for the target address, the count 226b is incremented or decremented, as appropriate to the implementation. When a record is added to or updated in the branch count table, a record for that same branch target address is added to the branch history table 222.
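The add-or-update behavior described above can be sketched in software as follows. The sketch assumes a count-down counter that raises an event on reaching zero and is then re-armed; the description leaves the counting direction and the re-arming policy to the implementation, so both are assumptions here, as are the names `BranchCountTable` and `observe`.

```python
from typing import Dict, List


class BranchCountTable:
    """Toy model of a branch count table keyed by branch target address."""

    def __init__(self, initial_count: int = 4) -> None:
        if initial_count < 1:
            raise ValueError("initial_count must be at least 1")
        self.initial_count = initial_count
        self.counts: Dict[int, int] = {}

    def observe(self, target: int, history: List[int]) -> bool:
        """Record one encounter of `target`; return True if an event fires.

        Every add or update also appends a record for the same target to
        `history`, which stands in for the branch history table.
        """
        history.append(target)
        if target not in self.counts:
            # First encounter: add a record with the initial count.
            self.counts[target] = self.initial_count
            return False
        # Later encounters: count down (assumed direction).
        self.counts[target] -= 1
        if self.counts[target] == 0:
            # Assumed policy: re-arm the counter after raising the event.
            self.counts[target] = self.initial_count
            return True
        return False
```

With `initial_count=2`, the third and fifth encounters of the same target would raise an event under these assumptions.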
In one example, the branch history table is implemented as a circular buffer. Each record contains attributes of taken branches as indicated above. When the buffer is full, subsequent writes simply erase the oldest entry, and a top-of-stack pointer wraps around. Again, it should be noted that records may be updated and inserted in one or more of the three following cases: (1) where the branch target address is an exit from a translation; (2) where the branch target address is an entrance into a translation; and (3) where the branch jumps from a non-translated portion (HWD mode) to another non-translated portion. Restated, in a system that defines code portions by branches, it may be of interest to profile branches from HWD mode to translation mode, and vice versa, as well as branches between non-translated portions of code (i.e., where the pipeline remains in HWD mode for both the source and target code portions).
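The circular-buffer behavior can be sketched as follows. This is a software illustration under stated assumptions, not the hardware design: the class name, the `push` and `oldest_to_newest` methods, and the use of a Python list for storage are hypothetical.

```python
from typing import Any, List


class BranchHistoryTable:
    """Fixed-capacity circular buffer; a full buffer overwrites its
    oldest entry and the write pointer wraps around."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots: List[Any] = [None] * capacity
        self._head = 0   # index of the next slot to write
        self._size = 0   # number of valid records currently held

    def push(self, record: Any) -> None:
        # Writing at the head overwrites the oldest record once full.
        self._slots[self._head] = record
        self._head = (self._head + 1) % len(self._slots)
        self._size = min(self._size + 1, len(self._slots))

    def oldest_to_newest(self) -> List[Any]:
        # Walk the valid records in temporal order.
        cap = len(self._slots)
        start = (self._head - self._size) % cap
        return [self._slots[(start + i) % cap] for i in range(self._size)]
```

Keeping records in temporal order matters for later processing, since adjacent records are what reveal control flow from one code portion to the next.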
Referring again to
Continuing with
In the depicted example, the summarizer includes a foreground summarizer 240a which may be implemented as a lightweight event handler that is triggered when a record in the branch count table produces an event (e.g., the counter for the record saturates). In other words, branch count table 220 produces an event, and summarizer 240a handles the event. The counts maintained in the branch count table for a target address are used to control how many times the associated code portion will be encountered before an event is taken for that code portion. As described in more detail below, one of the summarizers may control the counter values for the records of the branch count table.
In the depicted example, foreground summarizer 240a handles the event by sampling one or more records 228 from the branch history table and placing information about those records in queue 242 for subsequent processing by background summarizer 240b. For example, if the branch count table triggers an event when an overflow occurs for a branch target, the foreground summarizer may then sample the corresponding record for that branch target that was added to the branch history table. In some cases, the foreground summarizer will also sample one or more adjacent entries in the branch history table. For example, the foreground summarizer may sample the immediately prior record in order to identify the code portion that branched into the portion beginning with the respective branch target address. The foreground summarizer may also sample the subsequent entry to identify control flow out of the portion.
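A minimal sketch of the foreground sampling step is given below. It assumes the branch history table is available as a list in temporal order and that the index of the triggering record is known; the function name `foreground_sample` and the use of `collections.deque` for the queue are illustrative assumptions.

```python
from collections import deque
from typing import Any, Deque, List


def foreground_sample(history: List[Any], trigger_index: int,
                      queue: Deque[Any]) -> int:
    """Copy the triggering record and its immediate neighbors into
    `queue` for later background processing.

    Returns the number of records queued. Neighbors that do not exist
    (at either end of the history) are simply skipped.
    """
    if not 0 <= trigger_index < len(history):
        raise IndexError("trigger_index is outside the history")
    lo = max(trigger_index - 1, 0)                  # prior record, if any
    hi = min(trigger_index + 1, len(history) - 1)   # next record, if any
    for i in range(lo, hi + 1):
        queue.append(history[i])
    return hi - lo + 1
```

The handler does no analysis of its own; it only copies a small window of records, in keeping with the goal of a lightweight foreground path.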
In the depicted example, queue 242 contains records 244, and as shown in the example record 244 of
To facilitate lightweight operation of the foreground sampling, the branch count table and branch history table typically will be implemented to allow fast reading and fast recognition of triggers from the branch count table. In one example, a streaming 64-bit read capability is provided to allow the foreground summarizer to quickly obtain the necessary information about branch history and queue it for subsequent processing, e.g., by the background summarizer. Generally, it will be desirable that the foreground summarizer be implemented so as to obtain the desired information from the branch history table while minimally impeding forward architectural progress.
Background summarizer 240b is implemented as a background processing thread that processes records 244 of queue 242. In some cases, it will be desirable to run the background summarizer on another core, for example core 104 of
In some cases it will be desirable to allocate a portion of system memory as secure and private, so that it is invisible to the user/ISA. Various data and software may run and be stored in the secure memory allocation. In some embodiments, for example, one or more of summarizers 240a and 240b, queue 242, MBHT 260, region former 270, translator 272 and trace cache 280 reside in and/or run from a private/secure portion of memory.
Each node in the control flow graph corresponds to and contains information about a portion of non-native ISA code starting at a branch target address specified in one of the records that are processed from queue 242 (
Each node may have multiple inbound edges 604 to its entry point. An edge into Node A is a result of two adjacent queued records processed from the BHT. Referring to the
Each node may also have a number of exit points defined by the occurrence of outbound edges 606. As with inbound edges, outbound edges occur as a result of processing temporally adjacent records from the BHT. For example, Node A shows an outbound edge to Node D. Node D is not explicitly shown in the figure, but the node is present as a result of processing a BHT record for D's branch target address. Similarly, the figure implies but does not show Nodes E and F.
Continuing with the A-to-D edge, the outbound edge is added to Node A or updated when a processed BHT record for D immediately follows a processed BHT record for A, reflecting a taken branch from the portion of code associated with Node A to the portion of code associated with Node D. Similarly, the outbound edge to E occurs or is updated as a result of adjacent records for A and E, reflecting control flow from the portion of code starting with A's target address to the portion of code starting with E's target address.
As shown in the example of Node A, each node in the control flow graph may be given a score 610. One component of the score is the number of times the node is encountered. As more records relating to Node A are processed, the score may be increased. This generally correlates with increased use of the HWD in connection with Node A's code portion, thus increasing the potential value of having the code portion be covered by a new native translation to shift processing away from the hardware decoder mode. In general, a higher score reflects the code portion being prioritized relatively higher for inclusion in a new native translation to be formed.
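The way nodes, edges, and scores arise from temporally adjacent records can be sketched as follows. The sketch reduces each record to its branch target and uses the encounter count as the only score component; the function name `summarize` and the dictionary-based graph are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def summarize(targets: Iterable[Hashable]) -> Tuple[
        Dict[Hashable, int], Dict[Tuple[Hashable, Hashable], int]]:
    """Build a simple control flow summary from branch targets given
    in temporal order.

    Returns (score, edges): `score[n]` counts encounters of node n, and
    `edges[(a, b)]` counts how often a record for b immediately
    followed a record for a.
    """
    score: Dict[Hashable, int] = defaultdict(int)
    edges: Dict[Tuple[Hashable, Hashable], int] = defaultdict(int)
    prev = None
    for target in targets:
        score[target] += 1                 # one more encounter of this node
        if prev is not None:
            edges[(prev, target)] += 1     # adjacent records form an edge
        prev = target
    return dict(score), dict(edges)
```

For the record sequence A, D, A, E, A, D, Node A would receive the highest score and the A-to-D edge would be counted twice.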
The score and prioritization for a node may also be based on the type of branch by which the node was entered. Referring back to
Branch type may play into the scoring heuristic in various ways. Tracking of calls and returns can facilitate tracking of nesting during the summarization process. This may help in avoiding cluttering the MBHT during the forming of or other processing relating to a native translation. In some cases, return targets may be scored/prioritized lower, even though they are indirect branches. In some cases this is because the native target typically is available on the hardware return stack and won't require a reference to hardware redirector 282. Call branches may correspond to frequently-called subroutines, and for this or other reasons it may be desirable to score nodes entered via calls higher.
Identifying whether or not the branch is a transition may be used to suppress creation of an edge in the MBHT. If the control flow is an exit from a native translation into the hardware decoder mode, there may not be any direct path from the previous BHT entry into the current entry. It might thus not be desirable, for purposes of profiling control flow between code portions processed by the HWD, to create an edge between nodes where an interposed native translation executes prior to entry into the second node.
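Edge suppression on transitions can be sketched as below. The record layout is hypothetical: each record carries a target and a flag, here called `is_transition`, that is assumed to mark entry into hardware decoder mode from a native translation.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Tuple


class Record(NamedTuple):
    target: str
    is_transition: bool  # assumed flag: entered via an exit from a translation


def build_edges(records: Iterable[Record]) -> Dict[Tuple[str, str], int]:
    """Count edges between adjacent records, skipping any edge whose
    destination was entered by way of a native translation."""
    edges: Dict[Tuple[str, str], int] = {}
    prev: Optional[Record] = None
    for rec in records:
        if prev is not None and not rec.is_transition:
            key = (prev.target, rec.target)
            edges[key] = edges.get(key, 0) + 1
        prev = rec
    return edges
```

Given records for A, then B (a transition), then C, only the B-to-C edge would be created, because a translation ran between A and B.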
As seen in
In one example, the sequentially next instruction information includes the full address of the next instruction after the branch instruction. In another example, the information includes only a portion of the next instruction address. Specifically, the least significant bits of the address of the next instruction may be included. These may then be combined with the high bits of the previous branch target to deduce the full address of the sequentially next instruction following the branch instruction. Using fewer bits in this manner may reduce the footprint of the BHT records and allow for faster queuing and processing of BHT records.
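The address reconstruction can be sketched as follows. The sketch assumes that the sequentially next instruction lies after the previous branch target and within one `2**n_bits` window of it, so that at most one carry into the high bits is needed; that carry handling, the function name, and the parameter names are assumptions rather than details given in the description.

```python
def reconstruct_fallthrough(prev_target: int, low_bits: int, n_bits: int) -> int:
    """Rebuild the full next-instruction address from its stored low bits.

    The high bits are borrowed from the previous branch target. If the
    result would not lie after that target, the low bits are assumed to
    have wrapped, and one carry is applied to the high bits.
    """
    mask = (1 << n_bits) - 1
    candidate = (prev_target & ~mask) | (low_bits & mask)
    if candidate <= prev_target:
        candidate += 1 << n_bits  # assumed single wrap of the low bits
    return candidate
```

For example, with 8 stored bits, a previous target of `0x401FF0` and stored low bits `0x04` would yield `0x402004` under these assumptions.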
Knowledge of the fall-through path may be used in various ways. One use of the fall-through path is simply to have the path be clearly known so that the region former can retrieve instructions for the path that will be translated. Another use is to calculate the offset of each edge in the control flow graph. In particular, Node A in
As shown in the A-to-D edge, each outbound edge from a node may include a weight 620. Weight may be based on the number of times the associated branch is encountered and/or the number of times it is taken, or any other suitable metric. These edge weights provide a representation of control flow out of one code portion and into another, and the individual weighting may be used by the region former to form new native translations, and more specifically to decide what code paths to translate.
The previously-discussed ordering of edges facilitates determining edge weights by allowing a counting of the number of times an edge is encountered but not taken. For example, if the processed BHT records include an A-to-E edge, that would mean that the branch instruction associated with the A-to-D edge was encountered but not taken. If the processed records include an A-to-D edge, that would mean that the branch was encountered and taken. If the A-to-D edge is encountered and taken more frequently than the A-to-E edge, the A-to-D edge may be weighted more heavily than the A-to-E edge. The region former may use this information, for example, to preferentially form a translation starting with Node A that flows through to Node D, as opposed to a translation that flows from Node A to Node E.
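The counting of encountered and taken branches can be sketched as below. It assumes each node keeps its outbound edges in the order their branch instructions occur in the code portion, so that an exit through one edge implies every earlier edge was encountered but not taken. The function name and the `(encountered, taken)` tuple layout are illustrative.

```python
from typing import Dict, List, Tuple


def update_edge_counts(ordered_edges: List[str], taken_edge: str,
                       stats: Dict[str, Tuple[int, int]]) -> None:
    """Update (encountered, taken) counts after one exit from a node.

    `ordered_edges` lists the node's outbound edges in branch order.
    Edges before `taken_edge` are counted as encountered but not taken;
    `taken_edge` is counted as encountered and taken; later edges were
    never reached and are left unchanged.
    """
    if taken_edge not in ordered_edges:
        raise ValueError("taken_edge is not an outbound edge of this node")
    for edge in ordered_edges:
        encountered, taken = stats.get(edge, (0, 0))
        if edge == taken_edge:
            stats[edge] = (encountered + 1, taken + 1)
            return
        stats[edge] = (encountered + 1, taken)
```

After exits through A-to-D, A-to-D, and A-to-E, the A-to-D branch would show three encounters and two takens, while A-to-E would show one of each.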
In another example, assume that the A-to-D edge is encountered very frequently relative to other outbound edges, but is taken only half the time. In such a circumstance, the region former may operate to create a translation covering the taken path to Node D and the fall-through path.
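One possible path-selection policy is sketched below. The threshold `bias` and the three-way rule are purely hypothetical; the description says only that the region former may cover both the taken path and the fall-through path when a frequently encountered branch is taken about half the time.

```python
from typing import List


def paths_to_translate(encountered: int, taken: int,
                       bias: float = 0.8) -> List[str]:
    """Pick which paths out of a branch to include in a translation.

    A strongly biased branch contributes only its dominant path; a
    branch without a strong bias contributes both paths.
    """
    if encountered == 0:
        return []  # no profile data for this branch
    ratio = taken / encountered
    if ratio >= bias:
        return ["taken"]
    if ratio <= 1 - bias:
        return ["fall-through"]
    return ["taken", "fall-through"]
```

A branch encountered 100 times and taken 50 times would have both paths selected under this policy.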
Referring again to
Referring to
At 702, the method includes using an HWD to execute code portions of a non-native ISA. The code portions may be defined and identified by branches (taken or not taken) as in the above examples or using any other suitable characteristic or definition. The goal in general is to characterize code portions executed in hardware decoder mode in order to identify optimal code portions for translation. In the depicted method, at 704, a code portion profile is stored in hardware, and is dynamically updated in response to and based on the use of the hardware decoder at step 702. At 706, the method then includes forming new native translations based on the code portion profile.
The code portion profile of step 704 may include a plurality of records, as described above. Each record may be associated with a code portion being executed using an HWD. These records may then be sampled and processed to generate a summarized representation of how those portions are being executed with the hardware decoder, and how program control flow links those code portions. The summarized representation may be generated using summarizing software, as in the above examples, and may take the form of a control flow graph, such as the graph described in connection with
It will be appreciated that methods described herein are provided for illustrative purposes only and are not intended to be limiting. Accordingly, it will be appreciated that in some embodiments the methods described herein may include additional or alternative processes, while in some embodiments, the methods described herein may include some processes that may be reordered, performed in parallel or omitted without departing from the scope of the present disclosure. Further, it will be appreciated that the methods described herein may be performed using any suitable software and hardware including the specific examples described herein.
This written description uses examples to disclose the invention, including the best mode, and also to enable a person of ordinary skill in the relevant art to practice the invention, including making and using any devices or systems and performing any incorporated methods. The patentable scope of the invention is defined by the claims, and may include other examples as understood by those of ordinary skill in the art. Such other examples are intended to be within the scope of the claims.