Architectural-level instructions for microprocessors may be translated between an instruction set architecture (ISA) and a native architecture. In some microprocessors, software optimizations of the ISA instructions may execute comparatively more efficiently than the ISA instructions upon which those software optimizations were based. Some past approaches chained software optimizations to pass control from one software optimization to another. However, such approaches may be challenged by indirectly-branched processes because it may be difficult to determine the target of an indirect branch.
In modern microprocessors, architectural-level instructions may be translated between a source instruction set architecture (ISA), such as an advanced RISC machine (ARM) architecture or an x86 architecture, and an alternate ISA that achieves the same observable functionality as the source. For example, a set of one or more instructions of a source ISA may be translated into one or more micro-operations of a native architecture that perform the same function as the source ISA instruction. In some settings, the native micro-operation(s) may provide enhanced or optimized performance relative to the source ISA instruction.
Some past approaches attempted to chain software optimizations of source instructions so that control passed from one software optimization to another software optimization via direct native branches. However, such approaches may be challenged by branched processes. Because the branch target may be dynamic during program execution, chain-wise handoff between software optimizations may not be feasible. For example, should an indirect branch occur, the indeterminate target of the branch may make it difficult to ascertain, at the time a software optimization is created, which software optimization should be retrieved next. Consequently, the microprocessor may stall while the branch and the software optimization for that branch are determined from potentially thousands of candidate optimizations.
Accordingly, various embodiments are disclosed herein that are related to fetching source information and alternate versions of the source information that achieve the same observable functionality (referred to herein as the same functionality) as the source information within an acceptable tolerance (e.g., within an acceptable tolerance of architecturally observable effect). It will be appreciated that virtually any suitable source information and any alternate version thereof may be employed without departing from the scope of the present disclosure. In some embodiments, the source information may include an instruction, such as an instruction of an ISA. In addition to or instead of instructions, the source information may include source data, and the alternate version may include an alternative form or version of the source data. Likewise, it will be appreciated that any suitable manner of transforming a source into an alternate version thereof (e.g., a software approach and/or a hardware approach) is contemplated as being within the scope of the present disclosure. For illustrative purposes, the descriptions and figures presented herein refer to source instructions and to translations of those source instructions as, respectively, the source information and the alternate versions of the source information, though such embodiments are not limiting.
One example method includes, upon being directed to retrieve an instruction, hashing an address for that instruction to determine whether there exists an alternate version of the instruction that achieves the same functionality, such as a native translation (e.g., a translation between a source instruction set architecture and a native micro-operation set architecture for the various instructions that may be fetched for execution by the microprocessor). The example method further includes, if the hashing results in a determination that such an alternate version exists, aborting retrieval of the instruction and instead retrieving and executing the alternate version.
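For purely illustrative purposes, the decision flow of this example method may be sketched in software as follows. The type names, helper names, and the in-memory directory standing in for the translation-lookup hardware are assumptions made for the sketch and do not represent any particular embodiment.

```cpp
#include <cstdint>
#include <iostream>
#include <optional>
#include <unordered_map>

// Illustrative software model only; all names and structures here are
// assumptions, not an actual hardware interface.
using LinearAddr   = std::uint64_t;
using PhysicalAddr = std::uint64_t;

// Stand-in for the structure that records where native translations live.
std::unordered_map<LinearAddr, PhysicalAddr> translation_directory;

std::optional<PhysicalAddr> lookup_translation(LinearAddr ip) {
    auto it = translation_directory.find(ip);
    if (it == translation_directory.end()) return std::nullopt;
    return it->second;
}

void on_redirect_to(LinearAddr target_ip) {
    // The lookup proceeds while the ordinary fetch of the source instruction
    // would still be in flight.
    if (auto native = lookup_translation(target_ip)) {
        std::cout << "abort fetch; execute native translation at 0x"
                  << std::hex << *native << "\n";
    } else {
        std::cout << "no translation; fetch, decode, and execute the source instruction\n";
    }
}

int main() {
    translation_directory[0x1000] = 0x80002000;  // pretend a translation exists
    on_redirect_to(0x1000);  // redirected to a translated region
    on_redirect_to(0x2000);  // redirected to an untranslated region
}
```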
The discussion herein will frequently refer to “retrieving” an instruction and then aborting that retrieval if certain conditions exist. In some embodiments, “retrieving” an instruction may include fetching the instruction. Further, when such aborting occurs, the retrieval process is terminated, typically prior to completion of the retrieval process. For example, in one scenario, retrieval may be aborted while the physical address for an instruction is being retrieved. In another scenario, retrieval may be aborted after the physical address for an instruction is retrieved but before the instruction is retrieved from memory. Aborting retrieval prior to completion of the retrieval process may save the time spent accessing and retrieving the source from memory. It will be appreciated that, as used herein, retrieval is not limited to fetch scenarios, where fetch is typically completed prior to decode. For example, retrieval of an instruction may be aborted before decode, during decode, or at any other suitable point.
A wide range of possibilities exists for mapping and translating between source information and translated versions of that information. By determining whether an alternate version exists for an instruction (for example, an ISA instruction) and aborting retrieval of that instruction if the alternate version does exist, the microprocessor may offer enhanced performance relative to microprocessors that decode source ISA instructions, because decode operations are avoided. Additional performance enhancement may be realized in settings where the alternate version is optimized, that is, where changes to the operations allow the alternate version to proceed through execution more quickly than the source ISA instruction.
A memory controller 110H may be used to handle the protocol and provide the signal interface required by main memory 110D, and to schedule memory accesses. Memory controller 110H can be implemented on the processor die or on a separate die. It is to be understood that the memory hierarchy provided above is non-limiting and that other memory hierarchies may be used without departing from the scope of this disclosure.
Microprocessor 100 also includes a pipeline, illustrated in simplified form in
As shown in
Instruction translation lookaside buffer 122 may perform virtually any suitable manner of translating linear addresses into physical addresses for those instructions. For example, in some embodiments, instruction translation lookaside buffer 122 may include content-addressable memory that stores a portion of a page table that maps linear addresses for instructions to physical addresses for those instructions.
Fetch logic 120 also determines whether a native translation for the selected instruction exists. If such a native translation exists, the system aborts the instruction fetch and sends the native translation for execution instead. In the embodiment depicted in
Almost any suitable data storage architecture and logic may be used for translation address cache 124. For example,
Continuing with
The embodiment depicted in
Pipeline 102 may also include mem logic 138 for performing load and/or store operations and writeback logic 140 for writing the result of operations to an appropriate location such as register 109. Upon writeback, the microprocessor enters a state modified by the instruction or instructions, so that the result of the operations leading to the committed state may not be undone.
It should be understood that the above stages shown in pipeline 102 are illustrative of a typical RISC implementation and are not meant to be limiting. For example, in some embodiments, VLIW techniques may be implemented upstream of certain pipelined stages. In some other embodiments, the scheduling logic may be included in the fetch logic and/or the decode logic of the microprocessor. More generally, a microprocessor may include fetch, decode, and execution logic, with mem and writeback functionality being carried out by the execution logic. The present disclosure is equally applicable to these and other microprocessor implementations.
In the described examples, instructions may be fetched and executed one at a time or more than one at a time, possibly requiring multiple clock cycles. During this time, significant parts of the data path may be unused. In addition to or instead of single instruction fetching, pre-fetch methods may be used to improve performance and avoid latency bottlenecks associated with read and store operations (i.e., the reading of instructions and loading such instructions into processor registers and/or execution queues). Accordingly, it will be appreciated that virtually any suitable manner of fetching, scheduling, and dispatching instructions may be used without departing from the scope of the present disclosure.
Turning to
In some embodiments, fetching the selected instruction may include fetching a physical address for the selected instruction from an instruction translation lookaside buffer. In such embodiments, a linear address for the selected instruction may be received upon direction to the target instruction pointer. In turn, the linear address may be translated into a physical address for the selected instruction by the instruction translation lookaside buffer by searching, with reference to the linear address, physical addresses stored in the instruction translation lookaside buffer. If the search does not hit upon the physical address for the selected instruction, the physical address may be determined via a page walk or via a lookup in a higher-level translation lookaside buffer. Regardless of how the physical address is determined, once the physical address for the selected instruction is determined, it is provided to an instruction cache so that the selected instruction may be obtained.
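As a rough, non-limiting software model of that translation path, the following sketch caches a portion of a page table and falls back to a page walk on a miss. The 4 KB page size, the structure names, and the trivial page-walk stand-in are assumptions made for the example.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_map>

// Illustrative model of linear-to-physical translation on the instruction
// fetch path; real page tables and higher-level TLBs are not modeled.
constexpr std::uint64_t kPageSize = 4096;

struct InstructionTLB {
    // Cached portion of the page table: linear page number -> physical page number.
    std::unordered_map<std::uint64_t, std::uint64_t> entries;

    std::optional<std::uint64_t> translate(std::uint64_t linear) const {
        auto it = entries.find(linear / kPageSize);
        if (it == entries.end()) return std::nullopt;            // ITLB miss
        return it->second * kPageSize + (linear % kPageSize);    // hit: keep the page offset
    }
};

// Stand-in for a page walk or a higher-level TLB lookup performed on a miss.
std::uint64_t resolve_by_page_walk(std::uint64_t linear) {
    return linear;   // placeholder mapping so the sketch stays self-contained
}

// Physical address handed to the instruction cache so the instruction can be obtained.
std::uint64_t physical_address_for_fetch(const InstructionTLB& itlb, std::uint64_t linear) {
    if (auto pa = itlb.translate(linear)) return *pa;
    return resolve_by_page_walk(linear);
}

int main() {
    InstructionTLB itlb;
    itlb.entries[0x7f000] = 0x00042;   // pretend this page is mapped
    return physical_address_for_fetch(itlb, 0x7f000123) == 0x42123 ? 0 : 1;
}
```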
At 304, method 300 comprises hashing the linear address for the selected instruction to generate a hash index from the linear address while the physical address for the selected instruction is being obtained. The hash index may then be used when determining whether a native translation for the selected instruction exists, as described in more detail below.
For example, direction to the target instruction pointer may cause the linear address to be hashed concurrently (within a suitable tolerance) with distribution of the linear address to an instruction translation lookaside buffer. However, it will be appreciated that any suitable manner of performing the hash may be employed at any suitable position within the process flow without departing from the scope of the present disclosure.
In some embodiments, the linear address may be hashed by a suitable hardware structure included in the microprocessor. For example, the linear address may be hashed by the fetch logic and/or the native translation address cache, though virtually any suitable hardware structure may be used to hash the linear address without departing from the scope of the present disclosure.
A wide variety of hash techniques may be employed. For example, in some embodiments, the hash index may be generated using an XOR hash function. A hash index can also be generated by hashing a plurality of portions of the linear address. In some other embodiments, a hash index may be generated by using a single portion of the linear address.
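One hedged illustration of such an XOR hash, folding several portions of a 64-bit linear address into a small index, is sketched below. The index width and the particular address slices are assumptions chosen for the example rather than values taken from any specific embodiment.

```cpp
#include <cstdint>
#include <iostream>

// Illustrative XOR hash of a linear address into a table index.
constexpr unsigned kIndexBits = 10;                        // 1024-entry cache assumed
constexpr std::uint64_t kIndexMask = (1u << kIndexBits) - 1;

std::uint32_t hash_index(std::uint64_t linear_addr) {
    // XOR together several kIndexBits-wide portions of the address so that
    // higher-order bits also influence the index.
    std::uint64_t h = linear_addr;
    h ^= linear_addr >> kIndexBits;
    h ^= linear_addr >> (2 * kIndexBits);
    h ^= linear_addr >> (3 * kIndexBits);
    return static_cast<std::uint32_t>(h & kIndexMask);
}

int main() {
    std::cout << "index for 0x7f00001230: " << hash_index(0x7f00001230ULL) << "\n";
}
```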
In some embodiments, a disambiguation tag may be generated when the linear address is hashed. The disambiguation tag may be used to discriminate various translation address entries for alternate versions (for example, address entries for native translations of instructions) from one another when more than one translation address entry in the translation address cache has the same index value. Thus, in some embodiments, the disambiguation tag may be used to disambiguate a plurality of translation address entries having identical translation address indices stored in the translation address cache. For example,
While the discussion above relates to hashing a linear address to obtain one or more translation address entries from a translation address cache, so that the translation address entries are indexed according to linear addresses, it will be appreciated that the translation address cache may be indexed according to any suitable address. For example, in some embodiments, a suitably-configured translation address cache may be indexed according to physical addresses. Indexing a translation address cache according to physical addresses may save space within the translation address cache when two processes map to a shared library at different linear addresses. In some of such scenarios, only one version of the shared library may be physically loaded into memory. By indexing according to a physical address, a shared mapping may lead to a single entry being obtained, while an unshared mapping may lead to different entries being obtained.
Turning to
Regardless of when the validity determination is performed, if it is determined that a valid native translation exists, fetching the source instruction may be aborted, by aborting retrieval of the physical address for the source instruction, for example. In turn, processing efficiency may be enhanced by avoiding decode steps and by permitting use of the alternate version.
In the embodiment shown in
A translation address entry stores a physical address where a native translation is stored. Translation address entries may be looked up according to a translation address index associated therewith. For example, a hash index generated when hashing an address may be used to look up a particular translation address index in a translation address cache.
In some embodiments, more than one translation address entry may be obtained via lookup of a particular translation address index. For example, a hashed address used to look up a translation address index for a 4-way associative cache may result in the retrieval of up to four translation address entries. In such embodiments, each translation address entry has a respective translation address disambiguation tag that disambiguates that entry from other entries having identical translation address indices. Comparing the disambiguation tag generated by hashing the address with disambiguation tags retrieved with respective translation address entries may determine whether any of the entries obtained represents a physical address for a valid native translation. In some embodiments, comparison of the disambiguation tags may include a comparison of a valid bit. In such embodiments, agreement between tags being compared may be found only if the valid bit is set to a preselected value, such as a value of 1.
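A minimal software sketch of such a lookup, assuming a 4-way set-associative organization with a valid bit and a disambiguation tag per entry, is shown below. The set count, field widths, and structure names are illustrative assumptions.

```cpp
#include <array>
#include <cstdint>
#include <optional>

// Illustrative model of a 4-way set-associative translation address cache;
// only the index/tag/valid-bit lookup pattern follows the description above.
constexpr unsigned kSets = 1024;
constexpr unsigned kWays = 4;

struct TranslationAddressEntry {
    bool          valid = false;       // entry holds a usable translation address
    std::uint32_t tag   = 0;           // disambiguation tag for this entry
    std::uint64_t translation_pa = 0;  // physical address of the native translation
};

struct TranslationAddressCache {
    std::array<std::array<TranslationAddressEntry, kWays>, kSets> sets{};

    // Look up all ways at the hashed index and compare disambiguation tags.
    std::optional<std::uint64_t> lookup(std::uint32_t index, std::uint32_t tag) const {
        for (const auto& entry : sets[index % kSets]) {
            // Agreement is found only when the valid bit is set and the stored
            // tag matches the tag generated while hashing the address.
            if (entry.valid && entry.tag == tag) return entry.translation_pa;
        }
        return std::nullopt;   // no valid native translation recorded here
    }
};

int main() {
    TranslationAddressCache cache;
    cache.sets[37][0] = TranslationAddressEntry{true, 0xBEEF, 0x80002000};  // pretend a translation was installed
    auto hit = cache.lookup(37, 0xBEEF);   // tag match yields the translation's physical address
    return hit ? 0 : 1;
}
```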
In some embodiments, a translation address entry may include bits representative of the physical address for a native translation and bits representative of an assumed context for the native translation. Additionally, in some embodiments, a translation address entry may include one or more other bits related to the translation and/or aspects of the translation.
Continuing with
As introduced above, in some embodiments, a translation address entry may include an assumed context for the native translation. As used herein, a current context describes a current working state of the microprocessor and an assumed context describes a state of the microprocessor for which the native translation is valid. Thus, in some embodiments, even if a valid disambiguation tag for an entry is identified, the entry associated with that disambiguation tag may not include a valid native translation for the current context. In some examples, issuing a native translation for which the current context and assumed context do not agree may cause an execution error or hazard.
It will be appreciated that the context may be included in any suitable part of the translation address entry and/or the translation address. In the example shown in
Additionally or alternatively, in some embodiments, bits for the assumed context may be included in the translation address, such as in the disambiguation tag and/or the hash. In such embodiments, inclusion of the assumed context in one or more parts of the address may allow concurrent storage of two or more entries with different contexts and otherwise identical linear addresses within the translation address cache. It will be appreciated that implementation of such embodiments may depend upon application-specific considerations. For example, in some embodiments where set associativity is low, such as in a scenario where the addresses are directly mapped, including the assumed context in the hash may avoid a conflict miss. For example, the assumed context may be XOR'ed into the hash during hashing. In some other embodiments, such as those where the cycle time spent hashing additional bits affects processing time more than the time needed to process a comparatively wider disambiguation tag, the assumed context may be added to the disambiguation tag to avoid potential processing delays. As an example, the assumed context may be appended to the disambiguation tag. In still other embodiments, the assumed context may be included in both the hash and the disambiguation tag.
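The two placements of the assumed context described above may be sketched, under assumed bit widths, roughly as follows; neither form is taken from any particular embodiment.

```cpp
#include <cstdint>

// Illustrative placement of assumed-context bits, either folded into the hash
// index (spreading otherwise-identical addresses across sets) or appended to
// the disambiguation tag (keeping the hash narrow). Bit widths are assumptions.
constexpr unsigned kIndexBits = 10;
constexpr std::uint64_t kIndexMask = (1u << kIndexBits) - 1;

// Option 1: XOR the assumed context into the hash during hashing.
std::uint32_t hash_with_context(std::uint64_t linear_addr, std::uint32_t context) {
    std::uint64_t h = linear_addr ^ (linear_addr >> kIndexBits) ^ context;
    return static_cast<std::uint32_t>(h & kIndexMask);
}

// Option 2: append the assumed context to the disambiguation tag instead.
std::uint64_t tag_with_context(std::uint32_t base_tag, std::uint32_t context) {
    return (static_cast<std::uint64_t>(context) << 32) | base_tag;
}

int main() {
    // With the context in the index, two contexts at the same linear address
    // generally land at different indices; with the wider tag, the tag tells them apart.
    return hash_with_context(0x7f000123, 1) != hash_with_context(0x7f000123, 2) ? 0 : 1;
}
```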
Once it is determined that a valid native translation exists, method 300 comprises, at 320, aborting fetching the instruction. When aborting occurs, the fetch process is terminated. While the termination may occur after fetch of the instruction, in some embodiments the termination may occur prior to completion of the fetch process. For example, in embodiments where fetching the instruction includes retrieving the physical address for an instruction from an instruction translation lookaside buffer, aborting fetching the instruction may include aborting retrieving the physical address from the instruction translation lookaside buffer.
At 322, method 300 includes sending the physical address for the native translation to the instruction cache, and, at 324, receiving the selected native translation from the instruction cache. In some embodiments, once the selected native translation is received from the instruction cache, it may be forwarded to a native translation buffer in preparation for eventual distribution to scheduling logic where it is to be scheduled for execution.
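As a loose software model of this hit path, the following sketch sends the translation's physical address to an instruction-cache stand-in and stages the returned micro-operations in a buffer for eventual distribution to the scheduling logic. The micro-operation type and structure names are assumptions made for the example.

```cpp
#include <cstdint>
#include <deque>
#include <vector>

// Illustrative hit path once a valid native translation address is found.
struct MicroOp { std::uint32_t encoding; };

struct InstructionCache {
    // Stand-in for an instruction cache read at a physical address.
    std::vector<MicroOp> read_translation(std::uint64_t /*physical_addr*/) {
        return {MicroOp{0x01}, MicroOp{0x02}};   // fabricated micro-ops for illustration
    }
};

struct NativeTranslationBuffer {
    std::deque<MicroOp> pending;   // awaiting distribution to scheduling logic
    void stage(const std::vector<MicroOp>& ops) {
        pending.insert(pending.end(), ops.begin(), ops.end());
    }
};

void dispatch_native_translation(InstructionCache& icache,
                                 NativeTranslationBuffer& buffer,
                                 std::uint64_t translation_pa) {
    // Send the address (322), receive the translation (324), and stage it.
    buffer.stage(icache.read_translation(translation_pa));
}

int main() {
    InstructionCache icache;
    NativeTranslationBuffer buffer;
    dispatch_native_translation(icache, buffer, 0x80002000);
    return buffer.pending.size() == 2 ? 0 : 1;
}
```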
Alternatively, in the embodiment shown in
Consequently, by determining the existence of alternate versions of the source material (in the examples described above, native translations that provide the same functionality as the source instructions) while fetching the source material, the methods described herein may offer enhanced processing relative to processing based on the source material alone. Further, by utilizing hardware structures to perform the concurrent determination, the methods described herein may be more efficient than software optimization-based schemes, particularly in branched processing scenarios.
This written description uses examples to disclose the invention, including the best mode, and also to enable a person of ordinary skill in the relevant art to practice the invention, including making and using any devices or systems and performing any incorporated methods. The patentable scope of the invention is defined by the claims, and may include other examples as understood by those of ordinary skill in the art. Such other examples are intended to be within the scope of the claims.