Modern microprocessors may be implemented so that ISA instructions and their constituent components (e.g., micro-operations) are organized into transactions. The transactions have multiple sub-components that are executed by the processor. The individual transactions as a whole appear atomic and indivisible even if the sub-components are executed independently internally. Prior to commitment of a transaction, the sub-component operations can speculatively affect the cache subsystem (e.g., via a speculative store). Regardless of how transactional memory is handled, a multi-threaded processor increases the complexity of maintaining the coherency of the data cache because cache locations typically are shared by the processing threads. These processors must ensure that speculative data from one thread is not visible to another thread. Alternatively, if speculative data from one thread has become visible to another, a rollback on one thread requires a coordinated rollback of any other thread that has observed the speculative data.
The present disclosure is directed to a multithreaded micro-processing system in which the coherency and use of a data cache is maintained and controlled via state information associated with individual threads that read and write to locations in the data cache while carrying out transactions. Each location in the data cache (e.g., a cache line) has an associated global state. This global state specifies the coherency status of the cache location relative to a corresponding location in another data cache and/or relative to a shared memory resource backing the two caches, e.g., an L2 cache backing two different L1 caches on separate processing cores.
For example, in a MESI coherency regime, a cache location can be globally modified, exclusive, shared or invalid. In the modified state (M), the cache location is exclusive (not shared with a corresponding location in another cache) and dirty (contains data that is more recent and that is not replicated elsewhere in the memory system). Transitioning to the modified state requires invalidation of any shared cache locations on other commonly-backed cores. In the exclusive state (E), the cache location is not shared, but the data in the location is clean, i.e., duplicated in another location such as a higher level cache. As with the modified state, transitioning to exclusive involves invalidation of any shared locations on other commonly-backed cores. The shared state (S) reflects a situation in which one or more locations are valid, the locations contain the same data if there is more than one, and the data is clean, i.e., duplicated in another place in the memory system. In the invalid state (I), the cache location does not hold valid data.
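The properties of the four global states described above can be summarized in a short sketch. The names `Mesi`, `is_dirty`, `is_exclusive` and `requires_remote_invalidation` are hypothetical and do not appear in the disclosure; the invalidation rule encodes only the statement that a transition to modified or exclusive invalidates shared copies on other cores, and is an assumption about how a controller might test for it.

```python
from enum import Enum


class Mesi(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def is_dirty(state):
    # Only the modified state holds data not replicated elsewhere.
    return state is Mesi.MODIFIED


def is_exclusive(state):
    # Modified and exclusive lines are not shared with another cache.
    return state in (Mesi.MODIFIED, Mesi.EXCLUSIVE)


def requires_remote_invalidation(old, new):
    # Assumption: entering M or E from a state that was not already
    # exclusive means copies on other cores may exist and must be invalidated.
    return is_exclusive(new) and not is_exclusive(old)
```

Under these assumptions, a transition from shared to modified calls for remote invalidation, while a transition from exclusive to modified does not, because no other cache can hold a copy.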
In the present system, the global state of each location in the cache is augmented with separate thread state information maintained for processing threads that interact with the data cache and its cache locations (e.g., cache lines). This thread state information is specified for the cache location separately from and in addition to the global state. A cache controller or other control mechanism uses this state information to individually control whether the threads may read from or write to the cache location during execution of a transaction. The thread state information may further include information about speculative activity of the individual threads, such as reads and writes that have occurred during in-flight, uncommitted transactions. The control mechanism can therefore further use the thread state information to determine and control whether uncommitted transactions of threads relating to the cache location are to be rolled back. Although the examples herein contemplate a MESI global state, it should be understood that other mechanisms and methods may be used to provide a global state without departing from the scope of the present description. Control of permissions and rollback may be performed directly by the cache controller, or in combination with other mechanisms/logic on the processor core.
System 100 may include a microprocessor/processor core 102 that includes and/or may communicate with various memory/storage locations, which may include processor registers 104, an L1 cache 106, an L2 cache 108, an L3 cache 110, main memory 112 (e.g., one or more DRAM chips), secondary storage 114 (e.g., magnetic and/or optical storage units) and/or tertiary storage 116 (e.g., a tape farm). Some or all of these locations may be memory-mapped, though in some implementations the processor registers may be mapped differently than the other locations, or may be implemented such that they are not memory-mapped. As indicated by the dashed box, some of the memory/storage locations may be on core 102, while others are off the core. In some examples, the L1-L3 caches will all be on the die/core; in other examples only the L1 cache will be on the die/core with the other caches and memory/storage locations being off the die/core. Many different configurations are possible. It will be understood that the memory/storage components are listed above in increasing order of access time and capacity, though there are possible exceptions. Typically, some form of memory controller (not shown) will be used to handle the protocol and provide the signal interface required of main memory 112, and, typically, to schedule memory accesses. The memory controller can be implemented on the processor die or on a separate die. It is to be understood that the locations set forth above are non-limiting and that other memory/storage locations may be used without departing from the scope of this disclosure.
Microprocessor 102 may be configured to execute various instruction set architectures, which may be characterized as complex instruction sets (CISC architecture), reduced instruction sets (RISC architecture), and/or VLIW architectures. Furthermore, it is possible that a given instruction set may have characteristics associated with more than one of these regimes. In addition, some instruction sets that are thought of as CISC implementations may in fact be executed on microprocessor 102 in a RISC-like fashion. For example, the widely employed x86 architecture, though considered a CISC system, is often implemented in a manner that is more associated with a pipelined RISC implementation.
Instantiation of code as a series of processor-recognized instructions (i.e., ISA instructions) may entail compiling code of an operating system, application, driver, etc. to produce binary code that is executed by microprocessor 102. Binary code may be optimized for execution using various techniques, including VLIW-type techniques, dynamic binary translation using a software layer, etc. In some cases, software optimizations are employed so that the microprocessor can execute instructions in program order without the need for the complex hazard detection and avoidance/mitigation hardware that are present in many CISC and RISC execution pipelines.
Microprocessor 102 further includes a processing pipeline 120, which typically includes fetch logic 122, decode logic 124, execution logic 126, mem logic 128, and writeback logic 130. Fetch logic 122 retrieves instructions from one or more of the depicted memory/storage locations (but typically from either unified or dedicated L1 caches backed by L2-L3 caches and main memory).
Decode logic 124 decodes instructions, for example, by parsing opcodes, operands, and addressing modes. Upon being parsed, the instructions are then executed by execution logic 126. For operations that produce a primary result (e.g., as opposed to those that perform a branch to another location in the executing program), writeback logic 130 writes the result to an appropriate location, such as a processor register. In load/store architectures, mem logic 128 performs load and store operations, such as loading an operand from main memory into a processor register.
It should be understood that the above five stages are somewhat specific to, and included in, a typical RISC implementation. More generally, a microprocessor may include fetch, decode, and execution logic, with mem and writeback functionality being carried out by the execution logic. The present disclosure is equally applicable to these and other microprocessor implementations.
In the described examples, instructions may be fetched and executed one at a time, possibly requiring multiple clock cycles. During this time, significant parts of the data path may be unused. In addition to, or instead of, single instruction fetching, pre-fetch methods may be used to improve performance and avoid latency bottlenecks associated with read and store operations (i.e., the reading of instructions and loading such instructions into processor registers and/or execution queues). In addition, the exemplary microprocessor may be pipelined to exploit instruction level parallelism and better utilize the data path so that there are multiple instructions in different stages of execution at the same time. Still further, fetch logic 122, decode logic 124, execution logic 126, etc., typically are individually pipelined with multiple logic stages to improve performance.
As indicated above, microprocessor 102 is implemented with multiple processing threads. These processing threads can concurrently make use of pipeline 120 and processor registers 104, and can use execution mechanisms (e.g., execution logic 126) to perform transactions. The transactions include multiple operations (e.g., micro-operations), but are treated atomically in the examples herein in an all-or-none fashion. In other words, between transaction boundaries, transaction operations are speculative. Commitment of the transaction promotes the architectural state of processor registers, memory system, etc. to a committed state. Alternatively, a transaction may be rolled back for various reasons, in which case the system reverts to the committed state that existed just prior to initiation of the transaction.
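The all-or-none behavior described above can be illustrated with a minimal software model. The class `TransactionalState` and its methods are hypothetical; the sketch assumes architectural state can be represented as a dictionary of named values, which is a simplification of what processor hardware would do.

```python
class TransactionalState:
    """Toy model of committed versus speculative architectural state."""

    def __init__(self, initial):
        self.committed = dict(initial)    # state as of the last commit
        self.speculative = dict(initial)  # state seen inside a transaction

    def write(self, name, value):
        # Operations between transaction boundaries are speculative.
        self.speculative[name] = value

    def read(self, name):
        return self.speculative[name]

    def commit(self):
        # Promote the speculative state to the committed state.
        self.committed = dict(self.speculative)

    def rollback(self):
        # Revert to the committed state that existed before the transaction.
        self.speculative = dict(self.committed)
```

For example, a write followed by `rollback()` leaves the original value visible, whereas a write followed by `commit()` survives any later rollback.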
Data cache 204 is backed by a shared memory resource 220, which also backs another data cache 222, e.g., a cache on another core. Data caches 204, 222 and shared memory resource 220 may all contain copies of the same piece of data. One example among many is a situation where a cache line in data cache 204 corresponds to a cache line in data cache 222. A coherency regime is therefore needed to maintain consistency of data and identify the coherency status of locations in the data caches.
Cache controller 230 is operatively coupled with data cache 204, and maintains and enforces state information for each cache location 208 in the data cache. As will be described in more detail below, the cache controller maintains and controls, for each location in the cache, a global state for the cache location and thread state information for each processing thread interacting with the data cache. The global state specifies the coherency of the cache location relative to shared memory resource 220 and the other off-core data cache 222. The global state may identify, for example, whether a cache line is shared with a cache line on another core, whether the cache line is exclusive to a core, whether the data in the cache line matches a corresponding copy elsewhere in the memory hierarchy, etc. As indicated above, a MESI regime may be employed for the global state, but other coherency methods may be employed without departing from the spirit of the present disclosure.
The thread state information is specified in addition to, and is maintained separately from, the global state. In the examples herein, the thread state information may be used to individually control whether each of the threads can read from and write to the cache location. In addition to this permission control, the thread state information can indicate whether a thread has speculatively read from or written to the cache location. The microprocessor can then use this information about speculative thread activity to control whether uncommitted transactions are to be rolled back. For example, assume the cache control policies specify that no thread can observe (read) data that has been speculatively written by another thread. In this case, if Thread A has written to a cache line but is still in the midst of a transaction, the microprocessor would either have to roll back Thread A or wait for its transaction to commit before allowing Thread B to observe (read) the cache line.
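The Thread A / Thread B example can be sketched as a lookup over per-thread state. The field names (`can_read`, `can_write`, `spec_read`, `spec_written`) and the function `speculative_writers_blocking_read` are hypothetical names chosen for illustration; the sketch assumes the policy stated in the example, namely that no thread may observe data speculatively written by another thread.

```python
from dataclasses import dataclass


@dataclass
class ThreadState:
    can_read: bool = False      # read permission for the cache location
    can_write: bool = False     # write permission for the cache location
    spec_read: bool = False     # thread has speculatively read the location
    spec_written: bool = False  # thread has speculatively written the location


def speculative_writers_blocking_read(reader, states):
    """Return ids of other threads whose uncommitted writes block the reader.

    Each returned thread must either be rolled back or allowed to commit
    before the reader may observe the location.
    """
    return [tid for tid, st in states.items()
            if tid != reader and st.spec_written]


states = {
    "A": ThreadState(can_read=True, can_write=True, spec_written=True),
    "B": ThreadState(),
}
blocking = speculative_writers_blocking_read("B", states)
```

Here `blocking` contains only Thread A, matching the situation in which Thread B's read must wait for, or force the rollback of, Thread A's transaction.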
According to an example coherency policy, data cache controller 230 or another control mechanism maintains the following constraints on the bits:
The constraints and invariants identified above can also be expressed in part by the table below showing permitted thread states for each global state. For simplicity and ease of understanding, only two threads are shown, but the invariants/constraints in the above example policy can be extended to any number of threads.
At 604, the method includes a request by a particular thread to perform an action (a read or write) with respect to a cache location. At 606, the method includes determining whether the action is permitted, for example with reference to the Vr and Vw permissions discussed above. Assuming permissions are in place, the action is performed at 608. In some cases, as will be explained below, the method will include, at 608, making a backup copy of the data in the location before performing a write.
At 610, the method includes updating and tracking the global state GS, and updating the To and Ts indications of whether the location has been speculatively read or speculatively written. Regarding global state, an example update would be a transition from exclusive (E) to modified (M) if a clean cache line was overwritten with a new value during a transaction being performed by a thread. To would be set to YES if the action at 608 was the first read of a transaction being executed by the thread; Ts would be set to YES if the action at 608 was the first write of a transaction being executed by the thread. If To or Ts were already set to YES, those values would be maintained unless the transaction committed. Indeed, as shown at 612, the To and Ts bits are cleared upon commitment, to thereby indicate that the subject thread currently has performed no speculative actions with respect to the cache location. Commitment of a thread's transaction will similarly affect other cache locations that the thread has interacted with, i.e., To and Ts bits for the thread will be cleared for other cache locations. Process flow then returns to step 604, where the same thread or another thread attempts to perform another action on the cache location.
If the requested action is not permitted at 606, the situation is akin to a miss on the cache, and the cache controller or other control mechanism would proceed, at 614, to secure read and/or write permissions and make the necessary transitions with respect to global state GS, and Vr and Vw. For example, global state may need to transition from a shared to an exclusive state if a thread is requesting write access to a cache line. Depending on the type of action requested, Vr and/or Vw may be set to YES. At 616, the method includes determining whether any other threads need to be rolled back as a result of the processing at 606 and 614. Example conditions inducing rollback will be described in detail below. If no threads are rolled back, the requested action is performed at 608 and processing proceeds as previously described.
If one of the other threads is rolled back, at 618 the example method includes clearing the To and Ts bits for the rolled-back thread(s), in addition to all other actions needed to restore to the previously committed state (e.g., restoring the previously committed data to the cache location). In addition to or instead of restoring previously-committed data from a backup copy, the cache location/line can be invalidated so as to “expose” a previously invisible backup copy, or the cache line can be simply invalidated if the backup copy has been written back to a higher-level cache or memory. The setting of these bits to NO reflects the situation that the rolled-back thread now has been restored to a state where it has performed no speculative reads from or writes to the cache location. Rollback of a thread will also potentially affect other locations in the data cache (e.g., the To and Ts bits for the thread on other locations will be cleared, and previously committed data in other locations will be restored or exposed). The requested action that induced the rollback is then performed at 608, and processing proceeds as previously described. Similar to the clearing of To and Ts upon commitment, clearing of these bits for a thread reflects that the thread is now in a state where it has performed no speculative activity relating to the cache location.
Steps 608, 612 and 618 refer to a backup copy of data in the cache location. A backup copy may be needed prior to a speculative write if the cache location contains dirty committed data, because the most recently committed data is not replicated anywhere else in the memory system. Upon commitment of the transaction, the backup copy is no longer needed, and is therefore invalidated at 612. In the event of a rollback, the dirty committed data that existed prior to speculative overwriting must be restored, as shown at 618, or exposed, for example by invalidating the speculatively written location and promoting the backup copy to the current version, or writing it back to a higher cache or memory.
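The flow of steps 604 through 618 for a single cache location can be sketched in software as follows. This is an illustrative model only, not the disclosed hardware: the class names, the `memory` field standing in for a backing copy, and the rollback rule in `_secure` (a write request rolls back any other thread with To or Ts set, a read request rolls back only a thread with Ts set) are assumptions. Coherency traffic with other cores is not modeled, and invalidating a clean line and refilling it from the backing copy is collapsed into one step.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ThreadBits:
    Vr: bool = False  # thread may read the location
    Vw: bool = False  # thread may write the location
    To: bool = False  # thread has speculatively read (observed) it
    Ts: bool = False  # thread has speculatively written (stored to) it


@dataclass
class CacheLine:
    data: int = 0
    memory: int = 0                # hypothetical backing copy of clean data
    gs: str = "E"                  # global state; other cores not modeled
    backup: Optional[int] = None   # saved dirty committed data, if any
    threads: dict = field(default_factory=dict)

    def bits(self, tid):
        return self.threads.setdefault(tid, ThreadBits())

    def request(self, tid, action, value=None):           # step 604
        me = self.bits(tid)
        if not (me.Vw if action == "write" else me.Vr):   # step 606
            self._secure(tid, action)                     # steps 614-618
        return self._perform(me, action, value)           # steps 608-610

    def _secure(self, tid, action):
        if action == "write" and self.gs == "S":
            self.gs = "E"          # step 614; remote invalidation omitted
        for other_id, other in self.threads.items():
            if other_id == tid:
                continue
            # Step 616, assumed policy: see the lead-in above.
            if other.Ts or (action == "write" and other.To):
                self.rollback(other_id)                   # step 618
            other.Vw = False       # at most one thread keeps write access
            if action == "write":
                other.Vr = False   # a writer gets exclusive access
        me = self.threads[tid]
        me.Vr = True
        me.Vw = me.Vw or action == "write"

    def _perform(self, me, action, value):
        if action == "read":
            me.To = True                                  # step 610
            return self.data
        if self.gs == "M" and not me.Ts:
            self.backup = self.data   # step 608: keep dirty committed data
        self.data, self.gs, me.Ts = value, "M", True      # step 610
        return None

    def commit(self, tid):                                # step 612
        me = self.bits(tid)
        if me.Ts:
            self.backup = None     # backup no longer needed
        me.To = me.Ts = False

    def rollback(self, tid):                              # step 618
        me = self.threads[tid]
        if me.Ts:
            if self.backup is not None:
                self.data, self.gs = self.backup, "M"     # restore dirty data
            else:
                self.data, self.gs = self.memory, "E"     # refill clean data
            self.backup = None
        me.To = me.Ts = False
```

In this model, a speculative write by one thread followed by a read request from a second thread rolls back the first thread, restores the committed data, and leaves the first thread with read permission only.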
In State A, Thread 1 has both read and write permissions to the cache location, and has both read from and written to the cache location—all bits are set to YES. The transition to State B is initiated by a request by Thread 2 to read the cache location. However, in State A, Thread 2 cannot read from the location, so the necessary read permission is secured (step 614 of
The transition to State C from State A is initiated as a result of a write request from Thread 2. This results in Thread 2 having exclusive access to the cache location (both Vr and Vw set to YES in State C). In State C, Thread 2 has also proceeded to perform the requested write operation, resulting in Ts being set to YES. And as in the prior example, the rollback of Thread 1 is reflected in the clearing of its To and Ts bits. In contrast to the prior example, Thread 1 retains no permissions due to the exclusivity obtained by Thread 2 (Thread 1 is essentially placed in an invalid state with respect to the cache location).
The example transition from State A to State D in
As indicated in the above example invariants, it is legal for multiple threads to read from the same cache location, provided no thread has write access. Accordingly, if a thread has speculatively read a cache location (To=YES) but not speculatively written (Ts=NO), inducing rollback of the thread because of another thread's action depends on whether that action is a read request or a write request. In the case of a read request only, rollback is not required.
In
It should be further understood from the above examples and state bit invariants that, if To or Ts is set for a thread, that thread must be rolled back before allowing another thread to write to the cache location.
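The rollback conditions discussed above can be condensed into a single predicate. The function name `must_roll_back` is hypothetical, and the read case assumes the policy used in the examples, in which a thread holding speculatively written data is rolled back rather than allowed to commit first.

```python
def must_roll_back(other_to, other_ts, requested_action):
    """Decide whether another thread's request forces a rollback.

    other_to / other_ts are the To and Ts bits of the thread that may be
    rolled back; requested_action is "read" or "write" by a different thread.
    """
    if requested_action == "write":
        # Any speculative read or write must be undone before another
        # thread is allowed to write the location.
        return other_to or other_ts
    if requested_action == "read":
        # Concurrent speculative readers are legal; only a speculative
        # write by the other thread stands in the way (assumed policy).
        return other_ts
    raise ValueError("requested_action must be 'read' or 'write'")
```

A thread with To=YES and Ts=NO is therefore left alone by another thread's read request but rolled back by another thread's write request.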
Although the examples of
The above examples imply that indicating whether the threads have performed speculative write activity requires a Ts bit for each thread. However, since only one thread is permitted to write to the cache location, a single speculative bit for the cache location may be used. For purposes of identifying which thread has performed the speculative store, the cache controller can simply look for the thread that has write access, since only one thread is permitted to have write access in the above examples. Accordingly, the language herein that specifies that each thread has a write indicator Ts encompasses the situation in which that indicator is derived as a result of a two-step determination: (1) identification that the cache location has been speculatively written, and (2) identification of which thread has write access for the cache location.
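The two-step determination can be sketched as follows. The function `derive_ts` and its dictionary arguments are hypothetical; the sketch assumes a single speculative-write bit per cache location and at most one thread with write permission, as in the examples above.

```python
def derive_ts(line_spec_written, vw_by_thread):
    """Derive a per-thread Ts indicator from one shared speculative bit.

    line_spec_written: the single speculative-write bit for the location.
    vw_by_thread: mapping of thread id to that thread's Vw permission.
    """
    writers = [tid for tid, vw in vw_by_thread.items() if vw]
    if len(writers) > 1:
        raise ValueError("at most one thread may hold write permission")
    # Step 1: has the location been speculatively written at all?
    # Step 2: if so, the thread holding write access is the one that wrote.
    return {tid: line_spec_written and vw
            for tid, vw in vw_by_thread.items()}
```

With the speculative bit set and only Thread 1 holding write access, the derived Ts is YES for Thread 1 and NO for every other thread.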
In the above examples, read and write permissions are encoded with two bits, one for reading and one for writing. In another example, permissions may be encoded as exclusive, shared, or invalid. Exclusive thread permission corresponds to Vr=YES and Vw=YES in the above examples; shared thread permission corresponds to Vr=YES and Vw=NO; and invalid thread permission corresponds to Vr=NO and Vw=NO.
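The correspondence between the two encodings can be written as a pair of conversion functions. The names `ThreadPerm`, `encode` and `decode` are hypothetical; the combination Vr=NO with Vw=YES has no counterpart in the three-state encoding described above and is rejected here as an assumption.

```python
from enum import Enum


class ThreadPerm(Enum):
    EXCLUSIVE = "exclusive"
    SHARED = "shared"
    INVALID = "invalid"


def encode(vr, vw):
    """Map the two permission bits to the three-state encoding."""
    if vr and vw:
        return ThreadPerm.EXCLUSIVE
    if vr and not vw:
        return ThreadPerm.SHARED
    if not vr and not vw:
        return ThreadPerm.INVALID
    # Write permission without read permission is not covered.
    raise ValueError("Vr=NO with Vw=YES has no three-state equivalent")


def decode(perm):
    """Map the three-state encoding back to the (Vr, Vw) bits."""
    return {
        ThreadPerm.EXCLUSIVE: (True, True),
        ThreadPerm.SHARED: (True, False),
        ThreadPerm.INVALID: (False, False),
    }[perm]
```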
The example systems and methods described above provide a number of advantages. Global state information is still provided for the cache location to identify how it is shared with other caches, e.g., caches outside the core but that are backed by a shared memory resource. The independent thread state information allows the control mechanisms to easily identify whether or not threads are allowed to perform transactional operations on a cache location. If permissions need to be upgraded to allow a transactional operation, the thread state information also enables the system to quickly identify what actions need to be taken to allow the operation to go forward.
Specifically, the system provides a mechanism for efficiently determining what a thread has done in relation to a cache location while performing a transaction. This information allows the control mechanisms to quickly identify if a thread needs to be rolled back to permit another thread to perform a requested action. If a thread has taken no actions (To=NO and Ts=NO), then there is no need for it to be rolled back when other threads seek various permissions. If one or both of To and Ts are set to YES, the need for rollback depends on what action is requested by another thread or another coherent agent outside the core.
The systems and methods described above are also applicable to implementations where a store buffer is provided in front of the data cache. In these implementations, “writing to the cache location” includes writing to a store-to-load forwarding buffer or other buffer in front of the cache. Accordingly, a thread's Ts bit may be set indicating a write to the cache location even where the store data is still sitting in a buffer in front of the cache and has not yet been retired.
Note that the method 600 may be implemented by any suitable cache controller or other control logic in a microprocessor without departing from the scope of the present disclosure. It is to be understood that the configurations and/or approaches described herein are exemplary in nature, and that these specific embodiments or examples are not to be considered in a limiting sense, because numerous variations are possible. The specific routines or methods described herein may represent one or more of any number of processing strategies. As such, various acts illustrated may be performed in the sequence illustrated, in other sequences, in parallel, or in some cases omitted. Likewise, the order of the above-described processes may be changed.
The subject matter of the present disclosure includes all novel and nonobvious combinations and subcombinations of the various processes, systems and configurations, and other features, functions, acts, and/or properties disclosed herein, as well as any and all equivalents thereof.
Number | Date | Country | |
---|---|---|---|
20130326153 A1 | Dec 2013 | US |