1. Field of the Invention
Embodiments of the present invention relate to computer systems.
2. Related Art
Contemporary computer systems have the capacity to execute more instructions than they actually execute in practice. Improvements in performance are limited by, for example, the latencies associated with accessing memory or input/output devices. Some of the excess capacity can be taken advantage of by executing two or more threads in parallel (commonly referred to as “multi-threading”). In simple terms, a set of computational resources is applied to a first thread until a long-latency event (e.g., a main memory access) is encountered, then the resources are applied to a second thread until another long-latency event is encountered, and so on. By switching execution from one thread to another, processor cycles that would otherwise be idle are instead put to use, realizing a gain in performance.
Improved multi-threading methods, and systems thereof, are described. According to one embodiment of the present invention, a first thread is executed. Context for the executing thread is maintained in a working register. Execution of the first thread is halted and execution of a second thread is begun by performing a rollback operation. The rollback operation causes context for the second thread to be copied from a shadow register into the working register.
The accompanying drawings, which are incorporated in and form a part of this specification, illustrate embodiments of the present invention and, together with the description, serve to explain the principles of the invention. The drawings referred to in this description should not be understood as being drawn to scale except if specifically noted.
Reference will now be made in detail to the various embodiments of the invention, examples of which are illustrated in the accompanying drawings. While the invention will be described in conjunction with these embodiments, it will be understood that they are not intended to limit the invention to these embodiments. On the contrary, the invention is intended to cover alternatives, modifications and equivalents, which may be included within the spirit and scope of the invention as defined by the appended claims. Furthermore, in the following detailed description of the present invention, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be recognized by one of ordinary skill in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the present invention.
Some portions of the detailed descriptions that follow are presented in terms of procedures, logic blocks, processing, and other symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. A procedure, logic block, process, etc., is here, and generally, conceived to be a self-consistent sequence of steps or instructions leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated in a computer system. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, bytes, values, elements, symbols, characters, terms, numbers, or the like.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, it is appreciated that throughout the present invention, discussions utilizing terms such as “executing,” “switching,” “dismissing,” “detecting,” “swapping,” “storing,” “holding,” “copying,” “performing,” “instantiating,” “identifying” or the like, refer to the action and processes (e.g., flowcharts 50, 60 and 70 of
Aspects of the present invention may be practiced on a computer system that includes, in general, a central processing unit (CPU) for processing information and instructions, random access (volatile) memory (RAM) for storing information and instructions, read-only (non-volatile) memory (ROM) for storing static information and instructions, a data storage device such as a magnetic or optical disk and disk drive for storing information and instructions, an optional user output device such as a display device (e.g., a monitor) for displaying information to the computer user, an optional user input device including alphanumeric and function keys (e.g., a keyboard) for communicating information and command selections to the processor, and an optional user input device such as a cursor control device (e.g., a mouse) for communicating user input information and command selections to the processor. The computer system may also include an input/output (I/O) device for providing a physical communication link between the computer system and a peripheral device or a network, using either a wired or a wireless communication interface.
A rollback operation provides a mechanism for recovering from an invalid assumption. In a rollback operation, context in the shadow register (the information placed into the shadow register at commit point A) is copied into the working register. A rollback operation also rolls back or dismisses any speculative stores that occurred during execution of a thread. Thus, in essence, the rollback operation restores the state of the processor that existed at commit point A. Speculation can then begin again from commit point A. Speculation may proceed along a different path that will avoid the invalid assumption encountered in the prior speculation.
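By way of illustration only, the commit and rollback behavior described above can be sketched in software as follows. This is a simplified model, not the hardware described herein: the class name, the dictionary-based registers, and the store buffer are hypothetical, and the assumption that a commit makes buffered stores permanent is introduced for the sketch rather than stated above.

```python
from dataclasses import dataclass, field


@dataclass
class SpeculativeCore:
    """Toy model: one working register, one shadow register, a store buffer."""
    working: dict = field(default_factory=dict)          # context of the running code
    shadow: dict = field(default_factory=dict)           # context saved at the last commit
    memory: dict = field(default_factory=dict)           # committed memory state
    pending_stores: list = field(default_factory=list)   # speculative stores

    def store(self, address, value):
        # Speculative stores are buffered rather than written to memory.
        self.pending_stores.append((address, value))

    def commit(self):
        # Commit point: save the context; buffered stores become permanent
        # (assumption made for this sketch).
        self.shadow = dict(self.working)
        for address, value in self.pending_stores:
            self.memory[address] = value
        self.pending_stores.clear()

    def rollback(self):
        # Restore the context from the last commit and dismiss buffered stores.
        self.working = dict(self.shadow)
        self.pending_stores.clear()


core = SpeculativeCore()
core.working["r1"] = 10
core.commit()                      # commit point A
core.working["r1"] = 99            # speculative work after A
core.store(0x100, 7)               # speculative store
core.rollback()                    # an assumption proved invalid
print(core.working, core.memory)   # {'r1': 10} {}
```

After the rollback, the working context matches what was saved at commit point A, and the speculative store never reaches memory.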
Commit and rollback operations, and dismissing speculative stores as part of a rollback operation, are described in the following patents, assigned to the assignee of the present invention and hereby incorporated by reference in their entirety: U.S. Pat. Nos. 5,832,205; 5,905,855; 5,926,832; 5,958,061; 6,011,908; 6,031,992 and 6,199,152.
Working register 31 (which may also be referred to as a foreground register) is for holding context (e.g., state information) for the thread currently being executed by processor 35. In general, as used herein, a “thread” refers to a part of a program that can execute independently of other parts of the program. In one embodiment, processor 35 implements a method of multi-threading in which a single thread is executed at a time. Computer system 30 may include multiple such processors, allowing multiple threads to be executed simultaneously.
In the present embodiment, each of shadow registers 1, 2, . . . , N (which may also be referred to as architectural or background registers) holds context for a respective thread. Context is copied into one of the shadow registers 1, 2, . . . , N as the result of a commit operation that occurs during execution of a thread. For example, consider a first thread that is executing on processor 35. At some point during the execution of the first thread, a commit operation (CMT1) is performed, in which context for the first thread is copied from working register 31 to one of the shadow registers (e.g., shadow register 1). As execution of the first thread continues, the information in working register 31 will change, and subsequent commit operations will cause the context in working register 31 to be copied to shadow register 1, overwriting the context previously stored in shadow register 1. At some point during execution of the first thread, processor 35 switches execution from the first thread to a second thread. The second thread may be from the same program as the first thread, or from a program that is different from the program associated with the first thread. The switch may be triggered by a timer interrupt, a cache miss, an I/O access, or some other type of “long-latency event” associated with the first thread. At some point during execution of the second thread, a commit operation (CMT2) is performed. In the present embodiment, as a result of the CMT2 operation, context for the second thread is copied from working register 31 to a shadow register other than shadow register 1 (e.g., to shadow register 2).
During execution of a thread, a rollback operation may occur. As a result of the rollback operation, the context contained in a shadow register is copied into working register 31. In the present embodiment, as a result of a rollback operation during execution of the aforementioned first thread, the context contained in shadow register 1 is copied to working register 31 (RLBK1). Similarly, in the present embodiment, a rollback operation also causes the context contained in shadow register 2 to be copied into working register 31 (RLBK2). Recall that a shadow register contains context that was received at the most recent commit point. A rollback operation thus restores, for the thread currently being executed, the state of working register 31 to the state that existed at the last commit point preceding the rollback operation.
In one embodiment, register 32 contains information that is sufficient for identifying which of the shadow registers 1, 2, . . . , N is associated with the executing thread. Consider again the example above, in which the first thread is executing and a commit operation (e.g., CMT1) is performed, causing context for the first thread to be copied from working register 31 to shadow register 1. According to the present embodiment, register 32 includes information that associates shadow register 1 with the first thread. In general, processor 35 is able to determine which shadow register is to receive a copy of the information in working register 31 as the result of a commit operation, and processor 35 is able to determine which shadow register is to provide context for the executing thread as the result of a rollback operation.
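The relationship among working register 31, the shadow registers, and register 32 may be easier to follow in a short software sketch. The model below is illustrative only; the class name, the zero-based indexing of the shadow registers, and the use of a plain integer for the contents of register 32 are assumptions made for the sketch.

```python
class MultiShadowProcessor:
    """Toy model of a working register backed by N shadow registers."""

    def __init__(self, num_shadow):
        self.working = {}                                # working register 31
        self.shadows = [{} for _ in range(num_shadow)]   # shadow registers (0-based here)
        self.thread_id = 0                               # register 32: selects a shadow register

    def commit(self):
        # Copy the working context into the executing thread's shadow register,
        # overwriting whatever was committed there before.
        self.shadows[self.thread_id] = dict(self.working)

    def rollback(self):
        # Restore the working context from the executing thread's shadow register.
        self.working = dict(self.shadows[self.thread_id])


cpu = MultiShadowProcessor(num_shadow=2)
cpu.working = {"pc": 4, "r1": 10}
cpu.commit()             # CMT1: context lands in the first shadow register
cpu.working["r1"] = 11   # change made after the commit point
cpu.rollback()           # RLBK1: r1 is 10 again
```

In this sketch, both commit and rollback consult the same identifier, so the shadow register that receives a thread's context is also the one that later restores it.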
Using the memory architecture described in conjunction with
Note that, in the example above, a commit operation does not have to occur between, for example, RLBK1 and RLBK2. That is, in general, it is not necessary to commit context for an executing thread prior to using a rollback operation to switch execution to another thread.
In one embodiment, the following actions are performed to switch execution (and context) from one thread to another:
i) Roll back context for the executing (foreground) thread (e.g., the first thread);
ii) Set thread identifier to identify the second thread as the executing thread; and
iii) Roll back to install into working register 31 the context for the new thread (e.g., the second thread).
Although two rollbacks are mentioned in the example above, the actions associated with the rollbacks can be accomplished with a single rollback operation.
In the example above, the rollback operation also rolls back or dismisses any speculative stores that occurred during execution of the first thread. In general, when execution is switched from one thread to another thread, speculative memory stores associated with the first thread are dismissed. Thus, memory coherence is maintained when execution is switched from one thread to another.
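The switching sequence above, collapsed into a single rollback, can be sketched as follows. This is a hedged illustration rather than the described hardware: the class, the `switch_to` method, the store buffer, and the draining of buffered stores at commit are hypothetical choices made for the sketch.

```python
class ThreadSwitchingProcessor:
    """Toy model: a thread switch performed by a single rollback."""

    def __init__(self, contexts):
        # One shadow register per thread, preloaded with that thread's context.
        self.shadows = [dict(c) for c in contexts]
        self.thread_id = 0                     # thread identifier (register 32)
        self.working = dict(self.shadows[0])   # working register 31
        self.pending_stores = []               # speculative (address, value) stores
        self.memory = {}

    def commit(self):
        # Save the executing thread's context; buffered stores become permanent
        # (assumption made for this sketch).
        self.shadows[self.thread_id] = dict(self.working)
        self.memory.update(self.pending_stores)
        self.pending_stores.clear()

    def rollback(self):
        # Install the context selected by the thread identifier and dismiss
        # any speculative stores.
        self.working = dict(self.shadows[self.thread_id])
        self.pending_stores.clear()

    def switch_to(self, new_thread_id):
        # Setting the identifier first lets one rollback both discard the old
        # thread's uncommitted state and install the new thread's context.
        self.thread_id = new_thread_id
        self.rollback()


cpu = ThreadSwitchingProcessor([{"pc": 0}, {"pc": 100}])
cpu.working["pc"] = 8
cpu.commit()                           # first thread commits at pc 8
cpu.working["pc"] = 12                 # uncommitted progress
cpu.pending_stores.append((0x40, 1))   # speculative store by the first thread
cpu.switch_to(1)                       # second thread's context is installed
first_switch = dict(cpu.working)
cpu.switch_to(0)                       # first thread resumes from its last commit
```

No commit is needed before `switch_to`; the first thread simply resumes later from its most recent commit point, and its speculative store is dismissed by the switch.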
In the example of
In one embodiment, the following actions are performed to switch execution (and context) from one thread to another:
i) Roll back context for the executing (foreground) thread (e.g., the first thread);
ii) Swap context information between shadow registers; and
iii) Roll back to install into working register 31 the context for the new thread (e.g., the second thread).
Although two rollbacks are mentioned in the example above, the actions associated with the rollbacks can be accomplished with a single rollback operation.
In the sequence above, the rollback operation dismisses any speculative stores that occurred during execution of the first thread. Thus, memory coherence is maintained when execution is switched from one thread to another.
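A swap-based switch of this kind might be modeled as in the sketch below. The arrangement shown is an assumption made for illustration: rollback always restores from one designated "active" shadow register (index 0 here), and the swap exchanges its contents with those of the shadow register holding the new thread's context. The class and method names are hypothetical.

```python
class SwappingProcessor:
    """Toy model: a thread switch performed by swapping shadow registers."""

    def __init__(self, contexts):
        # shadows[0] is the active shadow register (assumption for this sketch).
        self.shadows = [dict(c) for c in contexts]
        self.working = dict(self.shadows[0])   # working register 31
        self.pending_stores = []               # speculative stores

    def commit(self):
        # Save the working context into the active shadow register. Buffered
        # stores would be released to memory here; the sketch just clears them.
        self.shadows[0] = dict(self.working)
        self.pending_stores.clear()

    def rollback(self):
        # Restore from the active shadow register and dismiss speculative stores.
        self.working = dict(self.shadows[0])
        self.pending_stores.clear()

    def switch_with(self, slot):
        # Swap the active shadow register with the one holding the new thread's
        # context, then let a single rollback install that context.
        self.shadows[0], self.shadows[slot] = self.shadows[slot], self.shadows[0]
        self.rollback()


cpu = SwappingProcessor([{"pc": 8}, {"pc": 100}])
cpu.working["pc"] = 12                 # uncommitted progress by the first thread
cpu.pending_stores.append((0x40, 1))   # speculative store by the first thread
cpu.switch_with(1)                     # second thread now runs from pc 100
```

After the switch, the first thread's last committed context sits in the other shadow register, ready to be swapped back in later.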
The descriptions above are based on examples that use two threads; however, these examples can be readily extended to situations involving more than two threads. In a multiple thread environment, the computer's operating system is informed of the number of threads that can be supported in hardware (e.g., the number of shadow registers). In addition to the types of operations described above, the operating system includes instructions that enable it to identify the executing thread, to interrupt the executing thread, and to halt the executing thread when the thread is in a consistent state. Furthermore, to switch execution from a current thread to a new thread, the operating system can include instructions that enable it to extract the context of the new thread and to insert that context into the executing stream of instructions.
With reference first to
In step 52 of
In step 53 of
In step 54 of
Reference is now made to
In step 62 of
Reference is now made to
In step 72 of
In summary, embodiments in accordance with the present invention describe multi-threading methods and systems that use a context switch that is implemented using a working register and a number of shadow registers. An advantage of implementing the context switch in the manner described herein is that memory pipes, ports into register files, bypassing networks, etc., only deal with the size of the context contained in the working register for the executing thread, and not with the extended (and extendable) size introduced by the contexts contained in the shadow registers for each of the additional threads.
Another advantage provided by embodiments in accordance with the present invention is that contemporary operating systems and processors can be readily adapted or extended to implement the context switch described herein. For example, the context switch described herein is implemented using commit and rollback commands that, except for the additional functionality provided by the present invention, are known in the art. That is, the commit operation is still used to copy the state of a working register, and the rollback operation is still used to restore the working register to an earlier state. However, by introducing multiple shadow registers, the states for multiple threads can be stored and restored. Also, except perhaps for maintaining information identifying which shadow register is associated with which thread, the context switch implemented as described herein is virtually invisible to the computer system.
Embodiments in accordance with the present invention are thus described. While the present invention has been described in particular embodiments, it should be appreciated that the present invention should not be construed as limited by such embodiments, but rather construed according to the below claims.
| Number | Name | Date | Kind |
|---|---|---|---|
| 4435459 | McKinney et al. | Mar 1984 | A |
| 4794522 | Simpson | Dec 1988 | A |
| 4794552 | Burn | Dec 1988 | A |
| 5167023 | de Nicolas et al. | Nov 1992 | A |
| 5361389 | Fitch | Nov 1994 | A |
| 5363366 | Wisdom, Jr. et al. | Nov 1994 | A |
| 5494821 | Takahashi et al. | Feb 1996 | A |
| 5537559 | Kane et al. | Jul 1996 | A |
| 5568380 | Brodnax et al. | Oct 1996 | A |
| 5596390 | Sawada | Jan 1997 | A |
| 5625835 | Ebcioglu et al. | Apr 1997 | A |
| 5636366 | Robinson et al. | Jun 1997 | A |
| 5649136 | Shen et al. | Jul 1997 | A |
| 5668969 | Fitch | Sep 1997 | A |
| 5692169 | Kathail et al. | Nov 1997 | A |
| 5721927 | Baraz et al. | Feb 1998 | A |
| 5724590 | Goettelmann et al. | Mar 1998 | A |
| 5748936 | Karp et al. | May 1998 | A |
| 5751942 | Christensen et al. | May 1998 | A |
| 5751982 | Morley | May 1998 | A |
| 5757942 | Kamatani et al. | May 1998 | A |
| 5761467 | Ando | Jun 1998 | A |
| 5784585 | Denman | Jul 1998 | A |
| 5790625 | Arimilli | Aug 1998 | A |
| 5790825 | Traut | Aug 1998 | A |
| 5832202 | Slavenburg et al. | Nov 1998 | A |
| 5832205 | Kelly et al. | Nov 1998 | A |
| 5842017 | Hookway et al. | Nov 1998 | A |
| 5867681 | Worrell et al. | Feb 1999 | A |
| 5875318 | Langford | Feb 1999 | A |
| 5915117 | Ross et al. | Jun 1999 | A |
| 5925123 | Tremblay et al. | Jul 1999 | A |
| 5948112 | Shimada et al. | Sep 1999 | A |
| 6011908 | Wing et al. | Jan 2000 | A |
| 6031992 | Cmelik et al. | Feb 2000 | A |
| 6032244 | Moudgill | Feb 2000 | A |
| 6044450 | Tsushima et al. | Mar 2000 | A |
| 6052708 | Flynn et al. | Apr 2000 | A |
| 6091897 | Yates et al. | Jul 2000 | A |
| 6164841 | Mattson, Jr. et al. | Dec 2000 | A |
| 6199152 | Kelly et al. | Mar 2001 | B1 |
| 6295600 | Parady | Sep 2001 | B1 |
| 6308318 | Krishnaswamy | Oct 2001 | B2 |
| 6351844 | Bala | Feb 2002 | B1 |
| 6356615 | Coon et al. | Mar 2002 | B1 |
| 6363336 | Banning et al. | Mar 2002 | B1 |
| 6408325 | Shaylor | Jun 2002 | B1 |
| 6415379 | Keppel et al. | Jul 2002 | B1 |
| 6438677 | Chaudhry et al. | Aug 2002 | B1 |
| 6463582 | Lethin et al. | Oct 2002 | B1 |
| 6594821 | Banning et al. | Jul 2003 | B1 |
| 6615300 | Banning et al. | Sep 2003 | B1 |
| 6704925 | Bugnion | Mar 2004 | B1 |
| 6714904 | Torvalds et al. | Mar 2004 | B1 |
| 6738892 | Coon et al. | May 2004 | B1 |
| 6845353 | Bedichek et al. | Jan 2005 | B1 |
| 6990658 | Torvalds et al. | Jan 2006 | B1 |
| 7089404 | Rozas et al. | Aug 2006 | B1 |
| 7096460 | Banning et al. | Aug 2006 | B1 |
| 7107580 | Zemach et al. | Sep 2006 | B2 |
| 7111096 | Banning et al. | Sep 2006 | B1 |
| 7310723 | Rozas et al. | Dec 2007 | B1 |
| 7331041 | Torvalds et al. | Feb 2008 | B1 |
| 7404181 | Banning et al. | Jul 2008 | B1 |
| 7475230 | Chou et al. | Jan 2009 | B2 |
| 7761857 | Bedichek et al. | Jul 2010 | B1 |
| 7904891 | Banning et al. | Mar 2011 | B2 |
| 8019983 | Rozas et al. | Sep 2011 | B1 |
| 8176489 | Bauer et al. | May 2012 | B2 |
| 20010047468 | Parady | Nov 2001 | A1 |
| 20020092002 | Babaian et al. | Jul 2002 | A1 |
| 20040015967 | Morris | Jan 2004 | A1 |
| 20050060705 | Katti et al. | Mar 2005 | A1 |
| 20060020776 | Yoshida | Jan 2006 | A1 |
| 20060150194 | Xing et al. | Jul 2006 | A1 |
| 20070006189 | Li et al. | Jan 2007 | A1 |
| 20090125913 | Bradford et al. | May 2009 | A1 |
| 20100262955 | Bedichek et al. | Oct 2010 | A1 |
| 20120036502 | Banning et al. | Feb 2012 | A1 |
| 20120079257 | Rozas et al. | Mar 2012 | A1 |
| Number | Date | Country |
|---|---|---|
| 0908820 | Apr 1999 | EP |
| 0148605 | Jul 2001 | WO |
| Entry |
|---|
| Hwu et al. “Checkpoint repair for out-of-order execution machines”, Proceedings of the 14th annual international symposium on Computer architecture, 1987, pp. 18-26. |
| Ung et al.; “Dynamic Re-Engineering of Binary Code With Run-Time Feedbacks;” Department of Computer Science and Electrical Engineering, University of Queensland, QLD, Australia; 2000. |
| Ung et al.; “Machine-Adaptable Dynamic Binary Translation;” Proceedings of the ACM Sigplan Workshop on Dynamic and Adaptive Compilation and Optimization, Jan. 2000 pp. 30-40. |
| Holzle, Urs, Adaptive Optimization for SELF: Reconciling High Performance with Exploratory Programming, Doctoral Dissertation, Aug. 1994. |
| Cifuentes, Cristina and Malhotra, Vishv, Binary Translation: Static, Dynamic, Retargetable?, International Conference on Software Maintenance, Nov. 4-8, 1996. |
| Bala et al.; “Transparent Dynamic Optimization: The Design and Implementation of Dynamo;” HP Laboratories Cambridge HPL-1999-78; Jun. 1999. |
| Ex Parte Quayle Dated Jul. 2, 2008; U.S. Appl. No. 09/417,332. |
| Final OA Dated Jan. 18, 2007; U.S. Appl. No. 09/417,332. |
| Final OA Dated Feb. 13, 2004; U.S. Appl. No. 09/417,332. |
| Non Final OA Dated May 10, 2004; U.S. Appl. No. 09/417,332. |
| Non Final OA Dated Jan. 30, 2008; U.S. Appl. No. 09/417,332. |
| Non Final OA Dated May 13, 2003; U.S. Appl. No. 09/417,332. |
| Non Final OA Dated Jul. 26, 2007; U.S. Appl. No. 09/417,332. |
| Notice of Allowance Dated Jan. 12, 2009; U.S. Appl. No. 09/417,332. |
| Notice of Allowance Dated Sep. 8, 2008; U.S. Appl. No. 09/417,332. |
| Non Final OA Dated Mar. 29, 2006; U.S. Appl. No. 10/406,022. |
| Notice of Allowance Dated Jul. 27, 2007; U.S. Appl. No. 10/406,022. |
| Notice of Allowance Dated Feb. 21, 2003; U.S. Appl. No. 09/539,987. |
| Interview Summary Dated Apr. 14, 2003; U.S. Appl. No. 09/417,358. |
| Non Final OA Dated Apr. 25, 2002; U.S. Appl. No. 09/417,358. |
| Non Final OA Dated Dec. 20, 2000; U.S. Appl. No. 09/417,358. |
| Notice of Allowance Dated Jun. 20, 2003; U.S. Appl. No. 09/417,358. |
| OA Dated Sep. 6, 2001; U.S. Appl. No. 09/417,358. |
| Final Office Action Dated Jan. 16, 2009; U.S. Appl. No. 12/002,983. |
| Non Final Office Action Dated Aug. 5, 2008; U.S. Appl. No. 12/002,983. |
| Non Final Office Action; Mail Date May 26, 2009; U.S. Appl. No. 09/417,332. |
| Notice of Allowance; Mail Date Nov. 17, 2009; U.S. Appl. No. 09/417,332. |
| Non Final Office Action; Mail Date May 19, 2009; U.S. Appl. No. 12/002,983. |
| Non Final Office Action; Mail Date Dec. 2, 2009; U.S. Appl. No. 12/002,983. |
| Advisory Action; Mail Date Dec. 31, 2009; U.S. Appl. No. 12/177,836. |
| Final Office Action; Mail Date Oct. 22, 2009; U.S. Appl. No. 12/177,836. |
| Notice of Allowance Dated May 9, 2011, U.S. Appl. No. 12/002,983. |
| Final Office Action, Mailed Apr. 27, 2011; U.S. Appl. No. 12/578,500. |
| Notice of Allowance, Mailed May 11, 2011; U.S. Appl. No. 12/002,983. |
| Final Office Action Dated Aug. 3, 2012; U.S. Appl. No. 12/578,500. |
| Non-Final Office Action Dated Jul. 9, 2012; U.S. Appl. No. 13/221,812. |