An embodiment of the invention relates to computer operation in general, and more specifically to inter-processor interrupts.
A computer may include multiple processors, which may include physical and logical processors. Operating systems may utilize inter-processor interrupts (IPIs) to transfer requests between processors in a system. An operating system may use an inter-processor interrupt in order to have one processor initiate specific actions for one or more other processors. Such actions may include a TLB (translation look-aside buffer) shootdown interrupt, in which a processor sends an interrupt to other processors to request invalidation of a TLB entry. Cache flushing may be initiated by receiving processors in response to a global change made by a sending processor, such as changes in the linear address mappings or changes in the memory caching attributes for a particular memory range.
However, inter-processor interrupt signals may require a large overhead for both the sending processor side and the receiving processor side. The sending processor needs to perform memory accesses to send an interrupt through a programmable interrupt controller, such as a local advanced programmable interrupt controller (APIC). In turn, the receiving processor may absorb considerable overhead in the process of receiving an interrupt.
The invention may be best understood by referring to the following description and accompanying drawings that are used to illustrate embodiments of the invention.
A method and apparatus are described for inter-processor interrupts in a multi-processor system.
Under an embodiment of the invention, an inter-processor interrupt function is performed using an instruction for calling the interrupt. The instruction is referred to herein as an Mcall instruction, although the instruction can have any designation. In the embodiment, the operational cost of the function to the sending processor side is a store to a writeback memory location, with the cost to the receiving side being a forced call to a function. An embodiment of the invention may greatly reduce the operational cost of inter-processor interrupts, thereby improving system performance.
According to an embodiment of the invention, an interrupt function is performed by a signal that is sent through the memory system. The sending processor performs a store to a writeback memory location. The store thereby triggers a function call on the receiving side. The operation may be contrasted to a conventional interrupt that is sent through the APIC. The embodiment may allow improved operating system performance in multi-processor and multi-threaded environments by reducing the cost of sending inter-processor interrupts. Under an embodiment of the invention, an inter-processor interrupt function may be performed without an APIC or in systems with alternative signal operations.
A conventional mechanism for sending an inter-processor interrupt is illustrated in
On the receiving processor, the interrupt is conventionally latched and delivered to the processor core via logic incorporated in the local APIC interrupt delivery mechanism. The illustrated interrupt mechanism takes into account the interrupt priority under which the processor core is operating (as is reflected in an APIC task priority register), other pending interrupts that may have higher priority, and the interruptibility state of the processor's core. When the processor core has interrupts enabled and the vector corresponding to the inter-processor interrupt is the highest priority interrupt vector pending, then the local APIC dispatches the vector to the core.
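The dispatch decision described above can be sketched as a small software model. This is a minimal illustration, not the APIC's actual logic: it assumes the x86 convention that the upper four bits of an 8-bit vector form its priority class, it ignores interrupts already in service, and the names `priority_class` and `vector_to_dispatch` are hypothetical.

```python
from typing import Iterable, Optional


def priority_class(vector: int) -> int:
    """Assumed x86 convention: bits 7:4 of the vector give its priority class."""
    return vector >> 4


def vector_to_dispatch(
    pending: Iterable[int],
    task_priority_class: int,
    interrupts_enabled: bool,
) -> Optional[int]:
    """Return the vector the modeled local APIC would hand to the core, or None.

    Simplified model: a vector is dispatched only when the core has
    interrupts enabled, the vector is the highest one pending, and its
    priority class is above the task priority class.
    """
    if not interrupts_enabled:
        return None  # core is not interruptible right now
    candidates = list(pending)
    if not candidates:
        return None  # nothing latched
    highest = max(candidates)
    if priority_class(highest) <= task_priority_class:
        return None  # masked by the task priority register
    return highest
```

For example, with vectors `0x31` and `0xE1` pending, interrupts enabled, and a task priority class of 0, the model selects `0xE1`; raising the task priority class to `0xE` holds the same vector back.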
For a receiving processor, a conventional sequence of events is illustrated in
Under an embodiment of the invention, the use of an instruction (an Mcall instruction in this description) for the operation of an interrupt may simplify the operational sequence for both the sending processor and the receiving processor. At boot time, each processor in a multi-processor system may register a function via an Mcall instruction, the function corresponding to the interrupt service routine that would otherwise have executed in kernel mode on receipt of an interrupt, such as an inter-processor interrupt. However, this registration may alternatively be accomplished by other mechanisms, including the use of model specific registers.
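The boot-time registration step can be pictured with a short software model. The sketch below is only an analogy for the hardware behavior: `IsrRegistry`, `boot`, and the handler signature are hypothetical names introduced for illustration, and a Python dictionary stands in for whatever per-processor state the instruction or a model specific register would establish.

```python
from typing import Callable, Dict, Iterable

# Hypothetical handler type: receives the processor id, returns a description
# of the work it performed.
IpiHandler = Callable[[int], str]


class IsrRegistry:
    """Per-processor table of registered interrupt service routines."""

    def __init__(self) -> None:
        self._handlers: Dict[int, IpiHandler] = {}

    def register(self, cpu_id: int, handler: IpiHandler) -> None:
        """Record the routine to call when cpu_id is signaled."""
        self._handlers[cpu_id] = handler

    def dispatch(self, cpu_id: int) -> str:
        """Call the routine registered for cpu_id."""
        handler = self._handlers.get(cpu_id)
        if handler is None:
            raise LookupError(f"no routine registered for processor {cpu_id}")
        return handler(cpu_id)


def boot(registry: IsrRegistry, cpu_ids: Iterable[int], handler: IpiHandler) -> None:
    """Model of boot: every processor registers its routine once."""
    for cpu_id in cpu_ids:
        registry.register(cpu_id, handler)
```

A possible use is `boot(registry, range(4), handler)` followed by `registry.dispatch(2)`, which calls the routine registered for processor 2.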
The example shown in
Under this embodiment, the Mcall instruction puts the receiving processor in a state in which the processor monitors the linear address X for writes and, upon detection of a write operation, the receiving processor transfers execution control to the IPI ISR linear address. A ring transition is performed as needed, with the appropriate state established on the stack and the processor priority level raised to the appropriate priority.
The receiving processor 510 monitoring 535 the linear address X 520 is notified of the interrupt request when a write to linear address X 520 occurs. In kernel mode, the receiving processor will have established a state for enabling ring transition on receipt of an inter-processor interrupt. When the interrupt is received, the current state of the receiving processor is saved 540. The receiving processor performs the interrupt, with the call for the interrupt being shown as Mcall <IPI ISR Linear Address> 545. The performance of the function may include writing 550 to the memory location Y 530 being polled 525 by the sending processor 505. Upon detecting a change in value in memory location Y 530, the sending processor may resume normal operation. Upon completing the inter-processor interrupt, the receiving processor may resume normal operation.
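The exchange between the two processors can be modeled in software with two threads and two shared words. This is a rough sketch under stated assumptions, not the hardware mechanism: the `SharedMemory` class, the function names, and the use of a condition variable in place of address monitoring and polling are all hypothetical stand-ins chosen so the example terminates deterministically.

```python
import threading
from typing import Callable, List, Tuple


class SharedMemory:
    """Two memory words, X and Y, guarded by one condition variable."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._words = {"X": 0, "Y": 0}

    def store(self, addr: str, value: int) -> None:
        with self._cond:
            self._words[addr] = value
            self._cond.notify_all()  # wake any thread watching memory

    def wait_for_change(self, addr: str, old: int, timeout: float = 5.0) -> int:
        """Block until the word at addr differs from old (stands in for monitoring or polling)."""
        with self._cond:
            changed = self._cond.wait_for(lambda: self._words[addr] != old, timeout)
            if not changed:
                raise TimeoutError(f"no write to {addr} observed")
            return self._words[addr]


def ipi_isr(mem: SharedMemory, request: int) -> None:
    """Registered routine: do the requested work, then acknowledge through Y."""
    mem.store("Y", request)


def receiving_processor(
    mem: SharedMemory, isr: Callable[[SharedMemory, int], None], log: List[str]
) -> None:
    request = mem.wait_for_change("X", 0)  # monitor X for a write
    saved_state = "interrupted work"       # placeholder for the saved state
    log.append("state saved")
    isr(mem, request)                      # forced call to the registered routine
    log.append("resumed: " + saved_state)  # back to normal operation


def send_ipi(mem: SharedMemory, request: int) -> int:
    mem.store("X", request)                # the store that signals the receiver
    return mem.wait_for_change("Y", 0)     # wait for the acknowledgement in Y


def run_demo() -> Tuple[int, List[str]]:
    mem, log = SharedMemory(), []
    receiver = threading.Thread(target=receiving_processor, args=(mem, ipi_isr, log))
    receiver.start()
    ack = send_ipi(mem, 1)
    receiver.join()
    return ack, log
```

In this model the sender's only signaling action is the store to X, and the receiver's only reaction is the call into the registered routine, which mirrors the division of cost described for the two sides.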
Techniques described here may be used in many different environments.
Under an embodiment of the invention, a computer 600 comprises a bus 605 or other communication means for communicating information, and a processing means such as one or more processors 610 (shown as 611, 612 and continuing through 613) coupled with the bus 605 for processing information. Any of the processors 610 may provide an inter-processor interrupt to one or more of the other processors. Each processor may comprise an execution unit and logic for inter-processor interrupt operation.
The computer 600 further comprises a random access memory (RAM) or other dynamic storage device as a main memory 615 for storing information and instructions to be executed by the processors 610. Main memory 615 also may be used for storing temporary variables or other intermediate information during execution of instructions by the processors 610. The computer 600 also may comprise a read only memory (ROM) 620 and/or other static storage device for storing static information and instructions for the processors 610.
A data storage device 625 may also be coupled to the bus 605 of the computer 600 for storing information and instructions. The data storage device 625 may include a magnetic disk or optical disc and its corresponding drive, flash memory or other nonvolatile memory, or other memory device. Such elements may be combined together or may be separate components, and utilize parts of other elements of the computer 600.
The computer 600 may also be coupled via the bus 605 to a display device 630, such as a liquid crystal display (LCD) or other display technology, for displaying information to an end user. In some environments, the display device may be a touch-screen that is also utilized as at least a part of an input device. In some environments, display device 630 may be or may include an auditory device, such as a speaker for providing auditory information. An input device 640 may be coupled to the bus 605 for communicating information and/or command selections to the processors 610. In various implementations, input device 640 may be a keyboard, a keypad, a touch-screen and stylus, a voice-activated system, or other input device, or combinations of such devices. Another type of user input device that may be included is a cursor control device 645, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to the processors 610 and for controlling cursor movement on display device 630.
A communication device 650 may also be coupled to the bus 605. Depending upon the particular implementation, the communication device 650 may include a transceiver, a wireless modem, a network interface card, or other interface device. The computer 600 may be linked to a network or to other devices using the communication device 650, which may include links to the Internet, a local area network, or another environment.
In the description above, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without some of these specific details. In other instances, well-known structures and devices are shown in block diagram form.
The present invention includes various steps. The steps of the present invention may be performed by hardware components or may be embodied in machine-executable instructions, which may be used to cause a general-purpose or special-purpose processor or logic circuits programmed with the instructions to perform the steps. Alternatively, the steps may be performed by a combination of hardware and software.
Portions of the present invention may be provided as a computer program product, which may include a machine-readable medium having stored thereon instructions, which may be used to program a computer (or other electronic devices) to perform a process according to the present invention. The machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs, magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, flash memory, or other types of media or machine-readable media suitable for storing electronic instructions. Moreover, the present invention may also be downloaded as a computer program product, wherein the program may be transferred from a remote computer to a requesting computer by way of data signals embodied in a carrier wave or other propagation medium via a communication link (e.g., a modem or network connection).
Many of the methods are described in their most basic form, but steps can be added to or deleted from any of the methods and information can be added or subtracted from any of the described messages without departing from the basic scope of the present invention. It will be apparent to those skilled in the art that many further modifications and adaptations can be made. The particular embodiments are not provided to limit the invention but to illustrate it. The scope of the present invention is not to be determined by the specific examples provided above but only by the claims below.
It should also be appreciated that reference throughout this specification to “one embodiment” or “an embodiment” means that a particular feature may be included in the practice of the invention. Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims are hereby expressly incorporated into this description, with each claim standing on its own as a separate embodiment of this invention.
Number | Date | Country |
---|---|---|
20050027914 A1 | Feb 2005 | US |