The present invention is generally related to a computer chip architecture having multiple processors on a single die. More particularly, the present invention is related to a multiprocessing chip utilizing multiple operating systems.
Existing internet data centers (IDCs) pack hundreds of processors (e.g., servers and the like) in a single building for processing a large volume of data transactions. Generally, the compute density, or number of nodes per volume, defines the efficiency of the IDC. The compute density affects the amortization of the high cost of the IDC infrastructure (e.g., networking, power, cooling, maintenance, reliability, and availability support). Typically, the greater the compute density, the better the IDC will be able to amortize the high cost of IDC infrastructure. Accordingly, a large compute density may be preferred. However, space may be unavailable or costly for locating the large number of processors necessary for maintaining a large compute density.
To provide increased compute density, multiprocessing schemes that utilize multiple processors have been developed. One conventional multiprocessing scheme (shown in
A second conventional multiprocessing scheme (shown in
Schemes that have placed multiple processors on a single chip typically utilize a single operating system for tying all the processors together. A well-known limitation of this scheme and other multiprocessing schemes utilizing a single operating system is that an operating system does not scale well to large numbers of processors. That is, as the number of processors managed by a single operating system increases, the efficiency of the operating system drops sharply. For example, an operating system typically includes internal data structures that may limit the number of processors that can be supported, and limited bandwidth on a bus may slow transactions. Thus, scaling becomes impractical above some small number (e.g., currently about four to at most 64 processors, depending on the operating system in question).
Bugnion et al., in U.S. Pat. No. 6,075,938, disclose using a cache coherent non-uniform memory architecture (CC-NUMA) that supports multiple processors executing multiple operating systems. However, Bugnion et al. disclose multiple virtual processors, implemented in software on a single physical processor. This architecture fails to provide multiple physical processors, implemented in hardware, on a single die. Accordingly, this architecture suffers a performance penalty, because the single physical processor must task switch among multiple virtual processors (only one virtual processor can be running on the physical processor at any given time). Moreover, if this architecture were to support multiple physical processors, it would need space for providing multiple dies, and processing speed would consequently be sacrificed due to the input/output procedures needed to communicate among the multiple separate processors and the memory.
An aspect of the present invention is to provide a multiprocessing system including multiple processors mounted on a single die. The multiple processors are connected to a memory storing multiple operating systems. Each of the multiple processors may execute one of the multiple operating systems.
Another aspect of the present invention is to provide a multiprocessing system including a plurality of processor groups mounted on a single die. The processor groups are connected to a memory storing multiple operating systems. Each of the processor groups may execute one of the multiple operating systems. The processor group may include one or more processors mounted on the die.
Certain embodiments of the present invention are capable of achieving certain advantages, including some or all of the following: mounting multiple processors on a single die reduces the cabling problem inherent in connecting multiple processors on separate dies in separate housings; mounting multiple processors on a single die reduces the latency of communication among the processors and improves the efficiency of message passing, potentially enabling a whole new class of applications (e.g., data mining) to run on such a multiprocessing system; mounting multiple processors on a single die reduces chip-to-chip communication costs and leads to further power efficiency; and scalability for multiprocessing is increased.
Those skilled in the art will appreciate these and other advantages and benefits of various embodiments of the invention upon reading the following detailed description of a preferred embodiment with reference to the below-listed drawings.
The present invention is illustrated by way of example and not limitation in the accompanying figures in which like numeral references refer to like elements, and wherein:
In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one of ordinary skill in the art that these specific details need not be used to practice the present invention. In other instances, well known structures, interfaces, and processes have not been shown in detail in order not to unnecessarily obscure the present invention.
Processors 305-320 may be configured such that each processor executes its own operating system. Multiple processors (e.g., multiple processors in a processor group) may also be configured to execute a single operating system.
The operating systems shown in
It will be apparent to one of ordinary skill in the art that a system employing the principles of the present invention may be operable to support both processor groups and processors on a single die, such that each processor group and processor not within a processor group executes a distinct operating system.
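The mixed arrangement described above can be pictured as a simple mapping from execution units to operating system instances. The following Python sketch is illustrative only and is not part of the disclosed embodiments: the names `ExecutionUnit`, `assign_operating_systems`, the processor identifiers (`cpu0`, ...), and the image labels (`os_image_0`, ...) are all hypothetical. It assumes that each unit is either a processor group or a lone processor, and that no processor belongs to more than one unit.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ExecutionUnit:
    """A processor group, or a single processor outside any group (hypothetical model)."""
    name: str
    processors: Tuple[str, ...]  # identifiers of the processors on the die in this unit


def assign_operating_systems(units: List[ExecutionUnit],
                             os_images: List[str]) -> Dict[str, str]:
    """Pair each execution unit with its own operating system instance.

    Raises ValueError if a processor appears in more than one unit, or if
    there are fewer OS instances than units.
    """
    seen = set()
    for unit in units:
        for cpu in unit.processors:
            if cpu in seen:
                raise ValueError(f"processor {cpu} is assigned to more than one unit")
            seen.add(cpu)
    if len(os_images) < len(units):
        raise ValueError("not enough operating system instances for all units")
    # One OS instance per unit; surplus instances are simply left unused.
    return {unit.name: image for unit, image in zip(units, os_images)}


# Assumed layout: one two-processor group plus two ungrouped processors.
units = [
    ExecutionUnit("group_a", ("cpu0", "cpu1")),
    ExecutionUnit("cpu2", ("cpu2",)),
    ExecutionUnit("cpu3", ("cpu3",)),
]
assignment = assign_operating_systems(
    units, ["os_image_0", "os_image_1", "os_image_2"]
)
```

In this sketch the group `group_a` shares one operating system instance across its two processors, while `cpu2` and `cpu3` each receive their own.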
Having each processor (or processor group) executing its own independent operating system minimizes scaling problems realized when utilizing a single operating system with multiple processors. For example, one hundred processors independently executing one hundred operating systems may be no more of a problem than one processor executing one operating system. The embodiments described above are operable to provide this type of scaling on a single chip.
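The scaling argument can be illustrated with a small partitioning sketch. It assumes, purely for illustration, that a given operating system manages at most `max_per_os` processors efficiently; the function name `partition_processors` and the limit values are hypothetical and are not taken from the embodiments. The processors on the die are split into groups no larger than that limit, and each group would then run its own operating system instance.

```python
from typing import List


def partition_processors(processor_ids: List[int], max_per_os: int) -> List[List[int]]:
    """Split processors into consecutive groups of at most max_per_os members.

    Each returned group is intended to run one independent OS instance, so no
    single OS instance manages more processors than the assumed limit.
    """
    if max_per_os < 1:
        raise ValueError("max_per_os must be at least 1")
    return [
        processor_ids[start:start + max_per_os]
        for start in range(0, len(processor_ids), max_per_os)
    ]


# One hundred processors, with an assumed per-OS limit of four processors.
groups_of_four = partition_processors(list(range(100)), 4)

# The limiting case from the text: one operating system per processor.
one_per_os = partition_processors(list(range(100)), 1)
```

With a limit of four, the hundred processors fall into 25 groups; with a limit of one, every processor runs its own operating system, which corresponds to the one-hundred-operating-systems case described above.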
While this invention has been described in conjunction with the specific embodiments thereof, it is evident that many alternatives, modifications and variations will be apparent to those skilled in the art. There are changes that may be made without departing from the spirit and scope of the invention.
Number | Name | Date | Kind |
---|---|---|---|
4709325 | Yajima | Nov 1987 | A |
5201040 | Wada et al. | Apr 1993 | A |
5301324 | Dewey et al. | Apr 1994 | A |
5446841 | Kitano et al. | Aug 1995 | A |
5513346 | Satagopan et al. | Apr 1996 | A |
6075938 | Bugnion et al. | Jun 2000 | A |
6108731 | Suzuki et al. | Aug 2000 | A |
6314501 | Gulick et al. | Nov 2001 | B1 |
6526462 | Elabd | Feb 2003 | B1 |
6658591 | Arndt | Dec 2003 | B1 |
6772241 | George et al. | Aug 2004 | B1 |
Number | Date | Country |
---|---|---|
20020184328 A1 | Dec 2002 | US |