The present invention generally relates to logical partitioning in a (e.g., symmetric) multi-core processor (MCP). Specifically, the present invention allows virtualized control threads to traverse physical partitions in a MCP to control logical partitions of sub-processing elements (SPEs).
Low utilization has been a major drawback of symmetric Multi-Core Processors (MCPs). Also, design inflexibility forces continuous leakage current in the unloaded and stand-by sub-elements, such as Sub-Processing Elements (SPEs), so that power is wasted. For example, in a symmetric MCP, there can be a Main Processing Element (MPE) and 8 SPEs. In many cases, only a portion of the SPEs is utilized and the overall MCP utilization is usually low. Such stand-by SPEs consume high levels of power and continuously leak. Typically, a MCP is used for high performance digital processor scaling, but due to the complexity of the MCP design, the utilization and the efficiency of the software become challenging to optimize as the MCP dimension increases.
This disclosure describes an apparatus, computer architecture, method, operating system, compiler, and application program products for MPEs as well as virtualization across physical boundaries that define physical partitions in a symmetric MCP. Among other things, the disclosure is applied to a generic microprocessor architecture with a set (e.g., one or more) of controlling/main processing elements (e.g., MPEs) and a set of groups of sub-processing elements (e.g., SPEs). Under this arrangement, MPEs and SPEs are organized so that a smaller number of MPEs control the behavior of a group of SPEs using program code embodied as a set of virtualized control threads. The apparatus includes a MCP coupled to a power supply coupled with cores to provide a supply voltage to each core (or core group) and controlling-digital elements and multiple instances of sub-processing elements. The MCP is partitioned physically in a regular formation to reduce hardware and compiler design complexity; for example, one MPE controls M SPEs, and there are N groups of one MPE plus M SPEs in the MCP. As such, virtualized control threads can traverse the physical boundaries of the MCP to control SPE(s) (e.g., logical partitions having one or more SPEs) in a different physical partition (e.g., different from the physical partition from which the virtualized control thread originated).
A first aspect of the present invention provides a multi-core processor, comprising: a first physical partition comprising a first main processing element, and a first logical partition of sub-processing elements; a second physical partition comprising a second main processing element; and a first virtualized control thread associating the second main processing element of the second physical partition with the sub-processing elements of the first logical partition.
A second aspect of the present invention provides a multi-core processor, comprising: a first physical partition comprising a first main processing element, and a first group of sub-processing elements; a second physical partition comprising a second main processing element and a second group of sub-processing elements; a logical partition comprising a third group of sub-processing elements; and a virtualized control thread associating the second main processing element of the second physical partition with the logical partition.
A third aspect of the present invention provides a processing method, comprising: associating a main processing element of a first physical partition with a logical partition comprising sub-processing elements using a virtualized control thread; and controlling the group of sub-processing elements using the virtualized control thread.
A fourth aspect of the present invention provides a method for deploying a processing system, comprising: providing a multi-core processor, comprising: a first physical partition comprising a first main processing element, and a first logical partition of sub-processing elements; a second physical partition comprising a second main processing element; and a first virtualized control thread associating the second main processing element of the second physical partition with the sub-processing elements of the first logical partition.
A fifth aspect of the present invention provides a computer-implemented business method, comprising: associating a main processing element of a first physical partition with a logical partition comprising sub-processing elements using a virtualized control thread; and controlling the group of sub-processing elements using the virtualized control thread.
These and other features of this invention will be more readily understood from the following detailed description of the various aspects of the invention taken in conjunction with the accompanying drawings in which:
The drawings are not necessarily to scale. The drawings are merely schematic representations, not intended to portray specific parameters of the invention. The drawings are intended to depict only typical embodiments of the invention, and therefore should not be considered as limiting the scope of the invention. In the drawings, like numbering represents like elements.
This disclosure describes an apparatus, computer architecture, method, operating system, compiler, and application program products for MPEs as well as virtualization across physical boundaries that define physical partitions in a symmetric MCP. Among other things, the disclosure is applied to a generic microprocessor architecture with a set (e.g., one or more) of controlling/main processing elements (e.g., MPEs) and a set of groups of sub-processing elements (e.g., SPEs). Under this arrangement, MPEs and SPEs are organized so that a smaller number of MPEs control the behavior of a group of SPEs using program code embodied as a set of virtualized control threads. The apparatus includes a MCP coupled to a power supply coupled with cores to provide a supply voltage to each core (or core group) and controlling-digital elements and multiple instances of sub-processing elements. The MCP is partitioned physically in a regular formation to reduce hardware and compiler design complexity; for example, one MPE controls M SPEs, and there are N groups of one MPE plus M SPEs in the MCP. As such, virtualized control threads can traverse the physical boundaries of the MCP to control SPE(s) (e.g., logical partitions having one or more SPEs) in a different physical partition (e.g., different from the physical partition from which the virtualized control thread originated).
In a conventional design, each MPE (V_MPE,1 to V_MPE,n) controls a portion of the SPEs as a group (G_SPE,1 to G_SPE,n). The MPE activity is interpreted as a virtualization V_MPE,n, and the assigned SPEs are considered as the first group G_SPE,n in each partition. Adopting the above method allows the following:
A MPE virtualizes (V) a thread;
Free SPEs across the physical partitions are accounted for;
Requested SPEs are assigned to the Vk; and/or
Vk controls SPEs within the logical partition.
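The four steps above can be sketched as a small bookkeeping model. This is only an illustration under stated assumptions: the class names `SPE` and `ControlThread`, the helper functions, and the layout of two physical partitions with two SPEs each are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SPE:
    spe_id: int
    partition: int                 # physical partition the SPE sits in
    owner: Optional[str] = None    # controlling thread name, None if free


@dataclass
class ControlThread:
    name: str                      # e.g. "V1" (hypothetical naming)
    home_partition: int            # partition of the MPE that created it
    logical_partition: List[SPE] = field(default_factory=list)


def free_spes(spes):
    """Account for free SPEs across all physical partitions."""
    return [s for s in spes if s.owner is None]


def assign(thread, spes, count):
    """Assign `count` free SPEs to the thread, forming its logical partition."""
    free = free_spes(spes)
    if len(free) < count:
        raise RuntimeError("not enough free SPEs")
    for spe in free[:count]:
        spe.owner = thread.name
        thread.logical_partition.append(spe)
    return thread.logical_partition


# Hypothetical layout: 2 physical partitions with 2 SPEs each.
spes = [SPE(i, partition=i // 2) for i in range(4)]
spes[0].owner = "V0"                       # one SPE is already loaded
v1 = ControlThread("V1", home_partition=0)
assign(v1, spes, 2)                        # V1 now spans partitions 0 and 1
```

In this sketch the logical partition of `V1` ends up containing one SPE from each physical partition, which is the cross-boundary case the steps describe.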
Under related art systems, such as that shown in
To address these issues, a configuration such as that shown in
In addition, virtualized control threads V1-VN will be generated by and/or embodied within MPEs 24A-N. Such threads V1-VN will control groups G1-GN of SPEs 26A-N and/or logical groups L1-LN of SPEs 28A-N. Along these lines, a virtualized control thread can work within its own physical partition, as V2 is shown doing, or it can transcend/traverse the physical barriers of physical partitions, as V1 and VN are shown doing. In
In controlling groups of logical partitions of SPEs, virtualized control threads can perform many functions. These include, but are not limited to: sending program code and data to the sub-processing elements; collecting computation results from the sub-processing elements; and controlling the clock speed, power consumption, and computation loading of the sub-processing elements. Regardless, MPEs 24A-N can log all events, information exchanges, controls, etc., taking place. These and other details are further shown below:
All log(s) are stored and made accessible to all MPEs.
The computation-loaded SPEs and the unloaded/free SPEs are accounted for in the log.
The virtualization is initiated by either a software or a hardware request.
An Operating System (OS) is designed to enable multiple independent threads with virtualization.
Computer programming languages and machine code compilers support the virtualization with libraries.
Upon request, the MPE will request a number of SPEs (out of the free/unused SPEs) to form a group (GN).
The group efficiency is optimized by maximizing the bus allocation and connectivity.
The MPE runs all the virtualizations V0 . . . VN with time-division resource sharing, and OS-level preemptive task management allows multi-tasking and virtualization.
If the requested number of SPEs is greater than the number of free SPEs available in the MPE's physical partition, the nearest free SPEs in the nearby physical partitions are summoned to join the logical group.
A group of SPEs (G) is assigned to the virtualized control thread (V).
The MPE maintains different virtualized control threads over the SPEs, as shown in the diagram.
The virtualized control thread V loads executable program code and data to the SPE group G, monitors its progress, and controls the performance and power consumption of G.
When G produces computation results, V forwards the results to the MPE, allowing further computations or external I/O.
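The rule for summoning the nearest free SPEs from nearby physical partitions can be illustrated with a short sketch. It assumes, purely for illustration, that physical partitions are indexed along a line so that "nearest" means the smallest index distance from the home partition; the disclosure does not define a distance metric, and the function name `form_group` and the data layout are hypothetical.

```python
def form_group(free_by_partition, home, requested):
    """Pick `requested` free SPEs, preferring the home partition first.

    free_by_partition: dict mapping partition index -> list of free SPE ids.
    Returns a list of (partition, spe_id) pairs, or None if the request
    cannot be satisfied. Distance is |partition - home| (an assumption).
    """
    # Visit the home partition first, then partitions by increasing distance.
    order = sorted(free_by_partition, key=lambda p: (abs(p - home), p))
    group = []
    for p in order:
        for spe in free_by_partition[p]:
            if len(group) == requested:
                return group
            group.append((p, spe))
    if len(group) < requested:
        return None          # not enough free SPEs anywhere in the MCP
    return group


# Hypothetical free-SPE log: partition 0 has one free SPE, partition 1 has three.
free_log = {0: [3], 1: [4, 5, 6], 2: [8]}
local_and_borrowed = form_group(free_log, home=0, requested=3)
```

Here the request for three SPEs from partition 0 takes its single local free SPE and borrows the remaining two from the adjacent partition 1.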
The total number of virtualized control threads is limited by the capacity of the MPE to hold virtualizations and by the number of available SPEs. Further requests for virtualization can be denied, or accepted through time-division thread sharing. The virtualized control threads (V) in the MPEs control the power supply voltage and clock frequency of each SPE group (Gk), so that the active and stand-by currents are optimized for the required computations. Usually, a digital circuit runs faster when the supply voltage is raised, so the clock speed can be increased accordingly. The supply voltage is adjusted within its allowed range based on the computation requirements.
When the loaded computation requires intensive operation, the voltage and clock frequency are raised so that it is completed within the time frame. When the loaded computation is light and there is plenty of time for processing, the voltage and the clock frequency are lowered to maximize the performance/power ratio. An extreme case is when SPEs are standing by. The supply voltage can be reduced to 0 volts to put the SPE in sleep mode. Slightly higher voltages can be used as a trade-off between leakage current and wake-up time, since it takes more time to wake an element when it is in a deeper sleep mode. Among others, some benefits of this approach are: (1) the MCP utilization and computation capacity are maximized and (2) the power efficiency of the MCP is optimized. Overall, the MCP improves the performance/power ratio to enable greener computing.
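The admission limits described above can be expressed as a simple decision function. This is a minimal sketch, not the disclosed mechanism: the function name `admit`, its parameters, and the three string outcomes are hypothetical, and the choice between denying and time-sharing is modeled as a single flag.

```python
def admit(active_threads, mpe_capacity, free_spes, requested_spes,
          allow_time_sharing=True):
    """Decide how a new virtualization request is handled.

    Grants the request when the MPE can hold another control thread and
    enough SPEs are free; otherwise falls back to time-division sharing
    or denies the request.
    """
    if active_threads < mpe_capacity and requested_spes <= free_spes:
        return "granted"
    return "time_shared" if allow_time_sharing else "denied"
```

For example, `admit(2, 4, free_spes=3, requested_spes=2)` is granted, while the same request against an MPE already holding four threads is time-shared or denied depending on the flag.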
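The voltage and frequency policy described above can be sketched as choosing the lowest operating point that still meets the deadline. The operating-point table, its voltage and frequency values, and the function name `select_operating_point` are assumptions made for illustration; the disclosure does not give specific values.

```python
# Hypothetical (supply voltage in volts, clock frequency in Hz) pairs.
OPERATING_POINTS = [(0.8, 400e6), (1.0, 800e6), (1.2, 1200e6)]


def select_operating_point(cycles, deadline_s, points=OPERATING_POINTS):
    """Return the lowest (voltage, frequency) that finishes `cycles` in time.

    A stand-by group with no work is put to sleep at 0 V. If no point
    meets the deadline, the fastest point is returned as a best effort.
    """
    if cycles == 0:
        return (0.0, 0.0)                      # sleep mode: supply at 0 V
    for voltage, freq in sorted(points, key=lambda p: p[1]):
        if cycles / freq <= deadline_s:
            return (voltage, freq)             # slowest point that is fast enough
    return max(points, key=lambda p: p[1])     # intensive load: run flat out


light = select_operating_point(2e8, deadline_s=1.0)   # light load
heavy = select_operating_point(6e8, deadline_s=1.0)   # heavier load
```

A deeper sleep state saves more leakage but wakes more slowly, so a fuller model would also weigh wake-up time before selecting the 0 V point; that trade-off is omitted here.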
It should be understood that the present invention could be deployed on one or more computing devices (e.g., servers, clients, etc.) within a computer infrastructure. This is intended to demonstrate, among other things, that the present invention could be implemented within a network environment (e.g., the Internet, a wide area network (WAN), a local area network (LAN), a virtual private network (VPN), etc.), or on a stand-alone computer system. In the case of the former, communication throughout the network can occur via any combination of various types of communications links. For example, the communication links can comprise addressable connections that may utilize any combination of wired and/or wireless transmission methods. Where communications occur via the Internet, connectivity could be provided by conventional TCP/IP sockets-based protocol, and an Internet service provider could be used to establish connectivity to the Internet. Still yet, the computer infrastructure is intended to demonstrate that some or all of the components of such an implementation could be deployed, managed, serviced, etc., by a service provider who offers to implement, deploy, and/or perform the functions of the present invention for others.
Where computer hardware is provided, it is understood that any computers utilized will include standard elements such as a processing unit, a memory medium, a bus, and input/output (I/O) interfaces. Further, such computer systems can be in communication with external I/O devices/resources. In general, processing units execute computer program code, such as the functionality described above (e.g., all libraries discussed herein), which is stored within memory medium(s). While executing computer program code, the processing unit can read and/or write data to/from memory, I/O interfaces, etc. The bus provides a communication link between each of the components in a computer. External devices can comprise any device (e.g., keyboard, pointing device, display, etc.) that enables a user to interact with the computer system and/or any device (e.g., network card, modem, etc.) that enables the computer to communicate with one or more other computing devices.
The hardware used to implement the present invention can comprise any specific purpose computing article of manufacture comprising hardware and/or computer program code for performing specific functions, any computing article of manufacture that comprises a combination of specific purpose and general purpose hardware/software, or the like. In each case, the program code and hardware can be created using standard programming and engineering techniques, respectively. Moreover, the processing unit therein may comprise a single processing unit, or be distributed across one or more processing units in one or more locations, e.g., on a client and server. Similarly, the memory medium can comprise any combination of various types of data storage and/or transmission media that reside at one or more physical locations. Further, the I/O interfaces can comprise any system for exchanging information with one or more external devices. Still further, it is understood that one or more additional components (e.g., system software, math co-processing unit, etc.) can be included in the hardware.
While shown and described herein as virtualization in a multi-core processor, it is understood that the invention further provides various alternative embodiments. For example, in one embodiment, the invention provides a computer-readable/useable medium that includes computer program code to enable a computer infrastructure to provide virtualization in a multi-core processor. To this extent, the computer-readable/useable medium includes program code that implements the process(es) of the invention. It is understood that the terms computer-readable medium or computer useable medium comprise one or more of any type of physical embodiment of the program code. In particular, the computer-readable/useable medium can comprise program code embodied on one or more portable storage articles of manufacture (e.g., a compact disc, a magnetic disk, a tape, etc.), on one or more data storage portions of a computing device (e.g., a fixed disk, a read-only memory, a random access memory, a cache memory, etc.), and/or as a data signal (e.g., a propagated signal) traveling over a network (e.g., during a wired/wireless electronic distribution of the program code).
In another embodiment, the invention provides a method (e.g., business) that performs the process of the invention on a subscription, advertising, and/or fee basis. That is, a service provider, such as a Solution Integrator, could offer to provide virtualization in a multi-core processor. In this case, the service provider can create, maintain, support, etc., a computer infrastructure, such as computer infrastructure that performs the process of the invention for one or more customers. In return, the service provider can receive payment from the customer(s) under a subscription and/or fee agreement and/or the service provider can receive payment from the sale of advertising content to one or more third parties.
In still another embodiment, the invention provides a processing method. In this case, a computer infrastructure can be provided and one or more systems for performing the process of the invention can be obtained (e.g., created, purchased, used, modified, etc.) and deployed to the computer infrastructure. To this extent, the deployment of a system can comprise one or more of: (1) installing program code on a computing device from a computer-readable medium; (2) adding one or more computing devices to the computer infrastructure; and (3) incorporating and/or modifying one or more existing systems of the computer infrastructure to enable the computer infrastructure to perform the process of the invention.
As used herein, it is understood that the terms “program code” and “computer program code” are synonymous and mean any expression, in any language, code or notation, of a set of instructions intended to cause a computing device having an information processing capability to perform a particular function either directly or after either or both of the following: (a) conversion to another language, code or notation; and/or (b) reproduction in a different material form. To this extent, program code can be embodied as one or more of: an application/software program, component software/a library of functions, an operating system, a basic I/O system/driver for a particular computing and/or I/O device, and the like.
A data processing system suitable for storing and/or executing program code can be provided hereunder and can include at least one processor communicatively coupled, directly or indirectly, to memory element(s) through a system bus. The memory elements can include, but are not limited to, local memory employed during actual execution of the program code, bulk storage, and cache memories that provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution. Input/output or I/O devices (including, but not limited to, keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers.
Network adapters also may be coupled to the system to enable the data processing system to become coupled to other data processing systems, remote printers, storage devices, and/or the like, through any combination of intervening private or public networks. Illustrative network adapters include, but are not limited to, modems, cable modems and Ethernet cards.
The foregoing description of various aspects of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed, and obviously, many modifications and variations are possible. Such modifications and variations that may be apparent to a person skilled in the art are intended to be included within the scope of the invention as defined by the accompanying claims.
This application is a continuation of, and claims the benefit of, co-pending and co-owned U.S. patent application Ser. No. 12/241,429, filed Sep. 30, 2008, the entire contents of which are herein incorporated by reference. The present invention is related in some aspects to co-pending and commonly owned application entitled Virtualization in a Multi-Core Processor (MCP), which was filed on Sep. 11, 2008, and assigned application Ser. No. 12/208,651, the entire contents of which are herein incorporated by reference. The present application is also related in some aspects to co-pending and commonly owned application entitled Delegated Virtualization in a Multi-Core Processor (MCP), which was filed on Sep. 30, 2008, and assigned application Ser. No. 12/241,332, the entire contents of which are herein incorporated by reference.
Number | Date | Country |
---|---|---|
20140259013 A1 | Sep 2014 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 12241429 | Sep 2008 | US |
Child | 14281062 | | US |