The invention relates to computing systems and, more particularly, to forming partitions within a multiprocessing computing environment.
Computers are becoming increasingly complex, and sophisticated computers may include multiple processors as well as multiple memory units. These multiprocessing computers may further connect together to form a large mainframe computer, where each multiprocessing computer may share its processing power and memory with one or more of the other multiprocessing computers via communication interfaces. In the mainframe context, each of these multiprocessing computers is typically referred to as a “cell,” and the cell represents the basic execution building block of the mainframe.
When the administrator logically associates two or more cells to define a single execution environment, the combination of these cells is referred to as a “partition” on which an instance of an operating system is run. It is generally desirable to combine the processing power and memory of multiple cells into a partition to perform complicated tasks that a single cell alone could not complete within a reasonable amount of time. Traditional mainframe computers include a single, system-level processor or service processor that is solely responsible for partitioning the cells and managing the partitions. Further, a typical mainframe computer allows for multiple partitions to be specified, and each partition can combine varying numbers of cells to tailor their combined processing power to the task at hand.
While mainframe computers offer significant processing power and enable task-specific assignment of processing power by way of partitions, mainframe computers generally are not readily scalable. That is, the logical addition and removal of cells to and from partitions within the mainframe computer can be a manual, time-consuming process that requires significant knowledge of the existing cells and partitions. For example, an administrator may have to keep track of partition identifiers that identify each partition, cell identifiers that identify each cell, and which cells belong to which partitions (or, more likely, which cell identifiers belong to which partition identifiers), as well as higher level concerns, such as which cells maintain which resources and which partitions require additional cell resources to achieve a certain partition configuration. Thus, from an administrative perspective, conventional mainframe computers are not readily scalable.
In general, the invention is directed to techniques for performing decentralized hardware partitioning within a multiprocessing computing system. Decentralized hardware partitioning can be achieved by allocating one service processor per cell and allowing those cells to independently control their partitioning. Because each cell internally determines its own unique partition identifier, all cells can exist independently of one another yet, at the same time, work together with other cells whose internally determined partition identifiers share common attributes to automatically form a multi-cell partition in a decentralized manner. Thus, each cell holds its own view of the logical partitioning of the multiprocessing computer system, and the cells utilize their partition identifiers to communicate with other cells and create partitions without requiring system-level control.
As one example, a multiprocessing computing system includes at least two (i.e., first and second) independent computing cells, where each of the first and second cells includes a service processor that negotiates with the service processor of the other cell to control the logical inclusion of the respective cell within one or more partitions. Because the cells themselves automatically perform the partitioning process, the multiprocessing system may achieve increased scalability. For example, the administrator need not manually keep track of the multitude of partition identifiers, cell identifiers, and associations between these identifiers. The administrator therefore may focus on the more important issues of cell maintenance, partition performance, and other higher level concerns. Moreover, the ability of the cells to perform this “decentralized hardware partitioning” technique, as it is referred to herein, may reduce the overall cost of the system because a dedicated, system-level processor may no longer be required to perform partitioning. Instead, less expensive baseboard management controllers (BMCs) can perform the management and partitioning duties of a system-level service processor.
Continuing the example, the first cell includes a first processor that internally calculates a partition identifier. The partition identifier is a logical construct that uniquely identifies a partition to which the cell belongs. As an example, the partition identifier may comprise a bit field based upon a cell identifier by which each cell within the multiprocessing computing system is uniquely identified. A partition identifier comprising a two-bit, binary bit field of “11,” for example, indicates that cells identified by cell identifiers “0” and “1” fall within the partition in accordance with conventional bitmask techniques. Using this partition calculation, the first and second cells need only receive a selection of which cells belong to a particular partition, and the cells themselves can calculate the partition identifier based on the received selection information. In this example, the partition identifier is said to be “3”.
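The bit-field construction described above can be sketched in code as follows. This is an illustrative sketch only; the function and variable names are not taken from this disclosure:

```python
def partition_id(cell_ids):
    """Build a partition identifier as a bit field in which bit i is set
    when the cell with cell identifier i belongs to the partition."""
    pid = 0
    for cid in cell_ids:
        pid |= 1 << cid  # set the bit at the position given by the cell ID
    return pid

# Cells "0" and "1" produce the bit field "11", i.e. a partition identifier of 3.
print(partition_id({0, 1}))  # -> 3
```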
Once calculated, the first processor can determine whether the partition includes the second cell. Referring to the above example, the first processor could determine that the second processor, identified by cell identifier “1,” belongs to its partition identified by partition identifier “11” through conventional bitmask operations. Based on this analysis, the first processor establishes the partition within the multiprocessing computing system. The partition may include only the first cell or both the first and second cell, as in the above example. If the partition includes only the first cell, the first cell executes a single operating system constrained to the single cell. If the established partition successfully includes both the first and second cells, both the first and second cells cooperate to execute the single operating system across both cells. Thus, a multiprocessing computing system can form a partition in accordance with the decentralized hardware partitioning techniques with minimal administrative oversight and in a manner that possibly reduces the overall cost of the system.
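The membership determination by conventional bitmask operations may be sketched as follows (again an illustration; the names are assumptions, not part of this disclosure):

```python
def partition_includes(pid, cell_id):
    """Conventional bitmask membership test: the cell belongs to the
    partition when bit cell_id of the partition identifier is set."""
    return bool(pid & (1 << cell_id))

# Partition "11" (decimal 3) includes cells "0" and "1" but not "2" or "3".
for cid in range(4):
    print(cid, partition_includes(3, cid))
```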
In one embodiment, a method of partitioning a multiprocessing computing system having a plurality of independent computing cells comprises calculating a respective partition identifier with each of the computing cells of the multiprocessing computing system, wherein the partition identifier uniquely identifies a partition to which the corresponding cell belongs. The method further comprises reconfiguring the multiprocessing system to establish one or more partitions within the multiprocessing computing system based on the partition identifiers calculated by the computing cells and executing a respective instance of an operating system across each of the partitions, wherein each partition comprises a logical association of one or more of the plurality of cells to define a single execution environment.
In another embodiment, a multiprocessing computing system comprises a plurality of computing cells that each calculates a respective partition identifier, wherein the partition identifier uniquely identifies a partition to which the corresponding cell belongs. The plurality of cells further collectively reconfigure the multiprocessing system to establish one or more partitions within the multiprocessing computing system based on the respective partition identifiers and collectively execute a respective instance of an operating system across each of the partitions, wherein each partition comprises a logical association of one or more of the plurality of cells to define a single execution environment.
In another embodiment, a multiprocessing computing system comprises a plurality of computing means for calculating a respective partition identifier, wherein the partition identifier uniquely identifies a partition to which the corresponding computing means belongs. The plurality of computing means are further for reconfiguring the multiprocessing system to establish one or more partitions within the multiprocessing computing system based on the respective partition identifiers, and executing a respective instance of an operating system across each of the partitions, wherein each partition comprises a logical association of one or more of the plurality of computing means to define a single execution environment.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
Cells 14 may be substantially similar to one another and may comprise substantially similar components. For example, each of cells 14 includes respective input/output interfaces 16A-16N (“I/O interfaces 16” in
When two or more cells 14 are logically associated to combine computing power and memory storage space to form a single execution environment on which an instance of an operating system can be run, the two or more cells 14 are said to operate within a “partition.” A partition is typically the next highest building block above a cell in a multiprocessing computing environment, such as exemplary cellular multiprocessing computing system 12. A partition, however, does not require two or more of cells 14 and may be formed upon a single one of cells 14 that is configured to operate independently from the other cells 14. That is, a partition, as it relates to the decentralized hardware partitioning techniques described herein and unlike conventional notions of partitions, may be formed from a single one of cells 14 that operates independently of the other cells 14 or two or more of cells 14 that are logically associated to combine resources. Thus, in an example embodiment where multiprocessing computing system 12 comprises 32 cells 14, 32 partitions may be formed by those 32 cells 14.
Cells 14 may communicate with one another via respective I/O interfaces 16. That is, I/O interfaces 16 of cells 14 enable communication with other I/O interfaces 16 of other cells and/or a network. I/O interfaces 16 may be dedicated to general purpose communication tasks, unrelated to the monitoring and low-level configuration of cells 14 described below. I/O interface 16 may operate according to standard communication protocols and provide the physical interface to other I/O interfaces 16 and/or the network over which information may be conveyed. For example, I/O interfaces 16 may each comprise a standard Ethernet interface and associated Ethernet protocols to enable multicast communications across a common Ethernet interconnect. Alternatively, or in conjunction with the above exemplary Ethernet interconnect, I/O interfaces 16 may comprise interfaces for communicating with a network, such as a packet-based network, and associated protocols, such as transmission control protocol/internet protocol (TCP/IP), for sending and receiving data across the network. In either case, I/O interface 16 may provide a wired or wireless connection between cells 14 and/or a network, where a wireless connection may occur according to one of the Institute of Electrical and Electronics Engineers 802.11 standards, Bluetooth™, or any other wireless communication standard. For ease of illustration purposes, the interconnectivity between I/O interfaces 16 is not shown, although any communicative mediums corresponding to the above listed networks may be employed.
Each of cells 14 further includes a processor cluster 18 that communicates across a cell interconnect. Processor clusters 18 may include any number of processors coupled together in a cluster formation so as to concurrently execute operations. When multiple cells 14 are logically associated as part of the same partition, processor clusters 18 of each of the cells, as described below, provide resources for execution of a single instance of an operating system (not shown in
BMCs 22 generally manage the interface between the system management software and the platform hardware. For example, BMCs 22 may receive reports from sensors (not shown in
As shown in
I/O interfaces 26 generally provide an interface to interconnect 33. Interconnect 33 may comprise an Ethernet interconnect that conforms to one of the various Institute of Electrical and Electronics Engineers (IEEE) 802.3 standards. The Ethernet interconnect may take many different forms or topologies, including a star topology and a bus topology. Via either of these topologies or any other topology, interconnect 33 may couple each of BMCs 22 of cells 14 together. For ease of illustration, supporting modules or mechanisms necessary to implement interconnect 33 are not shown in
BMCs 22 may also include respective memories 28A-28N (“memories 28”) that may store these reports and alerts locally. Both memories 20 and 28 may comprise any volatile memory such as random access memory (RAM), dynamic RAM (DRAM), static RAM (SRAM), or non-volatile memory such as, a magnetic data storage device or optical data storage device, FLASH memory, or electrically erasable programmable read only memory (EEPROM).
While the above generally describes the functionality of BMCs 22, BMCs 22 further perform functions specific to the decentralized hardware partitioning technique described herein. In particular, each of BMCs 22 may receive partition selection information 30A-30N (“selection information 30” in
Administrator 10 may select which of cells 14 belong in a particular partition via user interface 32. In some embodiments, user interface 32 may present a graphical user interface whereby administrator 10, for example, merely “drags and drops” icons representing cells into configurable areas of a screen that each represent a different partition. In these embodiments, administrator 10 need not even know of or contemplate lower-level identifiers or other configuration data, such as the below described cell and partition identifiers. Alternatively, in other embodiments, selection information 30 may be preprogrammed by administrator 10 or other multiprocessing technical specialists such that a particular partitioning scheme is defined. In yet other embodiments, each of cells 14 may default into their own partition such that no two of cells 14 combine in a single partition. Typically, cells 14 implement this default partitioning scheme in the absence of a preprogrammed or administrator selected scheme. Regardless of which embodiment is implemented, administrator 10 need not know of or manually maintain the underlying partition and/or cell identifiers.
After BMCs 22 store respective selection information 30, each of maintenance processors 24 respectively and internally calculates partition identifiers 36A-36N (“partition IDs 36” in
The bit field may be employed, in some embodiments, in a manner similar to that of conventional bitmask operations. Alternatively, the bit field may be employed in a manner that differs substantially from that of conventional bitmask operations. For example, the bit field described herein, while representative of which of cells 14 belong to a given partition, may be summed to generate a partition ID. The partition ID is compared with other partition IDs generated by other cells 14 to determine whether to form a partition. These other partition IDs are each calculated internally as described above and communicated to each of maintenance processors 24 in the manner described below to facilitate the comparison.
The bit field generated after analyzing selection information 30 is stored to respective memories 28 as partition ID 36. In some embodiments, partition IDs 36 further indicate within the bit field a master cell for each partition. A designated master cell of cells 14 is generally responsible for initiating the process by which a particular partition is formed.
After storing partition IDs 36, maintenance processors 24 analyze their respective partition IDs 36 to determine whether the partition includes other of cells 14. Based on this determination, one or more processors within processor clusters 18 establish the partition within the cellular multiprocessing computing system. That is, one or more of cells 14 then exchange respective partition IDs 36 (or one of the cells designated as master communicates the partition ID it calculated), allowing those one or more cells 14 to determine that they share a common partition. Upon determining that they share a common partition ID, those one or more cells 14 execute a single operating system across the established partition. In the event the established partition successfully includes two or more of cells 14, these two or more of cells 14 cooperate to execute a common instance of an operating system across the partition. That is, each of the two or more cells 14 within the partition may execute a portion of the single operating system or share execution of different tasks pertaining to the single operating system.
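The exchange-and-compare step above may be modeled, purely as an illustration, by grouping cells on the partition IDs they exchanged; the function name and data layout are assumptions, not taken from this disclosure:

```python
def group_by_partition(exchanged):
    """exchanged maps a cell identifier to the partition ID that cell
    calculated internally and communicated to its peers.  Cells whose
    exchanged partition IDs match share a common partition."""
    groups = {}
    for cell, pid in exchanged.items():
        groups.setdefault(pid, set()).add(cell)
    return groups

# Cells 0 and 1 each calculated partition ID 3; cell 2 calculated 4.
print(group_by_partition({0: 3, 1: 3, 2: 4}))  # -> {3: {0, 1}, 4: {2}}
```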
In this manner, cells 14 within cellular multiprocessing computing system 12 perform decentralized hardware partitioning to increase the scalability of cellular multiprocessing computing system 12 while possibly reducing the overall cost of system 12. Because the cells themselves perform or automate at least some of the partitioning process, administrator 10 need not manually keep track of or be concerned with the multitude of partition identifiers, cell identifiers and their associations. Thus, administrator 10 may more readily scale system 12 to increased partitioning due to the reduced complexity of administering the partitions. Moreover, the ability of cells 14 to perform this decentralized hardware partitioning technique may reduce the overall cost of the system because a dedicated processor may no longer be required to perform partitioning.
As shown in
Further, cells 14A, 14B may contain other components that are substantially the same, such as basic input/output systems 42A, 42B (“BIOSs 42A, 42B” in
As described above, an administrator, such as administrator 10 of
As an example, maintenance processor 24A may determine partition ID 36A by first parsing selection information 30A to extract which other of cells 14 belong to its partition. Typically, maintenance processor 24A extracts all other cell identifiers 44 that belong in its partition. Assuming both cells 14A, 14B belong to partition 38, maintenance processor 24A would extract cell identifiers 44A, 44B, which, as described above, may equal “0” and “1,” respectively. Next, maintenance processor 24A would determine partition ID 36A by forming a bit field, where each bit represents whether a cell of a given cell identifier belongs to partition 38. Using the above cell identifiers of “0” and “1” as an example, the following four-bit bit field could be constructed for a four cell system:
The bit position corresponds to unique cell identifiers 44, and the bit value at each bit position indicates whether the cell 14 having the cell identifier corresponding to that bit position belongs in a given partition. In the above example, both cells 14 having cell identifiers equal to “1” and “0,” i.e., cells 14A, 14B, belong to the partition because a bit value of “1” indicates membership, while a bit value of “0” indicates that the cells having cell identifiers 44 of “3” and “2” do not belong within the partition. To determine partition ID 36A, maintenance processor 24A may calculate a sum according to the following equation:
partition ID = Σ (Bi × 2^i), for i = 0 to N−1,

where Bi equals the bit value at the ith bit position. Thus, according to the above equation, maintenance processor 24A would calculate partition ID 36A of 3, as 0×2^3 + 0×2^2 + 1×2^1 + 1×2^0 = 3. Another more involved example of computing a partition ID that includes a master cell designation is described below in reference to
Summing the bit field yields a unique partition ID that may be used to uniquely identify not only the partition but also any intra-partition communications. Each of cells 14A, 14B independently calculates an identifier (i.e., a “partition ID”) in, for example, the manner described above, and the cells exchange calculated partition IDs 36A, 36B to determine whether cells 14A, 14B belong to the same partition. If, for example, partition ID 36A equals partition ID 36B, then cells 14A, 14B, upon receiving and comparing these partition IDs 36A, 36B, may determine they belong to the same partition. However, if partition IDs 36A, 36B are not equal to one another, cells 14A, 14B may determine that they belong to separate and distinct partitions. Moreover, use of the partition IDs is not limited merely to partitioning but may extend to subsequent operation. In some embodiments, the partition ID may be used for communications, such as serving as a unique partition address and/or forming a virtual intelligent platform management bus (IPMB) to facilitate secure intra-partition communication.
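The summation described above can be sketched as follows for the four-cell example; this is a minimal illustration, and the names are not taken from this disclosure:

```python
def partition_id_from_bits(bits):
    """Sum the bit field: the bit at position i is weighted by 2**i."""
    return sum(b * 2**i for i, b in enumerate(bits))

# Four-cell example: bits B0..B3, with cells "0" and "1" participating.
pid = partition_id_from_bits([1, 1, 0, 0])
print(pid)  # 0*2**3 + 0*2**2 + 1*2**1 + 1*2**0 = 3
```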
In this way, the cells participating in a partition can perform or automate at least some of the partitioning process including generation of the partition ID. As a result, the administrator need not manually keep track of or be concerned with the multitude of partition identifiers and cell identifiers and the associations between these and possibly other identifiers.
In the example shown in
After generating or updating configuration information 46A, 46B if appropriate, maintenance processors 24A, 24B begin the boot process by powering on one or more of processors 40 of respective processor clusters 18A, 18B. One or more of processors 40 of each of processor clusters 18A, 18B load respective BIOSs 42A, 42B and if appropriate additional configuration information 46A, 46B. Processors 40 of each of processor clusters 18A, 18B execute according to respective BIOSs 42A, 42B and, at some point during this execution, encounter configuration information 46A, 46B, which they respectively execute to form partition 38.
In the above described embodiment involving a master cell, one or more of processors 40 of processor cluster 18A executes BIOS 42A that uses as input configuration information 46A. Processors 40, in this instance, communicate with cell 14B via I/O interface 16A, whereupon processors 40 request that cell 14B form partition 38. In response to this request, processors 40 of processor cluster 18B access configuration information 46B to determine whether cell 14B belongs to partition 38. Assuming configuration information 46B indicates that cell 14B belongs to partition 38, processors 40 respond to the request. Upon responding, processor clusters 18A, 18B cooperate to form partition 38 such that they execute a single operating system across partition 38.
If, however, configuration information 46B indicates that cell 14B belongs to some other partition, processors 40 of processor cluster 18B may not respond to the request, i.e., remain silent. In this manner, the techniques described herein allow partitions to be asymmetrical. That is, cell 14A may view its partition as including cell 14B, while cell 14B may view its partition as including only itself, for example. This silence may indicate, for example, to processors 40 of processor cluster 18A that cell 14B is malfunctioning. Cell 14A in this instance may send an alert to administrator 10, but processor cluster 18A may still load and execute an operating system despite the failure to include cell 14B in partition 38.
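The respond-or-remain-silent behavior described above might be sketched as follows; the acknowledgment value and function name are illustrative assumptions, not part of this disclosure:

```python
def handle_join_request(own_partition_id, requested_partition_id):
    """A cell acknowledges a partition-formation request only when its
    own configuration names the same partition; otherwise it remains
    silent, modeled here by returning None.  The requester may interpret
    silence as an asymmetrical partition or a malfunctioning cell."""
    if own_partition_id == requested_partition_id:
        return "ack"
    return None  # remain silent; the requester proceeds without this cell

print(handle_join_request(3, 3))  # -> ack
print(handle_join_request(1, 3))  # -> None (asymmetrical partition)
```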
The notion of asymmetrical partitions may be beneficial in many instances, and particularly during times of cell malfunction. During times of cell malfunction, asymmetrical partitioning enables administrator 10 to move a malfunctioning cell 14B, for example, into its own partition for diagnostic and testing purposes without considering the effect on other partitions, such as partition 38. In other words, administrator 10 may interact with user interface 32 of
Administrator 10 need not consider the effects on partition 38 because cell 14A continues to operate even though cell 14B has not joined partition 38. After repairing or possibly replacing cell 14B, administrator 10 may simply include cell 14B back into partition 38, whereupon cells 14A, 14B reestablish partition 38 without further oversight from administrator 10. Again, by operating according to the decentralized partitioning techniques described herein, cells 14 may significantly lessen the burdens associated with scalability in conventional multiprocessing computing systems, thereby increasing scalability while possibly reducing the total cost of cellular multiprocessing computing system 12.
As shown in
As shown in
Master cell region 62A also includes a plurality of bits (e.g., bits N-1 to 0 shown at the top of partition ID 58A in
For example, participating cells region 60B of
Master cell region 62B acts as an offset because it offsets the bit numbering of participating cells region 60B. In the above example, bit 2 identifies the cell having a cell ID of “1” because master cell region 62B includes 2 bits, i.e., bits 0 and 1, and those 2 bits are subtracted from the beginning bit of participating cells region 60B, e.g., bit 2, to yield the bit corresponding to a cell ID of “0.” However, as described above, the cell identified by cell ID “0” is not included in the calculation; therefore, the next cell ID in the sequence is designated by bit 2 of participating cells region 60B, i.e., a cell ID of “1.”
As another example, if a cell within the four cell system was identified by a cell ID of “1” and this cell was calculating partition ID 58B, the three bits of participating cells region 60B would indicate that a cell having a cell ID of “0” participates in the partition, as bit 2 of participating cells region 60B is a “1.” Cells identified by cell IDs of “2” and “3” would not belong to the partition because bits 3 and 4, respectively, are “0.” Further examples could be provided for cells identified by cell IDs of “2” and “3”; however, these are not provided herein as they are merely repetitive. Thus, each cell may conserve storage space by possibly not including itself as a bit within participating cells regions 60.
In this other example, master cell region 62B still serves as an offset but, unlike the preceding example, the cell identified by cell ID “1” is calculating partition ID 58B and thus bit 2 of participating cells region 60B correctly identifies the cell having a cell ID of “0.” However, when calculating bit 3, the cell identified by cell ID “1” is not included within the calculation, and therefore the next cell ID in the sequence is designated by bit 3 of participating cells region 60B, e.g., cell ID of “2.” This process is repeated for each master cell offset bit greater than 1, which can be generalized and restated as: for each bit greater than or equal to the cell ID of the cell generating the partition ID, after being offset by the master cell offset, the bit must be increased by one to reach the appropriate cell ID that correlates with the bit stored at the original bit number.
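The generalized offset-and-skip rule can be sketched as follows; the function name is an illustrative assumption, and the 2-bit master-region width is taken from the four-cell example above:

```python
def bit_to_cell_id(bit_pos, master_bits, own_cell_id):
    """Map a bit position within the participating-cells region back to a
    cell ID.  master_bits is the width of the master-cell region (2 bits
    in the four-cell example); own_cell_id identifies the cell performing
    the calculation, which does not appear as a bit in its own
    participating-cells region."""
    cid = bit_pos - master_bits   # undo the master-cell offset
    if cid >= own_cell_id:        # skip over the calculating cell itself
        cid += 1
    return cid

# Cell "0" calculating: bit 2 identifies cell "1".
# Cell "1" calculating: bit 2 identifies cell "0"; bit 3 identifies cell "2".
print(bit_to_cell_id(2, 2, 0), bit_to_cell_id(2, 2, 1), bit_to_cell_id(3, 2, 1))
```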
To conclude the example, master cell region 62B identifies a cell having a cell ID of “1” as the master cell of the partition identified by partition ID 58B. If this cell “1” is calculating partition ID 58B, it will assume the responsibility of formulating the partition identified by partition ID 58B, as described above. If another cell of the partition is calculating partition ID 58B, which in this instance would only be a cell having a cell ID of “0,” cell “0” would wait for cell “1” to form the partition, also as described above.
Initially, BMC 22A of cell 14A receives selection information 30A and stores selection information 30A to memory 28A, as described above (64). Maintenance processor 24A of BMC 22A calculates partition ID 36A based on selection information 30A (66). In some embodiments, maintenance processor 24A calculates a partition ID 36A to include only a participating cells region, as described above. In other embodiments, maintenance processor 24A calculates a partition ID 36A that is substantially similar in format to partition ID 58B of
Processor cluster 18A next receives a signal from BMC 22A to power up and begin booting BIOS 42A, which as described above may incorporate configuration information 46A. Processor cluster 18A, and in particular one or more of processors 40, may in the manner described above load configuration information 46A, which causes processors 40 to establish partition 38 (70). Again, this same procedure may also occur contemporaneously in cell 14B, such that cells 14A, 14B may form partition 38.
Processors 40 for each of cells 14A, 14B may first determine whether the partition identified by partition IDs 36A, 36B comprises a multi-cell partition (72). Assuming that partition 38 is identified in a first instance such that it includes both of cells 14A, 14B, these cells 14A, 14B would each determine that partition 38 is a multi-cell partition. In the event partition IDs 36A, 36B identify a master cell, the one of cells 14A, 14B identified as master would initiate the below process by which partition 38 is formed, as described above. If, however, no master cell is specified within partition IDs 36A, 36B, either or both of cells 14A, 14B may initiate the below process by which partition 38 is formed. It is assumed, regardless of whether a master was designated, that cell 14A performs the following processes; however, cell 14B may equally perform this process, and the techniques should not be limited strictly to the described embodiment.
To form partition 38, processors 40 of processor cluster 18A communicate with processors 40 of processor cluster 18B via respective I/O interfaces 16A, 16B to merge BIOS 42A with BIOS 42B (74). Once merged, processors 40 of both clusters 18A, 18B boot a single operating system across both cells 14A, 14B, as described above (76). Moreover, each of processor clusters 18A, 18B may configure intra-partition network communications channels such that cells 14A, 14B of partition 38 may communicate securely (77). As one example, cells 14A, 14B may form a virtual intelligent platform management bus (IPMB) in accordance with virtual intelligent platform management interface (IPMI) techniques, as described in further detail in the above referenced co-pending application entitled “Mainframe Computing System Having Virtual IPMI Protocol,” by named inventors J. Sievert et al.
However, if processors 40 of processor cluster 18A determine that partition 38 does not include multiple cells but only a single cell 14A (contrary to partition 38 shown in
If no issues exist, cells 14A, 14B continue to execute within their respective current partitions (82). However, if a cell malfunctions, all of the other cells in the partition and particularly BMCs 22 may issue an alert to a system maintenance processor, e.g., system maintenance processor 34, as described above, and attempt to repartition such that the malfunctioning cell is no longer physically included within the partition (although logically the partition ID for that partition may still indicate that the malfunctioning cell belongs to the partition) (82, 72-82).
For example, if cell 14B malfunctions within two-cell partition 38, cell 14A may issue an alert to maintenance processor 34 indicating that cell 14B malfunctioned. Cell 14A may then reestablish partition 38, but because cell 14B is most likely still malfunctioning, only cell 14A is physically included within partition 38. In this instance, administrator 10 may move cell 14B to a new partition, establishing the above described asymmetrical partitions. Cell 14A may continue to send alerts that cell 14B is not present within the partition, but it will no longer attempt to reestablish the partition until it detects cell 14B or until administrator 10 places cell 14A into its own or another partition. Once cell 14B is fixed or replaced, administrator 10 may further insert cell 14B back into partition 38, whereupon cells 14A and 14B automatically reform partition 38 (but only if administrator 10 did not repartition cell 14A into its own or another partition) in accordance with the decentralized hardware partitioning techniques described above. Various embodiments of the invention have been described. These and other embodiments are within the scope of the following claims.
The entire contents of co-pending application Ser. No. ______, filed ______, entitled “Mainframe Computing System Having Virtual IPMI Protocol,” by named inventors J. Sievert et al., attorney docket number RA-5847, are hereby incorporated by reference as if fully set forth herein.