The present invention relates generally to techniques for providing processing services within a multi-processor computing system, and, in particular, to techniques for providing a workload processing metering for tasks of multiple workload types in a multi-processor computing system.
Initially, Java workloads and non-Java workloads ran together in a single processor set. Java workloads tended to be processor-intensive and thus dominated the processor resources on the system. To alleviate the problem, recently developed servers introduced Secure Java Workload, which defines two separate processor sets, one used for Java workloads and the other used for everything else (standard workloads). Furthermore, each processor set can be set to a different level of performance. Thus the processor-intensive Java workload can be set to run at a high level of processor performance while the standard workload can be set to run at a low level of performance. This works well in a processor capacity-managed system. This multi-workload processor-based computing system is described in more detail in concurrently filed and commonly assigned U.S. Provisional Patent Application entitled “System And Method For Separating Multiple Workloads Processing In A Single Computer Operating Environment,” by Thompson et al., Attorney Docket No. TN472, Ser. No. 60/795,640, filed 27 Apr. 2006, which is incorporated by reference herein in its entirety.
On processor power metering systems, real-time CPU processor time statistics along with the actual hardware configuration and the processor performance level are used to determine specific metering utilization values. When combined with an interval, the meter values become metering utilization statistics that are accumulated into the system metering values. When Java processors are defined, Java task time is accumulated against Java processors. However when Java processors are not defined, Java tasks will continue to execute and accumulate time against standard processors. Thus metering of the standard processor set does not guarantee that only non-Java task CPU time is accounted in the meters.
Problems in the prior art are addressed in accordance with the principles of the present invention by providing a workload processing metering for tasks of multiple workload types in a multi-processor computing system.
In one embodiment, the present invention is a computing system having multiple processors configured to support a plurality of workload types, in which the system provides processing metering by workload type. In such a system, metering provides a measure of the normalized processing throughput utilized by processing tasks for each workload type supported by the system. This metering measures processing throughput for tasks of any given workload type that are performed by a processor configured to support that workload type, as well as tasks of that type that are performed on a standard processor. The computing system provides processing metering for tasks from a plurality of workload types and comprises one or more processor sets executing processing tasks associated with a standard workload type, one or more processor sets executing processing tasks associated with a particular workload type, and a server control module for collecting processing time associated with tasks of the standard workload type and with tasks associated with the particular workload type.
In another embodiment, the present invention corresponds to a method for providing processing metering for tasks from a plurality of workload types. The method periodically collects processing time for tasks of each of the plurality of workload types running on a processor set associated with the particular workload type, periodically collects processing time for tasks of each of the plurality of workload types by workload type that are running on a standard processor set, saves prior accumulated processing time for tasks of a particular workload type running on a processor set associated with the particular workload type when the processor set changes to process a different workload type, and totals the collected and the saved processing times for each of the plurality of workload types from all processor sets.
In yet another embodiment, the present invention corresponds to a data storage medium containing computer readable data encoded with instructions that, when executed in a computing system, implement a method for providing processing metering for tasks from a plurality of workload types. The method periodically collects processing time for tasks of each of the plurality of workload types running on a processor set associated with the particular workload type, periodically collects processing time for tasks of each of the plurality of workload types by workload type that are running on a standard processor set, saves prior accumulated processing time for tasks of a particular workload type running on a processor set associated with the particular workload type when the processor set changes to process a different workload type, and totals the collected and the saved processing times for each of the plurality of workload types from all processor sets.
Other aspects, features, and advantages of the present invention will become more fully apparent from the following detailed description, the appended claims, and the accompanying drawings in which like reference numerals identify similar or identical elements.
FIGS. 3a-3b illustrate example embodiments of a multi-processor-based processing system configured as various workload sets according to one embodiment of the present invention;
Two different OS partitions 120-121 are present in the example configuration of
Within each cell, a set of processors is present along with system memory and I/O interface modules. For example, cell 0 101 includes processor 0, processor 1, processor 2, and processor 3 111-114, I/O interface module 115, and memory module 116. Peripheral devices 117-118 are connected to I/O interface module 115 for use by any tasks executing within OS partition 0 120. All of the other cells within system 100 are similarly configured with multiple processors, system memory and peripheral devices. While the example shown in
The computing system 101 also includes processing unit 201, video display adapter 222, and a mass memory, all connected via bus 202. The mass memory generally includes RAM 203, ROM 204, and one or more permanent mass storage devices, such as hard disk drive 232a, a tape drive, CD-ROM/DVD-ROM drive, and/or a floppy disk drive 232b. The mass memory stores operating system 221 for controlling the operation of the programmable computing system 101. It will be appreciated that this component may comprise a general purpose server operating system as is known to those of ordinary skill in the art, such as UNIX, MAC OS X™, LINUX™, or Microsoft WINDOWS XP™. Basic input/output system (“BIOS”) 215 is also provided for controlling the low-level operation of computing system 101. While the example of
The mass memory as described above illustrates another type of computer-readable media, namely computer storage media. Computer storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules or other data. Examples of computer storage media include RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computing device.
The mass memory also stores program code and data for providing a host computing system. More specifically, the mass memory stores applications including host application program 213, user programs 214, and distributed firewall module 212.
The computing system 101 also comprises input/output interface 214 for communicating with external devices, such as a mouse 233a, keyboard 233b, scanner, or other input devices not shown in
The embodiments of the invention described herein are implemented as logical operations in a general purpose computing system. The logical operations are implemented (1) as a sequence of computer implemented steps or program modules running on a computer system and (2) as interconnected logic or hardware modules running within the computing system. This implementation is a matter of choice dependent on the performance requirements of the computing system implementing the invention. Accordingly, the logical operations making up the embodiments of the invention described herein are referred to as operations, steps, or modules. It will be recognized by one of ordinary skill in the art that these operations, steps, and modules may be implemented in software, in firmware, in special purpose digital logic, and any combination thereof without deviating from the spirit and scope of the present invention as recited within the claims attached hereto. This software, firmware, or similar sequence of computer instructions may be encoded and stored upon computer readable storage medium and may also be encoded within a carrier-wave signal for transmission between computing devices.
FIGS. 3a-3b illustrate example embodiments of a multi-processor-based processing system configured as various workload sets according to one embodiment of the present invention. In these two embodiments, OS partition 0 120 of
Java processors 311-312 in this example correspond to processors that are configured to efficiently perform Java tasks 301-302. These processors may be configured to utilize different microcode instructions applicable to Java tasks. These processors may possess customized hardware to support the Java tasks. Finally, these processors may be configured to operate at a particular performance level relative to a maximum possible processing throughput to adequately support Java tasks.
Standard processor 313 corresponds to a processor that is configured to support most other processing tasks 303 present within OS partition 0 120. This processor 313 may not necessarily possess customized microcode or specialized processing hardware. Additionally, processors may be configured to operate at a different performance level relative to a maximum possible processing throughput to provide cost effective processing. In some embodiments of multi-processor systems, users are billed for the system providing a pre-defined processing throughput. When a higher level of processor performance is provided, a user may be charged a higher cost. As such, processing levels for the standard processors may be set accordingly.
When a task is executed within an OS partition 120, the task is assigned to a particular processor depending upon whether the task is a Java task 301 or a standard task 303. A child task 302 that is created by an existing task 301 is classified as a task of the same workload type. Java tasks 301-302 are performed by Java processors 311-312 when they are present within a configured system. If a Java processor is not included within a configured system, the Java tasks 301-302 are performed by a standard processor.
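The assignment rule described above may be sketched as follows. This is a minimal illustration only and is not the dispatcher of any particular operating system; the names `Task`, `spawn_child`, and `select_processor_set`, as well as the workload labels, are hypothetical and are introduced solely for this example.

```python
from dataclasses import dataclass

JAVA = "java"          # hypothetical workload type labels
STANDARD = "standard"


@dataclass
class Task:
    name: str
    workload: str  # JAVA or STANDARD

    def spawn_child(self, name: str) -> "Task":
        # A child task is classified with the same workload type as its parent.
        return Task(name=name, workload=self.workload)


def select_processor_set(task: Task, java_processors_online: int) -> str:
    """Return the processor set that will execute the task.

    Java tasks run on Java processors when any are configured;
    otherwise they fall back to the standard processor set.
    """
    if task.workload == JAVA and java_processors_online > 0:
        return JAVA
    return STANDARD
```

In this sketch a Java task keeps its Java classification even when it falls back to the standard processor set, which is what later allows its processor time to be accounted separately.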
b illustrates the one cell example from
Throughout the entire description of various embodiments of the present invention, examples for two workload types, Java and standard tasks, are described. The choice of using two types of task for possible workload types has been made for illustrative purposes only and is not intended to limit the invention in any way. Alternate workload sets in which processing tasks may be organized into a common set of tasks to be performed on its own processor may be used in place of Java processors and Java tasks as described herein. The characteristics for the alternate workload type processor may be configured as necessary to support the particular workload type and its corresponding tasks.
Similarly, systems may be configured to contain any number of workload types. In such an embodiment, processors from a multi-processor system of
The solution to the problem must take into account several factors. First, how the system metering is licensed must be considered. Prior metering systems metered the processors homogeneously, with no workload differentiation and all tasks metered at equal weight. This method was impractical for processor-intensive Java workloads. The new metering system separates Java workloads from non-Java workloads and meters only the non-Java workload. Subsequent metering systems may separately meter Java workloads and non-Java workloads. Second, the workload group or target processor set that each task is assigned to must be known.
Each processor must be assigned to either the standard processor set or Java processor set. Each processor set must be assigned a specific level of performance (i.e., each CPU in the set is set to the same performance level).
Metering values are calculated using: accumulated CPU time statistics (how many total busy CPU seconds were accumulated); the CPU configuration (how many processors there are and how close the respective processor memory caches are); the CPU performance level (how fast each of the processors was set to run); and a metering time interval (metering values are updated once every minute).
While metered processing power utilization is not the same as CPU time resource utilization, the CPU time resources consumed are one of the factors used in calculating the metered processing power resources consumed. Currently, every processor has a set of CPU time counters that is updated every time a task executes on that processor. This invention creates a new array of processor time counters in which there is one entry for every workload and one array for every processor. Every time a task executes on a processor, the normal CPU time counter is updated (that has not changed) and the workload counter is updated for that processor. If the executing task is a Java workload, the Java workload counter is updated. If the executing task is a standard workload, the standard workload counter is updated. The time is accumulated in real time. The task's target processor set is the entity that identifies which workload counter should be updated.
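The counter arrangement described above may be illustrated with the following sketch, which keeps one array of workload counters per processor alongside the pre-existing CPU time counter. The class name `ProcessorCounters`, the method `charge`, and the workload labels are hypothetical; the actual operating system structures are not specified here.

```python
from collections import defaultdict

WORKLOADS = ("standard", "java")  # hypothetical workload labels


class ProcessorCounters:
    """CPU time counters for one processor (illustrative only)."""

    def __init__(self) -> None:
        self.cpu_time = 0.0  # the pre-existing, workload-agnostic counter
        self.workload_time = {w: 0.0 for w in WORKLOADS}  # one entry per workload

    def charge(self, target_set: str, seconds: float) -> None:
        # The normal counter is updated exactly as before ...
        self.cpu_time += seconds
        # ... and the task's target processor set picks the workload counter.
        self.workload_time[target_set] += seconds


# One counter array per processor, created on first use.
counters = defaultdict(ProcessorCounters)
counters[2].charge("java", 1.0)      # Java task on processor 2
counters[2].charge("standard", 0.5)  # standard task on the same processor
```

Keying the update on the target processor set, rather than on the processor that happens to execute the task, is the design choice that keeps the two workloads separable on a shared processor.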
Thus non-Java CPU time and Java CPU time are each accumulated on a processor-by-processor basis. As noted previously, however, each processor set can be running at a different level of performance. Thus from one processor to the next, the value of a CPU second may not be the same. The concept of a normalized CPU second is introduced to fix this discrepancy. Each processor is actually running at a level of performance that is less than or equal to “native” performance. Normalized CPU time effectively applies an exchange rate that converts actual CPU time into the CPU time that would have been used on a native processor. So if a task executes on a CPU for 1 second and the CPU is throttled at 50%, the actual CPU time is 1 CPU second and the normalized CPU time is 0.5 CPU seconds.
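The exchange-rate conversion may be expressed as a one-line function. The sketch below assumes that the performance level is represented as a fraction of native performance between 0 and 1; that representation and the function name `normalized_cpu_seconds` are assumptions made for illustration.

```python
def normalized_cpu_seconds(actual_seconds: float, performance_level: float) -> float:
    """Convert actual CPU seconds to native-equivalent CPU seconds.

    performance_level is the fraction of native performance the processor
    is set to run at, so it must lie in (0, 1].
    """
    if not 0.0 < performance_level <= 1.0:
        raise ValueError("performance level must be in (0, 1]")
    return actual_seconds * performance_level
```

With these assumptions, `normalized_cpu_seconds(1.0, 0.5)` reproduces the example in the text: one actual CPU second on a processor throttled at 50% counts as 0.5 normalized CPU seconds.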
Normalized CPU time is maintained by the operating system on both a task basis and a workload basis. This enables direct comparison of CPU resources used even when (1) each processor may be running at a different level of performance at any instant, and (2) the operating system may adjust the level of performance for any processor from one instant to the next. Thus back end billing programs that rely upon CPU resources consumed and metering that uses CPU resources consumed to calculate meter values will use this type of CPU resource consumption information.
A number of system interfaces are defined that provide users the ability to query normalized task CPU time and normalized workload CPU time. For metering systems, normalized workload CPU times are used in determining the workload metering utilization values. For new metering systems, a SYSTEMSTATUS type 25 interface is used to coalesce the system-wide processor time accounting information from all of the processors into a single time accounting array.
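The coalescing step may be sketched as a simple summation across processors. This is not the SYSTEMSTATUS type 25 interface itself, whose layout is not reproduced here; the function name `coalesce` and the dictionary representation of each processor's accounting array are assumptions made for illustration.

```python
from typing import Dict, Iterable


def coalesce(per_processor: Iterable[Dict[str, float]]) -> Dict[str, float]:
    """Sum per-processor normalized workload times into one system-wide array."""
    totals: Dict[str, float] = {}
    for processor_times in per_processor:
        for workload, seconds in processor_times.items():
            totals[workload] = totals.get(workload, 0.0) + seconds
    return totals
```

Because the per-processor times are already normalized, they can be added directly even when the contributing processors run at different performance levels.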
Thus there is sufficient information to calculate the interval meters. For new metering systems, meters are updated every minute. Metering values for workload differentiated metering systems use the same metering calculation function previously described, but factor in the normalization of the CPU time because workloads may have crossed over to processors that are not the normal processors associated with that workload.
Metering values on workload differentiated systems are calculated using: accumulated normalized CPU time statistics for that workload; adjusted elapsed CPU time statistics, recalculated to reflect the target processor set (if configured) or the standard processor set (if the target processor set is not configured); the CPU configuration (how close the processors are to each other with respect to their respective processor memory caches); the CPU set performance level (for the set associated with the prior adjustment); and a metering time interval (metering values are updated once every minute).
Furthermore, workload differentiation allows for metering licensing variations that are more suited to the product that is being delivered. For new metering systems, marketing has decided to license metered non-Java (or standard) performance and to provide an optional license for unmetered Java performance. This allows customers to run a metering system for their normal workloads while at the same time developing their Java environment in a non-interfering environment that does not accumulate any monthly charges.
The following are new features of this invention: This invention provides a means to license multiple workloads in a single partition of which one or more of the workloads is metered. This invention provides a means to independently specify which workloads are metered and which are controlled by normal capacity management. This invention provides a means to separate CPU statistics by workload when each workload is executing on the intended workload processor set. This invention provides a means to separate CPU statistics by workload when the workload is executing on the backup processor set. This invention provides a means to normalize all CPU statistics for all workloads gathered by any processor so that accumulated CPU statistics can be compared meaningfully. This invention provides a means to return the normalized workload CPU times in a single user interface. This invention provides a means to obtain the accumulated normalized workload CPU times and to recalculate adjusted elapsed CPU times for any workload for a given interval for the purpose of updating metering values for that workload.
The separation of workloads into multiple processor sets has been implemented in prior multi-processor computing systems. These prior systems were capacity licensed (non-metering) systems that were licensed for “N” Java processors that were to be used just for the Java workload and a customer purchased amount of performance that is to be used for everything else (the standard workload). When the partition would start, Java tasks would automatically execute on the Java processors and non-Java tasks would automatically execute on the standard processors.
Some restrictions of this model include: (1) the system is limited to the licensed number of Java processors (i.e., “N+1” Java processors are not allowed); and (2) at least one standard processor must be online. The standard processor is the only processor that can execute any task. Therefore at least one standard processor must be online at any time. An attempt to DOWN the last standard processor results in automatic conversion of one of the online Java processors into a standard processor. If one were to use the concept of the separation of processor sets as the basis for workload metering, simplistic workload metering algorithms would monitor the CPU time accumulated against each processor set, use the processor set level of performance information along with the processor set configuration information and the interval, and come up with an incremental metering utilization value for the workload during the prior interval. The concept is simple and is analogous to what is done with single workload metering on other metering systems. Where these prior methods of metering fall short is in restriction number 2.
There are many reasons that the number of Java processors may become zero. One reason could be choice. For example, a customer can choose to eliminate Java processors with an operator command (e.g., IK IPSET JAVA—IP-1-0). Another reason is one of resiliency. For example, if a system consists of two processors where one is Java and the other is standard, a fault in either processor will invoke a processor set recovery process that will attempt to recover the failed processor. If that recovery fails, then the remaining processor will be a standard processor. In any case, when the number of Java processors becomes zero, Java tasks will continue to execute, but the workload will now execute on standard processors, contending for processor resources with the standard workload. If one were to use the simplistic metering method described, based solely upon the executing processor set, then the customer would be unfairly charged for the Java workload processing that is taking place on standard processors.
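The difference between the two attribution methods may be illustrated with a small sketch. The record format, the function names `by_executing_set` and `by_target_set`, and the sample durations are hypothetical and are chosen only to show the effect when no Java processors are online.

```python
from collections import Counter
from typing import List, Tuple

# (executing processor set, task's target processor set, CPU seconds)
Record = Tuple[str, str, float]


def by_executing_set(records: List[Record]) -> Counter:
    """Simplistic attribution: charge time to the set that ran the task."""
    totals: Counter = Counter()
    for executing, _target, seconds in records:
        totals[executing] += seconds
    return totals


def by_target_set(records: List[Record]) -> Counter:
    """Attribution by the set the task was intended for."""
    totals: Counter = Counter()
    for _executing, target, seconds in records:
        totals[target] += seconds
    return totals


# No Java processors are online, so everything executes on standard processors.
records = [
    ("standard", "standard", 10.0),
    ("standard", "java", 20.0),  # crossover Java workload
]
```

In this illustration the simplistic method attributes all 30 seconds to the standard set, whereas attribution by target set leaves 10 seconds against the standard workload and 20 seconds against the Java workload.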
The following describes the new features of this invention:
Computer systems are designed by nature to handle a variety of workloads. Although the concepts behind this licensing mechanism extend far beyond this invention, the mechanism described in this invention specifically differentiates and licenses two workloads: Java workload and everything else (standard workload). Furthermore, the multiple workloads are licensed on a system-wide basis where metering parameters can be independently selected for the individual partitions. The structure of the metering key supports multiple partitions, selectable metering parameters for the standard workload, and an optional unmetered Java workload component. Furthermore, the key can be extended to support Java metering when that becomes a supported feature.
The Java processor set capability is enabled using existing processing power metering keys with a format that supports metering standard workload performance and non-metered Java workload performance. These keys will use the existing version 9 key structure. Words 1 through 4 contain the fixed information for the key (see Fixed Key Information).
Variable Key Information—This information is appended to the fixed information in the key. The variable key information is always in a format that consists of a 5-bit group identifier followed by the specific group information. The “Variable Key Information” table displays the groups and the defined structure for each group. The “Group” column shown is the group identifier that is stored in the first five bits of each group section. Multiple images are signified simply by creation of multiple image groups. The value 0 indicates there are no more groups to process.
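The group-by-group layout described above may be walked with a simple bit reader, sketched below. Only the 5-bit group identifier and the terminating value 0 come from the description; the payload widths in `HYPOTHETICAL_PAYLOAD_BITS`, the group numbers used, and the function name `parse_groups` are invented for this illustration and do not reflect the actual group structures.

```python
from typing import Dict, List, Tuple

GROUP_ID_BITS = 5  # from the text: each group starts with a 5-bit identifier

# ASSUMPTION: payload widths per group identifier are invented for this sketch;
# the real layouts are defined by the "Variable Key Information" table.
HYPOTHETICAL_PAYLOAD_BITS: Dict[int, int] = {1: 16, 2: 8}


def parse_groups(bits: str) -> List[Tuple[int, int]]:
    """Split a string of '0'/'1' characters into (group id, payload) pairs."""
    groups: List[Tuple[int, int]] = []
    pos = 0
    while True:
        group_id = int(bits[pos:pos + GROUP_ID_BITS], 2)
        pos += GROUP_ID_BITS
        if group_id == 0:  # identifier 0 means there are no more groups
            return groups
        width = HYPOTHETICAL_PAYLOAD_BITS[group_id]
        groups.append((group_id, int(bits[pos:pos + width], 2)))
        pos += width
```

A repeated group identifier simply produces another entry in the returned list, which mirrors how multiple images are signified by multiple image groups. Input that ends without the terminating 0 raises an error in this sketch rather than being silently accepted.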
In the key creation process, once the data for the key parameters are fully determined and the end of the key is signaled, the binary data is then encrypted by the key encryption program. For new metering systems, the key encryption program generates a key that consists of a string that begins with “IP1-” followed by at least 52 apparently random characters.
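The outward shape of such a key string may be checked as sketched below. The check covers only the prefix and minimum length stated above; it does not decrypt or validate the key, and the function name `looks_like_metering_key` is hypothetical.

```python
KEY_PREFIX = "IP1-"
MIN_BODY_CHARS = 52


def looks_like_metering_key(key: str) -> bool:
    """Check only the outer shape of a key string, not its validity."""
    return key.startswith(KEY_PREFIX) and len(key) - len(KEY_PREFIX) >= MIN_BODY_CHARS
```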
The following is an example of a new metering key that defines 2 metering partitions, each of which is licensed to include Java processors. The output illustrates the encrypted key and the decrypted key parameters. Notice the key string begins with “IP1-” followed by 52 apparently random characters. When decrypted, these characters decode into the subsequent key parameter information.
To view the current metering licensing state on a new metering system, the IK IPSHOW ALL operator command can be entered. This is an example of a partition (partition 7) that is configured with 1 Java processor and 2 standard processors running at a level of performance (20K RPM) that is less than the ceiling (25K RPM)—see “CURRENT IMAGE”. A system-wide view of the images that are in use by each partition is also displayed. The active key status and key licensing information is also displayed. The example key consists of two partition images each of which consists of a standard performance rating component (STD RPM rating) and a Java processor count component. Key metering parameters and the key string itself are also displayed.
Licensing of separate processor sets results in setting each processor set to a different level of performance. For Java tasks, there is a Java optimized level of performance on Java processors that cannot be achieved at the native level of performance offered on standard processors. Thus Java tasks perform better on Java processors than they do on native standard processors. Furthermore when Java processors are configured, Java tasks only execute on Java processors.
However it is also possible, but unlikely, that extremely critical OS functions execute on Java processors. On a metering system, this crossover standard workload CPU time accumulated against Java processors should count against the standard meters.
When Java processors are not configured, Java tasks continue to execute, contending with all tasks on standard processors. On a metering system, it would be unfair for this crossover Java workload CPU time accumulated against standard processors to count against the standard meters.
Ideally it would be good to have one mechanism that would handle both scenarios. This solution does just that. In addition to CPU time accumulators normally associated with every processor (e.g., Idle, IO, User Task, Process Switch, etc.), each processor is allocated additional workload CPU time accumulators. The additional size of the array elements should be (Max number of processors) times (Max number of workloads). The accumulators are internal operating system structures allocated one per processor.
Furthermore, because each processor can be set to a different level of performance, each processor is associated with a value that is the current exchange rate or normalization rate value. This value is used to convert the value of the current CPU second into the so-called normalized CPU second. The normalized processor time is the time that is actually accumulated for any given target workload process.
So what happens when a Java task enters the mix and a Java processor does not exist? In this scenario, the task will be identified as having a target Java processor set, but will contend with every other task for the standard processors. The other tasks will be identified as having a target of standard processor set. Note that accumulation of normalized workload CPU time is by the target processor set. So for every task that executes on the CPU, the task will select one of two buckets into which to accumulate the normalized CPU time. Thus, if a Java task executes for 1 second on CPU 2, the user task CPU time will be incremented by 1 second and the normalized Java workload CPU time will be incremented by 1*0.8 or 0.8 seconds. Normalization of CPU times always reflects the actual speed that the CPU is currently running at.
Metering systems with Java workloads are only concerned with separating the Java component of the workload out so that accurate metering of the non-Java component can be accomplished. Thus metering systems require the following information: accumulated normalized CPU time statistics for the standard workload; reconstituted adjusted CPU time statistics reflecting the standard processor set performance level; the standard processor set CPU configuration (how close CPU memory caches are to each other); the standard processor set CPU performance level; and a metering time interval (metering values are updated once every minute).
Based upon the information in
The SYSTEMSTATUS type 25 interface was originally designed to return many CPU utilization counters both for system-wide CPU time accumulation and for individual task CPU time accumulation. This interface has been modified to also return normalized workload CPU time accumulation information that reflects the information illustrated in
System Wide Processor Time Group
Groups with an IDTYPEF of REDUNDANTACCOUNTV or SC_REDUNDANTACCOUNTV contain accounts that reflect processor time also billed to individual stack processor time account groups. Groups with an IDTYPEF of SC_SYSTEMWIDEACCOUNTV or SC_REDUNDANTACCOUNTV contain Processor Set information. On prior CoD and metering systems, there may be Processor Set entries from 0 through the largest value indicated by the VALIDPROCSETS mask. On other systems, there will only be Processor Set 0 and VALIDIPSETS will be a 1. The layout of the System Wide Processor Time Group is as follows:
Now that the metering software has the normalized CPU times for each workload, it can choose to focus on the workloads it is metering. For new metering systems, only the standard workloads are being metered. The Java workload processor utilization is ignored. For future systems that may change, and the operating system infrastructure is in place to handle that change when it occurs. However, for new metering systems, only the normalized standard CPU time is used.
Furthermore, the metering software is aware of the performance level of the standard processor set because it is the metering software that originally set both the performance level and the processor normalization factor as part of the licensing. That means that the metering software can also recalculate the original elapsed CPU time from the normalized workload CPU time. For example, if 24 seconds of normalized time have accumulated in the last minute and the normalization factor for the processor set is 0.8, then the actual elapsed CPU time for that processor set is 24/0.8 or 30 seconds. Thus the total user task CPU time for the standard processors for the last minute would have been 30 seconds.
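The recalculation described above is the inverse of the normalization step and may be sketched as follows. The function name `elapsed_cpu_seconds` is hypothetical, and the sketch assumes a single normalization factor applied to the whole processor set for the interval.

```python
def elapsed_cpu_seconds(normalized_seconds: float, normalization_factor: float) -> float:
    """Invert the normalization to recover actual elapsed CPU seconds."""
    if normalization_factor <= 0.0:
        raise ValueError("normalization factor must be positive")
    return normalized_seconds / normalization_factor


# The example from the text: 24 normalized seconds at a factor of 0.8.
elapsed = elapsed_cpu_seconds(24.0, 0.8)  # approximately 30 seconds
```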
With a recalculated elapsed CPU time (with the Java workload component removed), a known CPU configuration, a known CPU performance level, and a known interval, an incremental meter value can be calculated using our meter tables and can be added to the system meters.
The meter report itself indicates accumulated utilization of the standard workload component of the system over the month.
Reference herein to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments necessarily mutually exclusive of other embodiments.
The present invention can be embodied in the form of methods and apparatuses for practicing those methods. The present invention can also be embodied in the form of program code embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention. The present invention can also be embodied in the form of program code, for example, whether stored in a storage medium, loaded into and/or executed by a machine, or transmitted over some transmission medium or carrier, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention. When implemented on a general-purpose processor, the program code segments combine with the processor to provide a unique device that operates analogously to specific logic circuits.
The present invention can also be embodied in the form of a bitstream or other sequence of signal values electrically or optically transmitted through a medium, stored magnetic-field variations in a magnetic recording medium, etc., generated using a method and/or an apparatus of the present invention.
Unless explicitly stated otherwise, each numerical value and range should be interpreted as being approximate as if the word “about” or “approximately” preceded the value of the value or range.
It will be further understood that various changes in the details, materials, and arrangements of the parts which have been described and illustrated in order to explain the nature of this invention may be made by those skilled in the art without departing from the scope of the invention as expressed in the following claims.
The use of figure numbers and/or figure reference labels in the claims is intended to identify one or more possible embodiments of the claimed subject matter in order to facilitate the interpretation of the claims. Such use is not to be construed as necessarily limiting the scope of those claims to the embodiments shown in the corresponding figures.
Although the steps in the following method claims, if any, are recited in a particular sequence with corresponding labeling, unless the claim recitations otherwise imply a particular sequence for implementing some or all of those steps, those steps are not necessarily intended to be limited to being implemented in that particular sequence.
This application claims the benefit from the filing of U.S. Provisional Application Ser. No. 60/795,627, entitled “System And Method For Separating Multi-Workload Processor Utilization On A Metered Computer System” by Hoffman, et al., filed 27 Apr. 2006, the entire content of which is incorporated herein by reference in its entirety.
Number | Date | Country
---|---|---
60795627 | Apr 2006 | US