The present invention relates to workload scheduling systems and more particularly to a data processing method and system for automatically optimizing workload scheduling.
Workload scheduling systems provide a means for scheduling a complex set of automated tasks on the machines of an information technology (IT) infrastructure, ensuring that each task is executed on time and only when every dependency or prerequisite condition is met. To provide such assurances, a conventional workload scheduling system allows the definition of the tasks to be executed together with a set of rules to be met for creating an execution plan. The workload scheduling system processes the rules, considering time dependencies and any other constraints, and builds a plan whose execution is monitored by a user. Through manual review of the information included in execution reports, the user attempts to understand the workload and determine which actions to take to optimize resource consumption. These manual activities of reviewing execution reports, analyzing the workload, and determining optimization actions are time consuming and error prone. Furthermore, known monitoring systems can take action at plan execution time, thereby dynamically reacting to conditions of non-optimal resource usage. These conventional monitoring systems, however, cannot prevent the same conditions from recurring in the future. Thus, there exists a need to overcome at least one of the preceding deficiencies and limitations of the related art.
The present invention provides a computer-implemented method of automatically optimizing workload scheduling. A computing system receives user-defined target values for predefined workload characteristics. The target values are characteristics of a workload in an information technology infrastructure. The computing system receives user-defined constraint specifications. Each constraint specification includes a range of values or a set of values. After receiving the constraint specifications and the target values, the computing system initiates a generation of a first execution plan. After initiating the generation of the first execution plan, the computing system selects a set of initial values for constraints. Each constraint is specified by one of the constraint specifications. Each constraint constrains tasks included in the workload. After selecting the set of initial values, the computing system generates and then executes the first execution plan. Executing the first execution plan includes determining measurements of the workload characteristics. After executing the first execution plan, the computing system determines a set of contributions, each contribution indicating a difference between one of the measurements of the workload characteristics and one of the target values. After determining the contributions, the computing system stores the contributions in a computer data storage unit. After determining the contributions, the computing system initiates a generation of a next execution plan. After initiating the generation of the next execution plan, the computing system modifies the constraints, resulting in a set of modified values of the constraints. Each modified value is specified by one of the constraint specifications. After modifying the constraints, the computing system evaluates changes in the workload characteristics. 
The changes are based on the set of modified values of the constraints for each time period of a set of predefined time periods in a duration of the next execution plan. After evaluating the changes, the computing system determines an optimal solution or an acceptable sub-optimal solution in a space of solutions defined by the constraint specifications, resulting in a set of new values for the constraints. After determining the optimal solution or the sub-optimal solution, the computing system stores, in a computer data storage medium, the set of new values for the constraints. After determining the optimal solution or the sub-optimal solution, the computing system replaces the set of initial values with the set of new values. After replacing the set of initial values, the computing system generates the next execution plan. The next execution plan includes the set of new values as the constraints. After generating the next execution plan, the computing system executes the next execution plan.
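The summarized method may be pictured as an iterative loop. The following is a minimal sketch under stated assumptions: the plan generation, plan execution, and solution search steps are supplied as stand-in functions, and all names and the iteration count are illustrative, not part of the claimed method.

```python
# Hedged sketch of the summarized method; generate_plan, execute_plan and
# search_solution are hypothetical stand-ins for the scheduling system's
# plan generation, plan execution, and solution-space search.

def optimize_workload(targets, constraint_specs, generate_plan, execute_plan,
                      search_solution, iterations=3):
    # Select a set of initial values for the constraints, each drawn from
    # its user-defined specification (a range of values or a set of values).
    values = {name: next(iter(spec)) for name, spec in constraint_specs.items()}
    history = []
    for _ in range(iterations):
        plan = generate_plan(values)
        measurements = execute_plan(plan)   # measure workload characteristics
        # Contribution: difference between a measurement and its target value.
        contributions = {k: measurements[k] - targets[k] for k in targets}
        history.append(contributions)       # store in a data storage unit
        # Replace the current values with the new values found in the
        # solution space defined by the constraint specifications.
        values = search_solution(constraint_specs, values, contributions)
    return values, history
```

Each pass through the loop corresponds to generating and executing one execution plan with the constraint values produced by the previous pass.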
A system and a computer program product corresponding to the above-summarized methods are also described and claimed herein.
An embodiment of the present invention provides a user-defined and automatic optimization of workload scheduling (i.e., workload planning) in an IT infrastructure via an autonomic system and process that modifies the shape of an execution plan (a.k.a. workload execution plan). The automatic workload scheduling optimization system and process disclosed herein are based on a set of measurements that are taken at execution time to determine each single task's contribution to the overall workload. After collecting the information about the tasks' contributions, the system automatically determines how to change scheduling definitions so that the workload is optimized as requested by the user. As used herein, an “execution plan” is defined as a list of automated tasks scheduled for execution on a variety of computer systems in a predefined time frame. The execution plan includes information about the tasks to be executed as well as information about time constraints and dependencies on physical or logical resources that are required by each task to complete. As used herein, “workload” is defined as the resource utilization generated by the tasks of an execution plan.
User-defined values for workload characteristics 106 and user-defined values for task constraints and dependencies 108 are entered by user(s) as input to scheduling system 104. Repository 112 stores data related to workload characteristics, including measurements of workload characteristics and evaluated contributions of tasks to workload characteristics. Repository 114 stores definitions of task constraints. The output of scheduling system 104 is an optimized execution plan 110. The functionality of the components of system 100 is described in more detail below relative to
1. The average number of tasks executed per unit of time
2. The central processing unit (CPU) usage (i.e., consumption)
3. The memory consumption
4. The input/output (I/O) read/write rate
For example, in step 202, the user enters 50% for CPU usage, 512 MB for memory usage, etc.
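Purely for illustration, the user-defined target values and constraint specifications might be represented as follows; the characteristic names, constraint names, and sample values are assumptions introduced for this sketch.

```python
# Hypothetical representation of the user-defined inputs: target values for
# the four workload characteristics listed above, and constraint
# specifications given as a range of values or a set of values.
targets = {
    "avg_tasks_per_minute": 10,  # average number of tasks per unit of time
    "cpu_usage_pct": 50,         # CPU usage target, e.g., 50%
    "memory_mb": 512,            # memory consumption target, e.g., 512 MB
    "io_rate_mb_s": 20,          # I/O read/write rate
}
constraint_specs = {
    # A range of values: task A's start time may shift by up to +/- 30 minutes.
    "taskA_start_offset_min": range(-30, 31, 5),
    # A set of values: task B may run on one of these machines.
    "taskB_machine": {"host1", "host2", "host3"},
}
```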
In step 204, the user sets values for constraints and/or dependencies 108 (see
Also in step 204, scheduling system 104 (see
In step 206, scheduling system 104 (see
In step 210 of
As one example, the scheduling system in step 210 calculates the CPU usage or memory consumption of each task in every minute, in addition to the start time and duration and other typical measurements. As another example, the scheduling system knows that a CPU was used 25% of the time by task A and 33% of the time by task B, for a total CPU consumption of 58%. This measurement of CPU consumption is calculated separately in multiple predefined time periods within the entire duration of the execution plan. For instance, an execution plan whose duration is 2 hours is divided into 24 time periods each 5 minutes in length. For each of these 5-minute time periods, the scheduling system determines the average CPU consumption for each task run within that 5-minute time period.
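The per-period measurement described above can be sketched as follows, with hypothetical per-minute CPU samples for tasks A and B; the sampling granularity and sample data are assumptions for illustration.

```python
# Hedged sketch: divide a 2-hour execution plan into 24 five-minute periods
# and compute each task's average CPU consumption within each period.

PERIOD_MIN = 5
PLAN_MIN = 120  # duration of the execution plan: 2 hours

def per_period_cpu(samples):
    """samples: {task: [cpu% for each minute of the plan]} ->
    {task: [average cpu% for each 5-minute period]}"""
    averages = {}
    for task, minutes in samples.items():
        averages[task] = [
            sum(minutes[i:i + PERIOD_MIN]) / PERIOD_MIN
            for i in range(0, PLAN_MIN, PERIOD_MIN)
        ]
    return averages

# Example: task A uses 25% CPU and task B 33% during the first 10 minutes,
# so the combined consumption in each of the first two periods is 58%.
samples = {
    "A": [25] * 10 + [0] * 110,
    "B": [33] * 10 + [0] * 110,
}
avg = per_period_cpu(samples)
```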
In step 212, scheduling system 104 (see
In step 214, scheduling system 104 (see
In step 216, scheduling system 104 (see
In step 218, by using the constraint values selected in step 216, scheduling system 104 (see
Continuing the example presented above relative to step 210, if task A is anticipated (i.e., moved earlier) so that it no longer overlaps with task B (based on the estimated duration of the task itself, which is known from the execution plan results), then CPU consumption may never reach 58%. On the other hand, if task A is anticipated based on the ranges and/or sets of values defined for time constraints in step 204 (see
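The effect of such a modified time constraint can be sketched as follows; the task tuples (start minute, duration, CPU%) are hypothetical sample data, not values from the disclosure.

```python
# Hedged sketch: evaluate whether anticipating (moving earlier) task A
# removes its overlap with task B, by computing the peak combined CPU usage.

def peak_cpu(tasks):
    """tasks: list of (start_min, duration_min, cpu_pct) tuples;
    returns the peak total CPU% over the plan's timeline."""
    end = max(s + d for s, d, _ in tasks)
    return max(
        sum(c for s, d, c in tasks if s <= minute < s + d)
        for minute in range(end)
    )

# Original plan: A (25% CPU) and B (33% CPU) overlap, so the peak is 58%.
original = [(10, 20, 25), (15, 20, 33)]
# Modified plan: A anticipated so that it finishes before B starts.
modified = [(0, 15, 25), (15, 20, 33)]
```

With the modified constraint, the peak CPU consumption drops to task B's 33% alone, which is the kind of change evaluated in step 218.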
In step 220, using one or more algorithms (e.g., the “Greedy” algorithm or the “Branch and Bound” algorithm) for searching optimal or acceptable sub-optimal solutions in the space of solutions defined by the ranges or sets defined in step 204 (see
Depending on the particular algorithm used, step 220 may be described as a loop that includes steps 216 and 218 with an exit condition (not shown) (i.e., the optimal or acceptable sub-optimal solution is found) or as a different non-loop technique that considers the space of solutions (e.g., through mathematical means), looking for the optimal or acceptable sub-optimal solution. As used herein, an acceptable sub-optimal solution is a solution that meets predefined criteria.
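As a concrete illustration of the loop form of step 220, the following is a minimal greedy-search sketch over the solution space defined by the user's ranges/sets of constraint values; the evaluate() callback stands in for the scheduling system's evaluation of workload-characteristic changes, and every name here is an assumption.

```python
# Hedged sketch of a greedy search for an optimal or acceptable sub-optimal
# solution. Each constraint is fixed in turn to the value that minimizes the
# evaluated cost; the loop exits early once the solution is acceptable.

def greedy_search(constraint_specs, evaluate, threshold=0):
    """constraint_specs: {name: iterable of allowed values};
    evaluate: cost function over {name: value} solutions (lower is better)."""
    # Start from an arbitrary value of each constraint's specification.
    solution = {name: next(iter(vals)) for name, vals in constraint_specs.items()}
    for name, vals in constraint_specs.items():
        solution[name] = min(vals, key=lambda v: evaluate({**solution, name: v}))
        if evaluate(solution) <= threshold:  # exit condition: acceptable solution
            return solution
    return solution
```

A Branch and Bound search would instead explore the same solution space while pruning any subtree whose cost bound already exceeds the best solution found so far.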
In step 222, scheduling system 104 (see
The benefit of the overall solution provided by the process of
In contrast to conventional monitoring systems that take action at plan execution time to react to conditions of non-optimal resource usage, the scheduling mechanism described herein freezes the optimal solution within the definitions of the constraints set for each task. Such definitions of the constraints, which are generated from the above-described process, ensure that every time an execution plan is generated, the execution plan is optimal. Thus, no real-time monitoring activity is required because all tasks are shaped in advance to produce an optimized workload at scheduling time rather than during the execution of the tasks.
Memory 304 may comprise any known type of computer data storage and/or transmission media, including bulk storage, magnetic media, optical media, random access memory (RAM), read-only memory (ROM), a data cache, a data object, etc. In one embodiment, cache memory elements of memory 304 provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution. Moreover, similar to CPU 302, memory 304 may reside at a single physical location, comprising one or more types of data storage, or be distributed across a plurality of physical systems in various forms. Further, memory 304 can include data distributed across, for example, a local area network (LAN) or a wide area network (WAN).
I/O interface 306 comprises any system for exchanging information to or from an external source. I/O devices 310 comprise any known type of external device, including a display device (e.g., monitor), keyboard, mouse, printer, speakers, handheld device, facsimile, etc. In one embodiment, an I/O device 310 such as a display device displays the task constraints defined in step 204 (see
I/O interface 306 also allows computing system 102 to store and retrieve information (e.g., program instructions or data) from an auxiliary storage device such as computer data storage unit 312. The auxiliary storage device may be a non-volatile storage device, such as a hard disk drive or an optical disc drive (e.g., a CD-ROM drive which receives a CD-ROM disk). Computer data storage unit 312 is, for example, a magnetic disk drive (i.e., hard disk drive) or an optical disk drive.
Memory 304 includes computer program code 314 that provides the logic for automatically optimizing workload scheduling (e.g., the process of
As will be appreciated by one skilled in the art, the present invention may be embodied as a system, method or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “system” (e.g., system 102). Furthermore, the present invention may take the form of a computer program product embodied in any tangible medium of expression (e.g., memory 304 or computer data storage unit 312) having computer-usable program code (e.g., code 314) embodied in the medium.
Any combination of one or more computer-usable or computer-readable medium(s) (e.g., memory 304 and computer data storage unit 312) may be utilized. The computer-usable or computer-readable medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus, device or propagation medium. A non-exhaustive list of more specific examples of the computer-readable medium includes: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a transmission media such as those supporting the Internet or an intranet, or a magnetic storage device. Note that the computer-usable or computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured via, for instance, optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory. In the context of this document, a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The computer-usable medium may include a propagated data signal with the computer-usable program code embodied therewith, either in baseband or as part of a carrier wave. The computer-usable program code may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc.
Computer program code (e.g., code 314) for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on a user's computer (e.g., computing system 102), partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network (not shown), including a LAN, a WAN, or the connection may be made to an external computer (e.g., through the Internet using an Internet Service Provider).
The present invention is described herein with reference to flowchart illustrations (e.g.,
These computer program instructions may also be stored in a computer-readable medium (e.g., memory 304 or computer data storage unit 312) that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer (e.g., computing system 102) or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in
While embodiments of the present invention have been described herein for purposes of illustration, many modifications and changes will become apparent to those skilled in the art. Accordingly, the appended claims are intended to encompass all such modifications and changes as fall within the true spirit and scope of this invention.