Conventional computing resources involve deployment of physical hardware at the user site to provide the computing and storage requirements of the user. Further, the deployed configuration must be sufficient to provide acceptable performance at peak demand times, when demand may be substantially greater than the average load. Therefore, a large user base such as a corporation, university or other enterprise is forced to invest substantially in on-site resources sufficient to handle peak demand, in addition to allowing for expected demand growth to avoid rapid obsolescence.
Advances in network technology, fueled in no small part by the Internet and other public access networks, however, have brought about networking capabilities sufficient to remove users from the physical hardware environment and to exchange computing resource requests and services remotely via a networked connection. Such capabilities have resulted in so-called “software as a service” (SaaS), or “cloud computing,” in which users such as corporations invoke remote servers for computing resources as needed, and thus pay only for the current level of resource demand. This relieves the need for investment in “worst case” system sizing and allows deployment of additional computing resources only when actually needed.
A virtual computing environment executes virtualization instances of computing systems as autonomous computing entities in a physical environment shared with other virtualization instances. Each virtualization instance has a configuration including a processor type and quantity, memory, and mass storage (i.e. disk) allocation. Further, each virtualization instance has a performance capacity based on a performance metric for identifying throughput in terms of a target primary application that the virtualization instance is designated to support. The performance metric is an enumeration of operations typically performed by the target application that the virtualization instance can complete per unit time, such as transactions per minute, web pages per hour, packets per second, etc.
The performance metric is defined by a baseline developed, computed or derived from application operation on a particular configuration, meaning that a certain number of operations (performance metric) is achievable with a configuration including a certain number of processors at a particular speed having access to a given memory allocation. For example, user sites are often deployed with a configuration defining an initial state suitable for handling a number of expected operations, plus a “burst” or “spike” allowance to accommodate typical deviations. However, user demands often expand as a system matures. Over time, expectations placed on the application cause the initial state of the virtualization instance to result in a performance shortfall in meeting additional operations requested of it.
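As a minimal sketch of such a baseline (the configuration values and operation counts below are hypothetical and chosen only for illustration), the benchmarked number of operations per unit time can be recorded against the configuration that achieved it:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Configuration:
    """A virtualization instance configuration: processors, speed, memory, disk."""
    cpus: int
    cpu_ghz: float
    memory_gb: int
    disk_gb: int

# Hypothetical baseline: operations per hour measured on a "pure" instance
# running only the target application on the given configuration.
BASELINE_OPS_PER_HOUR = {
    Configuration(cpus=2, cpu_ghz=2.4, memory_gb=8, disk_gb=200): 10_000,
    Configuration(cpus=4, cpu_ghz=2.4, memory_gb=16, disk_gb=400): 19_500,
    Configuration(cpus=8, cpu_ghz=2.4, memory_gb=32, disk_gb=800): 37_000,
}

def baseline_for(config: Configuration) -> int:
    """Return the benchmarked operations/hour achievable by a known configuration."""
    return BASELINE_OPS_PER_HOUR[config]
```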
In the case of particular applications, the performance metric establishes a ‘bridge’ connecting a particular business application and its throughput with the ideal, benchmark-tested ‘cloud’ infrastructure settings, or configuration. For a particular application, such as the database application depicted below, an operation is selected as an exemplary exchange or transaction that serves as the unit of performance. Subsequent cloud configurations, also discussed further below, provide a performance level in terms of a number of these operations. The various configurations are therefore normalized by the benchmark number of operations that they achieve.
The aforementioned SaaS or cloud-based approach typically involves a bank of servers that deploys virtual configurations for each of a plurality of users. The bank of servers deploys “virtualization instances,” which provide a user experience similar to that of a dedicated machine having a predetermined CPU and memory capability defined by a configuration of the virtualization instance.
Configurations disclosed herein are based, in part, on the observation that deployment of virtual hardware in the form of virtualization instances (or simply “instances”) can result in overprovisioning (excessive computing power for the demand) or underprovisioning (insufficient resources). Effectiveness of a virtualized computing resource, as opposed to installed hardware, relies on optimizing required resources by neither over-configuring nor under-configuring the computing resources needed to handle a current load, since the provisioned configuration is malleable, unlike a fixed room full of hardware.
Unfortunately, conventional approaches suffer from the shortcoming that a large base of installed hardware is required for handling short-term bursts or spikes in usage in order to avoid a shortfall in resources. Even if virtualization instances are deployed, reconfiguration required for handling a spike or burst in demand may not be timely.
Accordingly, configurations herein substantially overcome the above-described shortcomings by performing a substantially real-time reconfiguration response that configures additional computing resources (virtualization instances) based on an actual, rather than computed or projected, demand. The disclosed approach allows users such as corporations to effectively invoke remote servers having virtualization instances for computing resources as needed, and thus pay only for the current level of resource demand, as the allocated computing resources provide a substantially real-time response to a demand spike or surge, thus “elasticizing” applications to expand and contract to meet current demand. Such virtualization instances differ from conventional mainframe and terminal approaches because each user has a dedicated OS and processor unaffected by other users, in contrast to conventional approaches in which all users consume a portion of a single available CPU in a shared or “time sliced” manner.
A typical response to such underprovisioning is to clone the virtualization instance of the initial state with a configuration (e.g. processor and memory) sufficient to overcome the performance shortfall by providing a corresponding increase in the performance metric. However, arbitrarily cloning a virtualization instance does not necessarily yield an exact or mathematical correspondence to the resultant throughput. Often, the baseline of a particular virtualization instance is based on an unadulterated, or “pure,” instantiation running only the target application to which the operations apply. In practice, the instantiation has often been burdened with additional applications and/or overhead such that mere cloning does not achieve the expected baseline.
Accordingly, attempts to overcome a performance shortfall by determining an additional demand based on a performance metric (operations) for accommodating the additional load are likely to fall short of the true performance metric required to adequately handle the new load. The difference defines an extrinsic load resulting from modifications and overhead demands outside of a “pure” instantiation designated for handling only the target application. Accordingly, a true performance metric that accounts for the computed additional load and the extrinsic load over the initial performance metric identifies computing resources sufficient to handle the additional load. The identified computing resources define a configuration, including a number of processors (and speed), memory, and disk as exemplary components, corresponding to the true performance metric so as to accommodate the identified performance shortfall. Alternate combinations of computing resources available via a cloud infrastructure may also be defined in a configuration geared to supporting a specific performance metric, i.e. a number of operations of a target application, as discussed further below.
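The accounting described above can be sketched as follows; the figures are hypothetical and only illustrate that the true performance metric covers both the computed additional load and the extrinsic load beyond the initial performance metric:

```python
def true_performance_metric(initial_ops: float,
                            additional_ops: float,
                            extrinsic_ops: float) -> float:
    """True metric = initial baseline + computed additional load + extrinsic load.

    extrinsic_ops is capacity consumed by software and overhead added after the
    "pure" instantiation, which a computed or projected load would not capture.
    """
    return initial_ops + additional_ops + extrinsic_ops

# Hypothetical example: an instance benchmarked at 10,000 operations/hour is
# asked to absorb 4,000 more, while added applications and overhead already
# consume the equivalent of 1,500 operations/hour of its capacity.
required_ops = true_performance_metric(10_000, 4_000, 1_500)  # 15,500 ops/hour
```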
In further detail, the method for configuring a virtualization instance as disclosed herein includes instantiating a virtualization instance according to an initial configuration, such that the initial configuration is based on performance demands of a target, or primary, application. Operational conditions result in an indication of a performance shortfall of a virtualization instance, in which the performance shortfall is based on a computing metric indicative of a measure of computing resources for performing an operation by the primary application that the virtualization instance is configured to handle. The method includes determining, based on an impact analysis of a current state of the virtualization instance and the performance shortfall, a revised configuration having sufficient computing resources for handling a true computing demand including the performance shortfall, such that the current state includes an increase in computing demand beyond the initial configuration.
In the example arrangement discussed herein, in a virtual computing environment allocating instantiations of computing resources based on a computing metric of a primary application for execution on the instantiated computing resources, a method for increasing allocated computing resources includes receiving a load metric indicative of an additional load on an initial computing instantiation, such that the load metric defines a measure of additional computing resources responsive to the additional load. The impact analysis computes a true performance metric, based on the received load metric, for handling the additional load and extrinsic loads added after the initial computing instantiation, and instantiates an additional computing resource configured based on the computed true performance metric. In a particular configuration, a virtualization server ‘translates’ the business application throughput (BAT), typically measured in business transactions, number of users, or other operations, into the ideal, benchmark-tested ‘cloud’ infrastructure settings, or configuration. A program can be built on top of this ‘translation tool’ to manage short-term bursts/spikes or on-demand usage from a business application and achieve automatic application elasticity by instantiating additional VMs to meet demand.
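A rough sketch of such a ‘translation tool’ appears below; the library contents, function names, and the instantiate_vm callback are assumptions introduced for illustration rather than an API defined by the disclosure:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class InstanceConfig:
    cpus: int
    memory_gb: int
    ops_per_hour: int  # benchmarked throughput (BAT) of this configuration

# Hypothetical library bridging required operations/hour to benchmark-tested
# cloud configurations able to supply them.
CONFIG_LIBRARY = (
    InstanceConfig(cpus=2, memory_gb=8, ops_per_hour=10_000),
    InstanceConfig(cpus=4, memory_gb=16, ops_per_hour=19_500),
    InstanceConfig(cpus=8, memory_gb=32, ops_per_hour=37_000),
)

def translate(required_ops: int) -> InstanceConfig:
    """Pick the smallest benchmarked configuration covering the demand."""
    for cfg in sorted(CONFIG_LIBRARY, key=lambda c: c.ops_per_hour):
        if cfg.ops_per_hour >= required_ops:
            return cfg
    return max(CONFIG_LIBRARY, key=lambda c: c.ops_per_hour)

def on_demand_spike(additional_ops: int, extrinsic_ops: int,
                    instantiate_vm: Callable[[InstanceConfig], None]) -> None:
    """Elasticity hook: translate the true demand and instantiate an added VM."""
    instantiate_vm(translate(additional_ops + extrinsic_ops))
```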
Alternate configurations of the invention include a multiprogramming or multiprocessing computerized device such as a multiprocessor, controller or dedicated computing device in either a handheld, mobile, or desktop form or the like configured with software and/or circuitry (e.g., a processor as summarized above) to process any or all of the method operations disclosed herein as embodiments of the invention. Still other embodiments of the invention include software programs such as a Java Virtual Machine and/or an operating system that can operate alone or in conjunction with each other with a multiprocessing computerized device to perform the method embodiment steps and operations summarized above and disclosed in detail below. One such embodiment comprises a computer program product that has a non-transitory computer-readable storage medium including computer program logic encoded as instructions thereon that, when performed in a multiprocessing computerized device having a coupling of a memory and a processor, programs the processor to perform the operations disclosed herein as embodiments of the invention to carry out data access requests. Such arrangements of the invention are typically provided as software, code and/or other data (e.g., data structures) arranged or encoded on a computer readable medium such as an optical medium (e.g., CD-ROM), floppy or hard disk or other medium such as firmware or microcode in one or more ROM, RAM or PROM chips, field programmable gate arrays (FPGAs) or as an Application Specific Integrated Circuit (ASIC). The software or firmware or other such configurations can be installed onto the computerized device (e.g., during operating system execution or during environment installation) to cause the computerized device to perform the techniques explained herein as embodiments of the invention.
The foregoing will be apparent from the following description of particular embodiments disclosed herein, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles disclosed herein.
The disclosed configuration performs an impact analysis using an initial configuration based on an initial performance metric (e.g. N operations per hour) and a load metric indicative of the additional load needed (e.g. M additional operations per hour). A true performance metric is based on the initial performance metric, the load metric, and an extrinsic load unaccounted for in the additional load; the extrinsic load includes extraneous and user-imposed factors, such as additional software and overhead, not accounted for in a “computed” or “projected” additional load of the primary application. An instance manager then deploys a revised configuration for handling the true performance metric defined in terms of operations of the primary application.
Computing resources are often measured in terms of a computing metric, or simply “metric,” that defines a number of operations of a target application per unit time. Upon an initial deployment, the computing resource has a configuration deemed acceptable for handling an expected computing load defined by the computing metric. For example, a database application, such as SAP, may define so-called “SAPs” per hour, referring to a number of database exchanges (reads and writes) that may be performed. A different target application, such as an Oracle® database, may specify transactions per hour. Other target applications may specify an alternate metric, such as pages per hour for a web page based operation.
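Purely for illustration (the pairings below restate examples from this description and are not an exhaustive or authoritative list), the computing metric can be treated as an application-specific unit:

```python
# Each target application normalizes throughput as operations per unit time;
# the unit itself is application specific.
COMPUTING_METRIC_UNITS = {
    "sap_database": "SAPs per hour",            # database exchanges (reads and writes)
    "oracle_database": "transactions per hour",
    "web_server": "pages per hour",
}
```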
For the applicable target application, an initial configuration specifies a combination of computing resources, primarily defined by a number of processors, a processor speed, and available memory (RAM). Disk space may also be specified. This initial configuration is deemed to handle the initial computing load presented by the target (primary) application in an unburdened system. The unburdened system defines a “pure” installation for handling only the target application, without any other applications or extrinsic loads that consume computing resources and thus diminish a true performance metric that defines the number of operations actually achievable as the initial configuration becomes burdened with additional tasks.
The virtualization servers 120 (servers), such as a blade server 120, typically have a plurality of processors and memory for allocation to a plurality of users, and an interface to a mass storage subsystem 121, typically a set of disk drives or SSDs (solid state devices). In contrast to conventional multiprogramming environments, however, where multiple users share a single processing environment through an operating system scheduler, a virtualization environment employs an instance manager 123.
The virtualization servers 120, therefore, each include one or more virtualization instantiations 150-1 . . . 150-3 (150 generally).
To meet the performance demand, multiple virtualization instances 150, or VMs (virtual machines), may be configured to execute in parallel. Over time, however, performance of the initial configuration may be surpassed by user demand. An operator may elect to instantiate an additional virtualization instance 150 to address the performance shortfall. Determination of an optimal configuration for meeting the additional demand includes an impact analysis to identify an adequate configuration. The performance shortfall specifies a number of the operations of the primary application, such as a shortfall of 100 SAPs per hour, for example. However, additional user demands may present additional overhead and requirements such that a configuration based simply on the stated shortfall of operations may be insufficient to account for a true performance metric representative of a configuration sufficient to address the actual increase.
In alternate arrangements, various permutations of cloud-based computing resources may define a configuration. In practice, the elusive and seemingly infinite “cloud” ultimately resolves to multiple virtual servers competing for the same physical resources (such as physical CPU, disk I/O, network bandwidth, etc.); however, such details are shielded from the user by the cloud infrastructure. Configurations herein normalize the cloud performance in terms of a standardized “operation” deemed to depict a representative transaction, or quantity of computing power, for a target application. A number of operations therefore defines a performance level attributed to a particular configuration, i.e. an allocated combination of cloud resources including, but not limited to, a number and type of processors, memory, and disk (non-volatile storage). Various other attributes of a configuration may be “tuned,” or enumerated, to provide a tunable cloud infrastructure, such as in terms of bandwidth, response time, QoS (Quality of Service) allocations, and the like.
Upon configuration, each instance 150 is typically launched for executing a particular primary application 162-1 . . . 162-3 (162 generally). The primary application 162 is, for example, a database application, web server application, or other application that the instance 150 is designated for handling on behalf of the user 114. The desired configuration 160 is allocated for supporting the primary application 162 at a particular performance metric (i.e. N operations per unit time). However, other changes can degrade this performance over time. For example, users often install additional applications 170-1 . . . 170-3 (170 generally), such as mail, word processing, and other extraneous applications in addition to the primary application 162. These additional applications 170 impose an extrinsic load on the instance 150 and contribute to a performance shortfall in achieving the performance metric of the primary application that the instance 150 was benchmarked to handle using the initial configuration 160.
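One way to quantify that extrinsic load, sketched under the assumption that the achieved throughput of the primary application 162 can be observed, is to compare it with the benchmarked “pure” throughput of the initial configuration 160 (the figures are hypothetical):

```python
def extrinsic_load_ops(pure_benchmark_ops: float, observed_ops: float) -> float:
    """Capacity (operations/hour) consumed by applications and overhead other
    than the primary application, i.e. the extrinsic load on the instance."""
    return max(0.0, pure_benchmark_ops - observed_ops)

# Hypothetical: an instance benchmarked at 10,000 ops/hour for the primary
# application sustains only 8,200 ops/hour once mail, word processing, and
# other extraneous applications are installed alongside it.
extrinsic = extrinsic_load_ops(10_000, 8_200)   # 1,800 ops/hour of extrinsic load
```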
The cloud infrastructure 1110 supports one or more instances 150, each residing on a particular server 120 from among one or more servers 120-N in the cloud infrastructure 1110. The instance manager 123 selects one of the instances 150 for cloning of the target application APP0. The cloned instance 150′ is based on a base or ideal configuration best suited for cloning, discussed further below.
Upon approval at step 720, the virtualization manager (VM) 123 (instance manager) clones the existing instance 150′ (step 731), and customizes the added instance 150″ at step 732 for the approved revision of the desired configuration to increase the target operations performed by the primary application (SAP, in the example shown). The instance manager 123 then launches the added instance 150″ using the revised configuration at step 734.
As shown by timeline 750, determination of the revised configuration occurs in 5-7 minutes, followed by about 1 second to customize 752 and about 1 minute to launch 754 the new cloned instance 150″, illustrating the timely response to a surge of user requests for additional services 122.
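The clone, customize, and launch sequence could be orchestrated roughly as in the sketch below; the instance-manager methods shown are hypothetical stand-ins, not an API defined by the disclosure:

```python
def scale_out(manager, base_instance, revised_config):
    """Mirror steps 731-734: clone an ideal base instance, customize the added
    instance to the revised configuration, then launch it."""
    added = manager.clone(base_instance)       # step 731: clone the ideal instance
    manager.customize(added, revised_config)   # step 732: apply the revised configuration
    manager.launch(added)                      # step 734: launch the customized instance
    return added
```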
In the example above, memory and CPUs are manipulated as infrastructure variables for meeting a number of operations of a target application. Such operations define the units of a BAT (Business Application Throughput) for a particular application, such as the example operations above, hence providing a quantifiable measure of application throughput for that application. Discussed below is an extension of the infrastructure variables as a plurality of configurable parameters defining a computing resource. The infrastructure variables may be tuned to meet a BAT demand for a particular application, as discussed further below.
Optimal setting of the infrastructure variables is more granular than conventional approaches, in which the physical infrastructure handles such requests with a so-called ‘oversize’ method that leads to unnecessary idle resources. However, with virtualization technology two other issues emerge: (1) determining clear guidance on best-practice VM configuration for a given business application, and (2) resolving the ‘competing’ situation arising from multiple VMs 150 on the same physical server 120 hosting different target applications. Configurations herein address the first issue using a methodology of progressive benchmark measurement to find the ideal VM configuration for a given business application throughput request, through which a library can be created to connect BAT (operations) to the ideal VM configuration in a cloud infrastructure 1110.
Configurations addressing the second issue employ logic based on the configurations of all VMs 150 on a physical server 120. The logic calculates two extreme scenarios: (a) all of the VMs running at 100%, and (b) only the specific VM of concern running while all other VMs are dormant. This calculation provides guidance on the theoretical capacity range of a VM when making VM/resource allocation decisions. A program utilizing these solutions can make the best decision on both questions of ‘what type of VM to create’ and ‘where to put the VM in the cloud,’ and provide elasticity to all mission-critical applications.
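A minimal sketch of the two-extreme-scenario calculation follows; the proportional-sharing model and the weights are assumptions made for illustration rather than a method specified by the disclosure:

```python
def capacity_range_ops(host_ops_per_hour: float,
                       vm_weights: list[float],
                       vm_index: int) -> tuple[float, float]:
    """Theoretical throughput range for the VM at vm_index on a shared server.

    Lower bound (scenario a): every co-located VM runs at 100%, so this VM
    receives only its proportional share of the physical capacity.
    Upper bound (scenario b): all other VMs are dormant, so this VM can use
    capacity up to its own configured weight of the host.
    """
    lower = host_ops_per_hour * vm_weights[vm_index] / sum(vm_weights)
    upper = host_ops_per_hour * min(1.0, vm_weights[vm_index])
    return lower, upper

# Hypothetical host rated at 40,000 ops/hour with three equally weighted,
# oversubscribed VMs (weights sum to 1.5).
low, high = capacity_range_ops(40_000, [0.5, 0.5, 0.5], 0)  # (~13,333, 20,000)
```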
During the allocation of additional resources, the instance manager 123 thus identifies, in the multi-dimensional mapping 1210, an optimal zone 1250 indicative of infrastructure variables that maximize the application throughput. Computing the multidimensional mapping 1210 further includes performing a series of impact analyses, each based on a set of infrastructure variables defining available computing resources, and generating, from each impact analysis, a metric result indicative of the application throughput (i.e. operations) attainable with that set of infrastructure variables.
In further detail, determining the optimal zone 1250 includes varying the infrastructure variables 1120 in successive impact analyses until a plateau occurs in the computed result 1252, indicative of a limit in additional performance. A series of trials varies the infrastructure variables 1120 on each successive impact analysis to identify the optimal zone 1250 of infrastructure variable settings 1120. Additional infrastructure variables 1120 may be applied, each specifying a quantity of a configuration parameter for a virtualization instance 150, in which the configuration parameter denotes a type and function of a computing resource provided by the virtualization instance and adds a dimension to the multidimensional mapping 1210.
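The search for the optimal zone 1250 might proceed as in the sketch below; the impact-analysis callback, the step sizes, and the plateau tolerance are illustrative assumptions:

```python
from typing import Callable, Dict

def find_optimal_zone(run_impact_analysis: Callable[[Dict[str, int]], float],
                      start: Dict[str, int],
                      step: Dict[str, int],
                      plateau_tolerance: float = 0.02,
                      max_trials: int = 20) -> Dict[str, int]:
    """Vary infrastructure variables on successive impact analyses until the
    resulting application throughput (operations) plateaus.

    run_impact_analysis returns the throughput attainable with a given setting
    of the infrastructure variables; one call corresponds to one impact analysis.
    """
    settings = dict(start)
    best_ops = run_impact_analysis(settings)
    for _ in range(max_trials):
        trial = {name: value + step.get(name, 0) for name, value in settings.items()}
        ops = run_impact_analysis(trial)
        if ops <= best_ops * (1.0 + plateau_tolerance):
            break                    # plateau reached: little gain from more resources
        settings, best_ops = trial, ops
    return settings                  # a setting within the optimal zone
```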
At a later point in time, the user interface 500 receives an indication of a performance shortfall of at least one virtualization instance 150-N, in which the performance shortfall is based on a computing metric indicative of a measure of computing resources for performing an operation by a primary application 162 that the virtualization instance 150 is configured to handle, as depicted at step 304. The computing metric includes a benchmark based on the primary application 162 that the virtualization instance 150 was configured to handle, such that the benchmark defines a quantity of operations per unit time, as shown at step 305, and is usually derived from the primary application vendor using unburdened or dedicated instances, thus lacking the more realistic assessment provided by the impact analysis. The computing metric employed defines a quantity of the operations per unit time by the primary application 162, and the performance shortfall specifies an additional number of operations per unit time deemed required for overcoming the shortfall, and is also often based on vendor-published guidelines, as depicted at step 306. Computing the true performance metric is therefore based on an impact analysis of the computing instantiation 150 for determining the computing burdens of additional software 170 installed and executing on the initial computing instantiation.
The reconfigure logic 712 or a related process performs an impact analysis on the current configuration using the determined load metric, wherein the additional computing resource is a virtualization instance 150 having a configuration 160 for handling the identified additional load 540 and the extrinsic load, as disclosed at step 307. One particular mechanism for performing the impact analysis is depicted in the accompanying drawings.
The extrinsic computing burden (load) in the example arrangement is based on computing expectations beyond the benchmark of the primary application that an initial configuration of the virtualization instance is configured to handle, as defined at step 310. Accordingly, the extrinsic computing burden includes computing demands from applications 170 executing on the virtualization instance 150 other than the primary application 162, wherein the initial configuration is based only on the computing demands from execution of the primary application 162 and thus omits other user-created loads, as depicted at step 311. The extrinsic computing burden is therefore based on additional computing demands beyond the primary application 162, wherein the initial configuration defines a configuration for supporting a number of operations corresponding to the performance demands at a time of deployment of the initial configuration, as disclosed at step 312.
Upon completion of the impact analysis, the instance manager 123 computes, based on the true computing demand, the hardware resources of the revised configuration, such that the hardware resources are computed based on an association of processors and memory to a quantity of operations, as depicted at step 313. The additional computing resource is defined by a revised configuration and includes a number of processors, a speed for each of the processors and a memory allocation, such that the revised configuration is based on a number of operations of the primary application, as shown at step 314. The revised configuration therefore includes a number of processors, a speed for each of the processors and a memory allocation, in which the configuration is based on a number of operations of the primary application (SAPs, transactions, etc., depending on the primary application), as depicted at step 315.
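A hedged sketch of that sizing step follows; the per-processor and per-gigabyte ratios stand in for the benchmarked association of processors and memory to a quantity of operations, and the specific values are invented for illustration:

```python
import math

def size_revised_configuration(true_ops_per_hour: float,
                               ops_per_cpu: float = 5_000.0,
                               ops_per_gb: float = 1_250.0,
                               cpu_ghz: float = 2.4) -> dict:
    """Convert the true computing demand (operations/hour) into a revised
    configuration of processors, processor speed, and memory."""
    return {
        "cpus": math.ceil(true_ops_per_hour / ops_per_cpu),
        "cpu_ghz": cpu_ghz,
        "memory_gb": math.ceil(true_ops_per_hour / ops_per_gb),
    }

# Hypothetical: a true demand of 15,500 operations/hour.
revised = size_revised_configuration(15_500)
# -> {'cpus': 4, 'cpu_ghz': 2.4, 'memory_gb': 13}
```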
It will be appreciated by those skilled in the art that alternate configurations of the disclosed invention include a multiprogramming or multiprocessing computerized device such as a workstation, handheld or laptop computer or dedicated computing device or the like configured with software and/or circuitry (e.g., a processor as summarized above) to process any or all of the method operations disclosed herein as embodiments of the invention. Still other embodiments of the invention include software programs such as a Java Virtual Machine and/or an operating system that can operate alone or in conjunction with each other with a multiprocessing computerized device to perform the method embodiment steps and operations summarized above and disclosed in detail below. One such embodiment comprises a computer program product that has a computer-readable storage medium including computer program logic encoded thereon that, when performed in a multiprocessing computerized device having a coupling of a memory and a processor, programs the processor to perform the operations disclosed herein as embodiments of the invention to carry out data access requests. Such arrangements of the invention are typically provided as software, code and/or other data (e.g., data structures) arranged or encoded on a non-transitory computer readable storage medium such as an optical medium (e.g., CD-ROM), floppy or hard disk or other medium such as firmware or microcode in one or more ROM, RAM or PROM chips, field programmable gate arrays (FPGAs) or as an Application Specific Integrated Circuit (ASIC). The software or firmware or other such configurations can be installed onto the computerized device (e.g., during operating system execution or during environment installation) to cause the computerized device to perform the techniques explained herein as embodiments of the invention.