Cloud computing provides various performance advantages, including the ability to distribute applications into workloads executed across cloud computing systems. However, users (e.g., enterprise users) may often overspend on cloud instances by overprovisioning the instances relative to the workloads actually being executed, resulting in additional cost and unused compute resources. It may be difficult for users to predict which instances may be best for a particular workload, especially when the workload has not yet been executed on a particular architecture. For example, when first moving a workload from an on-premises computing cluster to a cloud computing platform with a different architecture, determining the proper instance for the workload in the cloud computing platform may be challenging.
Examples described herein may profile a workload and predict behavior of the workload on a target computing environment with, in some examples, a target instance type. In some examples, the behavior may be predicted based on one or more key performance indicators for the workload, which key performance indicators may be provided and/or specified by a user. For example, methods may use function approximation to model a relationship between one or more of an application's key performance indicators and the underlying infrastructure or architecture of the computing environment. Methods may use transfer learning to determine how the workload may behave or execute in other computing environments with different instance types or architectures. This transfer learning may be advantageous, for example, when predicting execution behavior across different architectures. For example, one architecture might be a hyperconverged system where input/output (I/O) is co-located with compute resources (e.g., processors), such that data proximity is important to performance. Another architecture may be a three-tier system, where data is retrieved over a network, so data locality does not affect performance in the same manner. These differences may make it challenging to predict how a workload executing at a hyperconverged architecture may execute at a three-tier architecture. Further, methods may use exploration and exploitation techniques to converge on a recommended or suggested instance configuration and size.
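By way of illustration only, the function-approximation step might be sketched as follows, assuming a small table of historical key performance indicator (KPI) observations is available; the feature layout, the example values, and the use of scikit-learn are assumptions made for this sketch rather than details of any described embodiment.

```python
# Hypothetical sketch: approximate the function mapping instance
# configuration to a key performance indicator (KPI), assuming a
# table of historical observations is available.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Each row: [vCPUs, memory_gb, local_storage (1=co-located, 0=networked)]
X = np.array([
    [4, 16, 1],
    [8, 32, 1],
    [4, 16, 0],
    [8, 32, 0],
])
# Observed KPI for each configuration, e.g., requests served per second.
y = np.array([950.0, 1800.0, 610.0, 1150.0])

kpi_model = GradientBoostingRegressor().fit(X, y)

# Predict the KPI for a candidate instance on a three-tier (networked
# storage) architecture before the workload is ever deployed there.
candidate = np.array([[8, 16, 0]])
print(kpi_model.predict(candidate))
```

In a sketch like this, the fitted regressor stands in for the modeled relationship between instance configuration and a KPI; any function approximator trained on richer telemetry could play the same role.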
Various embodiments of the present disclosure may provide methods to automatically detect candidate cloud instances for relocation (such as virtual machine or container workload instances). Candidate cloud instances may include inefficient cloud instances, which may generally refer to instances of processes which are operating below a threshold level of performance. Various methods may use machine learning to find an optimal environment for the candidate instance based on the workload, minimum acceptable performance, cost minimization, and/or other parameters. For example, methods may fingerprint a workload to provide a fingerprint indicative of particular characteristics of the workload. The fingerprint may be used to select a computing environment and/or instance type based on a model for right-sizing the instance. The model may be generated, for example, using a database of performance metrics for various real-world workloads executing in various computing environments. For example, a machine learning model may be trained based on data from workloads executing in various computing environments. In this manner, the machine learning model may be used to predict the performance of a particular workload in a particular computing environment. In some implementations, a user may set minimum acceptable performance parameters that may be taken into account when determining a location for a particular workload and/or an optimal instance type.
Various methods described herein may also provide for automatic right-sizing of instances. For example, once a recommendation is obtained, a new instance (e.g., virtual machine and/or container) for the workload may be deployed based on the recommendation. The recommendation may take into account multiple instance types across candidate computing environments and architectures. For example, if a workload at a first cloud environment is very costly, the methods may suggest a similar instance on a second, less expensive cloud environment.
Examples of systems and methods described herein may allow users to utilize available cloud resources and ensure workloads run smoothly by bursting instances of the workload to the cloud using spot instances. For example, a method may profile a workload and its key performance indicators and use game theory to determine when and how to burst instances of the workload to the cloud using spot instances. The profile may include whether a workload is stateful or stateless. Example methods may utilize the profile to identify resource requirements and predict surges in demand based on, for example, past demand patterns captured in the profile. If current instances of the workload do not have enough capacity, the method may determine a bid price to obtain additional spot instances at cloud computing platforms to provide additional capacity. The bid price may be determined using game theory. The system may also include a networking domain spanning the various computing platforms over a transit gateway and a program load balancer.
Various embodiments of the present disclosure will be explained below in detail with reference to the accompanying drawings. Other embodiments may be utilized, and structural, logical, and electrical changes may be made without departing from the scope of the present disclosure.
The network 110 may include any type of network capable of routing data transmissions from one network device (e.g., of the computing clusters 112, the central computing system 106, and/or the cloud computing system 114) to another. For example, the network 110 may include a local area network (LAN), wide area network (WAN), intranet, or a combination thereof. The network 110 may include a wired network, a wireless network, or a combination thereof.
Each of the computing clusters 112 may be hosted on a respective computing cluster platform having multiple computing nodes (e.g., each with one or more processors, volatile and/or non-volatile memory, communication or networking hardware, input/output devices, or any combination thereof) and may be configured to host additional resources, such as virtual machines (VMs), platform as a service (PaaS) software stacks, containers, virtualization managers and/or other resources. Each of the cloud computing systems 114 may be hosted on a respective public or private cloud computing platform (e.g., each including one or more data centers with a plurality of computing nodes or servers having processor(s), volatile and/or non-volatile memory, communication or networking hardware, input/output devices, or any combination thereof) and may be configured to host respective virtual machines, containers, virtualization managers, software stacks, or other various types of software components. Examples of computing platforms may include any one or more of a computing cluster platform, a bare metal system platform, or a cloud computing platform. Examples of service domains may be instantiated on any of the computing clusters 112 or 118 or the cloud computing systems 114 and 116. Software located at any of the computing platforms may include instructions that are stored on a computer readable medium (e.g., memory, storage, disks, etc.) that are executable by one or more processors (e.g., central processing units (CPUs), graphics processing units (GPUs), tensor processing units (TPUs), hardware accelerators, video processing units (VPUs), etc.) to perform functions, methods, etc., described herein.
The manager 108 hosted on the central computing system 106 may centrally manage the computing platforms, including monitoring virtual machines, containers, and workloads executing at the computing platforms. For example, the manager 108 may monitor or receive information about resource usage of various workload instances (which may be, for example, virtual machines or containers). The manager 108 may also, in some implementations, configure instances of workloads, create new instances of workloads, or consolidate instances of workloads based on the resource usage of the workloads. Such actions may be taken in response to a request from a user, after consent from a user, or automatically based on user settings. Users may include individuals (e.g., system administrators, consumers, customers), enterprises, and/or other processes or instances. The central computing system 106 may include one or more computing nodes configured to host the manager 108. The central computing system 106 may include a cloud computing system and the manager 108 may be hosted in the cloud computing system and/or may be delivered/distributed using a software as a service (SaaS) model, in some examples. In some examples, the manager 108 may be distributed across a cluster of nodes of the central computing system 106.
In some examples, an administrative computing system 102 may host a manager interface 104. The manager interface 104 may facilitate user or customer communication with the manager 108 to control operation of the manager 108. The manager interface 104 may include a graphical user interface (GUI), APIs, command line tools, etc., that each may facilitate interaction between one or more users and the manager 108. The manager interface 104 may provide an interface that allows a user to monitor and configure virtual machines, containers, and/or workloads at each of the computing platforms managed by the manager 108. The manager interface 104 may also provide a view of all instances at computing platforms managed by the manager 108, regardless of the vendor, architecture, or other differences associated with the computing platforms. The manager interface 104 may also allow a user to select preferences for right sizing of workload instances across computing platforms, specify performance criteria for specific workloads, and respond to requests from the manager 108 regarding recommended resource provisions for various workloads.
Each host machine 202, 204, and 206 may run virtualization software, such as VMWARE ESX(I), MICROSOFT HYPER-V, or REDHAT KVM. The virtualization software includes hypervisors 230, 232, and 234 to create, manage, and destroy user VMs, as well as to manage the interactions between the underlying hardware and user VMs. User VMs may run one or more applications that may operate as “clients” with respect to other elements within clustered virtualization environment 200.
CVMs 224, 226, and 228 may be used in some examples to manage storage and input/output (“I/O”) activities according to particular embodiments. These controller VMs may act as the storage controller in the currently described architecture. Multiple such storage controllers may coordinate within a cluster to form a unified storage controller system. CVMs may run as virtual machines on the various host machines, and work together to form a distributed system that manages all the storage resources, including local storage, network-attached storage 210, and cloud storage 208. The CVMs may connect to network 254 directly, or via a hypervisor. Because the CVMs run independently of the hypervisors 230, 232, and 234, the current approach can be used and implemented within any virtual machine architecture, since the CVMs of particular embodiments can be used in conjunction with any hypervisor from any virtualization vendor. In some examples, however, CVMs may not be used, and the hypervisors may perform the functions attributed to the CVMs.
A host machine may be designated as a leader node within a cluster of host machines. For example, host machine 204, as indicated by the asterisks, may be a leader node. A leader node may have a software component designated to perform operations of the leader. For example, CVM 226 on host machine 204 may be designated to perform such operations. A leader may be responsible for monitoring or handling requests from other host machines or software components on other host machines throughout the virtualized environment. If a leader fails, a new leader may be designated. In particular embodiments, a management module (e.g., in the form of an agent) may be running on the leader node.
Each CVM 224, 226, and 228 may export one or more block devices or NFS server targets that appear as disks to user VMs 212, 214, 216, 218, 220, and 222. These disks are virtual, since they are implemented by the software running inside CVMs 224, 226, and 228. Thus, to user VMs, CVMs appear to be exporting a clustered storage appliance that contains some disks. User data (including the operating system) in the user VMs may reside on these virtual disks.
Significant performance advantages can be gained by allowing the virtualization system to access and utilize local storage 236, 238, and 240 as disclosed herein. This is because I/O performance is typically much faster when accessing local storage as compared to accessing network-attached storage 210 across the network 254. This faster performance for locally attached storage can be increased even further by using certain types of optimized local storage devices, such as SSDs. Accordingly, the manager 108 may suggest instantiation of, or automatically instantiate, instances of workloads that heavily utilize data at local storage 236, 238, and 240 at the hyperconverged cluster 200. Workloads instantiated at the hyperconverged cluster 200 may be, in various examples, user VMs, containers, or other abstractions providing access to resources of the hyperconverged cluster 200.
The computing node 300 includes a communications fabric 322, which provides communication between one or more processor(s) 312, memory 314, local storage 302, communications unit 320, and I/O interface(s) 310. The communications fabric 322 can be implemented with any architecture designed for passing data and/or control information between processors (such as microprocessors, communications and network processors, etc.), system memory, peripheral devices, and any other hardware components within a system. For example, the communications fabric 322 can be implemented with one or more buses.
The memory 314 and the local storage 302 are computer-readable storage media. In this embodiment, the memory 314 includes random access memory (RAM) 316 and cache 318. In general, the memory 314 can include any suitable volatile or non-volatile computer-readable storage media. In an embodiment, the local storage 302 includes an SSD 304 and an HDD 306. The memory 314 may hold computer readable instructions, files, data, etc., for execution by one or more of the processors 312 of the computing node 300. For example, the memory 314 includes manager instructions 324 which, when executed by the processors 312, implement the manager 108 of the central computing system 106. Where the computing node 300 is implemented as a node in a computing platform or cluster monitored by the central computing system 106, the memory 314 may hold computer readable instructions for workloads instantiated at the respective computing platform by the manager 108.
Various computer instructions, programs, files, images, etc., may be stored in local storage 302 for execution by one or more of the respective processor(s) 312 via one or more memories of memory 314. In some examples, local storage 302 includes a magnetic HDD 306. Alternatively, or in addition to a magnetic hard disk drive, local storage 302 can include the SSD 304, a semiconductor storage device, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, or any other computer-readable storage media that is capable of storing program instructions or digital information.
The media used by local storage 302 may also be removable. For example, a removable hard drive may be used for local storage 302. Other examples include optical and magnetic disks, thumb drives, and smart cards that are inserted into a drive for transfer onto another computer-readable storage medium that is also part of local storage 302.
Communications unit 320, in some examples, provides for communications with other data processing systems or devices. In these examples, communications unit 320 includes one or more network interface cards. Communications unit 320 may provide communications through the use of either or both physical and wireless communications links. For example, the communications unit 320 may provide connection to the network 110, allowing for communication with the administrative computing system 102, the service domains managed by the manager 108, and other locations.
I/O interface(s) 310 allow for input and output of data with other devices that may be connected to a computing node 300. For example, I/O interface(s) 310 may provide a connection to external devices such as a keyboard, a keypad, a touch screen, and/or some other suitable input device. External devices can also include portable computer-readable storage media such as, for example, thumb drives, portable optical or magnetic disks, and memory cards. Software and data used to practice embodiments of the present disclosure can be stored on such portable computer-readable storage media and can be loaded onto local storage 302 via I/O interface(s) 310. I/O interface(s) 310 may also connect to a display 308.
Display 308 provides a mechanism to display data to a user and may be, for example, a computer monitor. In some examples, a GUI associated with the manager interface 104 may be presented on the display 308.
While not shown, in some examples, computing node(s), such as computing node 300 may be configured to execute a hypervisor, a controller virtual machine (VM) and one or more user VMs. The user VMs may be virtual machine instances executing on the computing node 300. The user VMs may share a virtualized pool of physical computing resources such as physical processors (e.g., hardware accelerators) and storage (e.g., local storage, cloud storage, and the like). The user VMs may each have their own operating system, such as Windows or Linux. Generally any number of user VMs may be implemented. User VMs may generally be provided to execute any number of applications which may be desired by a user and may be workloads instantiated and managed by the manager 108 of the central computing system 106. The computing node 300 may also be configured to execute containers or other abstractions as workloads managed by the manager 108.
The central computing system 430 may be any type of computing device including one or more processors 426 and memory 428, and may be implemented using any of the devices or methods discussed with respect to the central computing system 106. For example, the central computing system 430 may be implemented by the computing node 300. The memory 428 may store instructions which, when executed by the processors 426, implement the manager 434, including the workload manager 432. The memory 428 may further store various models and data used and/or created (e.g., trained) by the workload manager 432, such as a provisioning model 436, bursting model 438, workload data 440, and computing environment data 442. In some implementations, the data and models may be stored at another storage location local to the central computing system 430 or otherwise accessible by the central computing system 430.
The workload manager 432 may monitor and configure workloads managed by the workload manager 432 in various computing environments. In some examples, the workload manager 432 may provide a suggested or recommended instance for a workload in a specific computing environment based on performance indicators of the workload, which performance indicators may be generated based on execution of the workload at a second computing environment. A suggested instance may include a suggested resource allocation (e.g., size of the workload) in the computing environment to meet performance standards for the workload while reducing inefficient use of computing resources.
The workload manager 432 may also continually monitor multiple workloads at multiple computing environments to identify workloads that may be overprovisioned (e.g., allocated more computing resources than used), underprovisioned (e.g., not allocated enough computing resources to maintain desired performance), or constrained. The workload manager 432 may, in some implementations, automatically resize workloads that are overprovisioned or underprovisioned and may create additional instances of workloads based on the workload's demand patterns. The workload manager 432 may also act as a load balancer, ensuring that workloads and/or individual instances of workloads are distributed across available computing environments. For example, the workload manager 432 may identify workloads which may execute more efficiently in an alternate computing environment. The workload manager 432 may leverage computing environment data 442, workload data 440, the bursting model 438, and the provisioning model 436 to monitor workloads, suggest instances of workloads in specific computing environments, resize workloads, create new instances of workloads, and perform other functions.
The computing environment data 442 may include various information about the computing environments managed by the manager 434, including the cloud environment 402 and the cloud environment 404. A computing environment may be managed by the manager 434 where the manager 434 is in communication with the computing environment and has the ability to create and configure instances of workloads at the computing environment. The manager 434 may manage a portion of a computing environment, such as managing reserved instances in a public cloud environment. In various embodiments, data stored as computing environment data 442 may include total compute resources available, total compute resources utilized, number and type of reserved instances at the computing environment, price to obtain a spot instance at the computing environment, fingerprints of workloads executing at the computing environment, architecture of the computing environment, security features of the computing environment, available configurations for virtual machines in the computing environment, and available configurations for containers in the computing environment, among other data. The computing environment data 442 may include data for computing environments available for use by the manager 434, even where no workloads managed by the workload manager 432 are actively running at the computing environment. Such data may be stored using various types of data structures and may be distributed across multiple physical memory or storage locations of the central computing system 430 or storage locations accessible by the central computing system 430.
The workload data 440 may include various information about the workloads of applications managed by the workload manager 432. For example, workload data 440 may include characteristics of the workload (e.g., whether the workload is stateful or stateless), demand patterns of the workload, a fingerprint of the workload, performance indicators of the workload in various computing environments, location of data used by the workload, security or privacy parameters, existing instances of the workload, available configurations of the workload (e.g., container or VM), etc. The workload data 440 may include data for workloads executing in the computing environments managed by the manager 434 and other workloads managed by the workload manager 432. In some implementations, the workload data 440 may include data for workloads not managed by the workload manager 432 that may be used by the workload manager 432, for example, in generating models or otherwise providing comparison to workloads managed by the workload manager 432. The workload data 440 may be stored using various types of data structures and may be distributed across multiple physical memory or storage locations of the central computing system 430.
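Purely as an illustration of how such records might be organized, the following sketch uses Python dataclasses; every field name here is a hypothetical stand-in rather than a required schema.

```python
# Hypothetical record layouts for computing environment data and
# workload data; field names are illustrative only.
from dataclasses import dataclass, field

@dataclass
class ComputingEnvironmentRecord:
    name: str
    architecture: str            # e.g., "hyperconverged" or "three-tier"
    total_cpus: int
    cpus_utilized: int
    spot_price_per_hour: float
    reserved_instance_types: list[str] = field(default_factory=list)

@dataclass
class WorkloadRecord:
    name: str
    stateful: bool               # whether the workload keeps persistent state
    fingerprint: list[float]     # reduced representation of the workload
    demand_pattern: list[float]  # e.g., hourly demand over a day
    environments: list[str] = field(default_factory=list)
```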
The workload manager 432 may provide performance indicators of a workload in a first computing environment (e.g., a hyperconverged infrastructure environment) to the provisioning model 436 and receive one or more suggested instances for the workload from the provisioning model 436 (e.g., a suggestion to move the workload from the hyperconverged infrastructure environment to a three-tier architecture environment). For example, the workload manager 432 may provide current resource allocation, I/O request volume, and/or workload execution speed at the first computing environment to the provisioning model 436. The provisioning model 436 may be a trained neural network that may take performance indicators of a workload in one architecture as input and output expected performance indicators of the workload in another environment. The manager 434 may accordingly identify another configuration for the workload based on the expected performance indicators for the workload in other environments. The workload manager 432 may receive suggested instance configurations for the workload at the first computing environment and/or additional computing environments, including computing environments with different underlying architecture than the first computing environment. In some implementations, the workload manager 432 may provide a target second computing environment to the provisioning model 436, such that the workload manager 432 receives suggested instances for the workload at the second computing environment as output from the provisioning model 436.
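As one hedged illustration of predicting behavior across environments, a model trained on plentiful data from a first architecture might be adapted to a second architecture using only a few target-environment measurements; the two-stage correction below is an assumed simplification of such a transfer step, not the disclosed model.

```python
# Hypothetical transfer sketch: a base model trained on hyperconverged
# observations is corrected with a few samples from a three-tier target.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Source environment: plentiful (performance indicators -> KPI) data.
X_src = rng.uniform(0, 1, size=(200, 3))   # e.g., CPU, memory, I/O rates
y_src = 100 * X_src[:, 0] + 50 * X_src[:, 1] + rng.normal(0, 1, 200)

base = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_src, y_src)

# Target environment: only a handful of measurements are available.
X_tgt = rng.uniform(0, 1, size=(10, 3))
y_tgt = 70 * X_tgt[:, 0] + 80 * X_tgt[:, 2] + rng.normal(0, 1, 10)

# Learn a lightweight correction from base-model predictions to the
# target-environment KPI, reusing what was learned at the source.
correction = LinearRegression().fit(base.predict(X_tgt).reshape(-1, 1), y_tgt)

predicted_kpi = correction.predict(base.predict(X_tgt[:2]).reshape(-1, 1))
print(predicted_kpi)
```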
The provisioning model 436 may be implemented using various machine learning, artificial intelligence, or other models, including various combinations of models. In various implementations, the provisioning model 436 may include a classifier using an algorithm, such as a k-nearest neighbors (kNN) algorithm or clustering, to classify a workload whose performance indicators are provided as input to the provisioning model 436. Classifiers of the provisioning model 436 may be trained using the workload data 440 and/or datasets about other workloads and their respective execution environments. Classifiers of the provisioning model 436 may, in some implementations, be used to identify similar workloads or a type of the workload (e.g., stateful or stateless), which may be used by the workload manager 432 to suggest a configuration for an instance of the workload at one or more computing environments.
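A minimal sketch of such a kNN classifier appears below, assuming labeled performance indicators from previously observed workloads; the indicator columns, the labels, and the scikit-learn usage are illustrative assumptions.

```python
# Hypothetical kNN classification of a workload from its performance
# indicators; training data and labels are illustrative only.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Rows: [cpu_utilization, iops, memory_utilization] for known workloads.
X_known = np.array([
    [0.9, 5000, 0.8],   # database-like, stateful
    [0.8, 4500, 0.7],   # database-like, stateful
    [0.3,  100, 0.2],   # web front end, stateless
    [0.2,  150, 0.3],   # web front end, stateless
])
labels = ["stateful", "stateful", "stateless", "stateless"]

classifier = KNeighborsClassifier(n_neighbors=3).fit(X_known, labels)

# Classify a newly observed workload from its indicators.
print(classifier.predict(np.array([[0.85, 4800, 0.75]])))  # -> stateful
```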
In various implementations, the provisioning model 436 may include a neural network trained using the workload data 440. A neural network of the provisioning model may receive performance indicators as input and provide a suggested instance of the workload as output (e.g., a suggested computing environment and/or configuration for the workload based on expected performance of the workload in that environment and/or configuration). In some instances, neural networks of the provisioning model 436 may generate a fingerprint for a workload based on the workload data 440, where the fingerprint forms a representation of the workload used for comparison to other workloads. Such fingerprints may be provided to the workload manager 432, stored with the workload data 440, or used as input by other models of the provisioning model 436. The fingerprint may be used by the neural network that generated the fingerprint to, for example, identify other workloads with similar fingerprints, where data about the similar workloads may be used to suggest an instance for the workload.
The provisioning model 436 may include one model providing suggested instances for various computing environments managed by the manager 434 and/or individual models for individual computing environments or architecture types for computing environments managed by the manager 434. For example, the provisioning model 436 may include a model specific to a computing environment, which may be utilized to provide a suggested instance in that particular computing environment. The provisioning model 436 may include a model including multiple computing environments to identify a suitable computing environment for a workload and to provide a suggested configuration for an instance of the workload at the identified computing environment. In some implementations, the workload manager 432 may use multiple models of the provisioning model 436, each configured to provide suggested configurations for one computing environment, to provide multiple suggested instances at multiple computing environments.
The workload manager 432 may provide application information (e.g., performance indicators for each workload or instance of an application, demand patterns of the application, and/or additional information) to the bursting model 438 to obtain suggested configurations for spot instances of workloads of the application. An application may include multiple workloads, and each workload may be performed by one or more instances of the workload (e.g., individual virtual machines or containers executing at various computing environments). As demand for applications may vary, some workloads may benefit from spot instances, which may be additional instances of the workload deployed temporarily to compensate for additional demand. For example, a workload receiving user requests to be processed by other workloads of an application may use spot instances at high-traffic times of day. The workload manager 432 may utilize the bursting model 438 to identify workloads that may benefit from spot instances and to provide configurations for suggested spot instances.
In various implementations, the bursting model 438 may be trained using the computing environment data 442 (e.g., spot instance costs and available instance types) and the workload data 440 (e.g., performance indicators and demand data) to suggest and/or create temporary or spot instances of a workload, which may be referred to as “bursting” a workload. The bursting model 438 may use game theory, reinforcement learning, imitation learning, or other methods to suggest, bid for, and/or create spot instances. For example, the workload manager 432 may provide application demand or usage patterns, performance indicators for instances of the application, and configuration of workloads included in the application to the bursting model 438. The bursting model 438 may then identify a workload that may benefit from a spot instance based on the demand pattern of the application. For example, a workload whose instances are underprovisioned or close to underprovisioned at high demand times but overprovisioned at low demand times may benefit from one or more spot instances. The bursting model 438 may then recommend a spot instance for the workload to the workload manager 432. In some implementations, the bursting model 438 may select a recommended spot instance from pre-configured reserved instances available to the workload manager 432. The bursting model 438 may use similar processes as those described with respect to the provisioning model 436 to provide a suggested instance for a workload based on workload performance indicators. The bursting model 438 may, in various implementations, provide performance indicators to the provisioning model 436 along with additional information (e.g., a constraint to return only spot instances) and may receive suggested spot instances from the provisioning model 436 for communication to the workload manager 432.
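As a loose illustration of a game-theoretic bidding step, the sketch below selects a bid by maximizing expected utility against an assumed distribution of competing bids; the distribution, the value of capacity, and the utility function are all assumptions made for the example.

```python
# Hypothetical spot-instance bidding sketch: choose the bid that
# maximizes expected utility against an assumed rival-bid distribution.
import numpy as np

rng = np.random.default_rng(1)
rival_bids = rng.normal(loc=0.10, scale=0.02, size=10_000)  # $/hour, assumed

value_of_capacity = 0.25   # $/hour of value from serving extra demand
candidate_bids = np.linspace(0.05, 0.20, 31)

def expected_utility(bid: float) -> float:
    # Win the spot instance when our bid exceeds the market-clearing
    # rival bid; utility is the value of the capacity minus the price paid.
    win_probability = np.mean(rival_bids < bid)
    return win_probability * (value_of_capacity - bid)

best_bid = max(candidate_bids, key=expected_utility)
print(f"suggested bid: ${best_bid:.3f}/hour")
```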
Workload instances (e.g., workload instances 408 and 414) generally execute at a host machine through a host interface to access computing resources (e.g., memory and processing) of the host machine and/or computing resources available to the host machine, such as processors of other computing nodes in a cluster of computing nodes or processors of a cloud computing environment. For example, workload instances may be configured as virtual machines running on a hypervisor acting as a host or as containers running on a docker engine acting as a host. The workload instance may access compute resources to execute tasks of the workload instance, which may be processors or other compute elements accessible by the host machine. Some workloads may also access data storage (e.g., persistent data storage) to store data persisting across requests to the workload.
The cloud environment 402 is shown having a tiered architecture, where a workload instance 408 runs on a host 406 with access to compute resources 422. Data (e.g., pipeline data) is stored at data storage 410, which is accessible to the compute resources 422 via a network 412. The workload instance 408 may be configured as a virtual machine, a container, or other abstraction providing allocation of computing resources in the cloud environment 402. The cloud environment 404 is shown having a hyperconverged architecture, where a workload instance 414 runs on a host 416 with access to co-located compute resources 418 (e.g., processors executing tasks of the workload instance 414) and data storage 420. As in the cloud environment 402, the workload instance 414 may be configured as a virtual machine, a container, or other abstraction providing allocation of computing resources in the cloud environment 404. Where the workload instance 414 is a virtual machine, the host 416 is a hypervisor providing access to the compute resources 418 and the data storage 420. Where the workload instance 414 is a container, the host 416 is a docker engine providing access to the compute resources 418 and the data storage 420 of the cloud environment 404.
Some workloads may access data from data storage during execution (e.g., for data analysis) or use backing data to store a state of the application or data between requests to the workload or application. These workloads may be referred to as stateful workloads. A stateful workload may execute more efficiently (e.g., process a request more quickly or with fewer steps) in a hyperconverged architecture than in a tiered architecture because the host 406 of the tiered architecture does not provide direct access to data storage 410 by the workload instance 408 executing on the host 406, and a call over the network 412 is used to access data storage 410. The provisioning model 436 and the bursting model 438 may be configured (e.g., trained) to account for these differences when suggesting instances for workloads and may, in some implementations, classify workloads as stateful or stateless based on the performance indicators of the workload. The classification of a workload as stateful or stateless may be used to select a computing environment where one is not pre-selected for the workload.
At block 502, the workload manager 432 monitors performance indicators of a workload executing in a first computing environment. Performance indicators may include, for example, processor usage, number or volume of I/O requests, classification of the workload as stateful or stateless, runtime, memory usage, usage and demand patterns, and/or other performance measurements. In some implementations, the computing environments monitored by the manager 434 may automatically transmit performance indicators for workloads managed by the workload manager 432 to the workload manager 432 at specified time intervals (e.g., every hour, every five minutes, etc.). In some implementations, the workload manager 432 may transmit a request for performance indicators of a workload or multiple workloads as needed or at specified time intervals.
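A simplified polling loop consistent with block 502 might look like the following; the collect_indicators helper and the five-minute default interval are hypothetical stand-ins for whatever telemetry interface a given computing environment exposes.

```python
# Hypothetical monitoring loop; collect_indicators() stands in for an
# environment-specific telemetry call and is not a real API.
import time

def collect_indicators(workload: str) -> dict:
    # Placeholder: a real implementation would query the computing
    # environment hosting the workload.
    return {"cpu": 0.42, "iops": 1200, "memory": 0.55}

def monitor(workloads: list[str], interval_seconds: int = 300) -> None:
    while True:
        for workload in workloads:
            indicators = collect_indicators(workload)
            print(workload, indicators)  # e.g., forward to the workload manager
        time.sleep(interval_seconds)
```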
When the workload manager 432 receives the performance indicators, the workload manager 432 may update the workload data 440, the computing environment data 442, the provisioning model 436, and/or the bursting model 438 to include the performance indicators. In some implementations, the workload manager 432 may update the workload data 440 using the performance indicators. For example, the performance indicators may be added to the structures of the workload data 440 and the computing environment data 442, which may be used, in some situations, to re-train the provisioning model 436 and/or the bursting model 438 or to train new models for use by the workload manager 432. The performance indicators, or a subset of the performance indicators, may also be provided to the provisioning model 436 and/or the bursting model 438 in a feedback loop to improve performance of the provisioning model 436 and the bursting model 438. For example, where the provisioning model 436 generated the configuration for the instance of the workload in the first computing environment, the performance indicators may be fed back to the provisioning model 436 to improve the recommendations it produces.
The workload manager 432 may identify the workload as overprovisioned, underprovisioned, or constrained. Based on the identification, the workload manager 432 may provide the performance indicators of the workload to the provisioning model 436 and/or the bursting model 438 to identify additional or alternative instances of the workload to right-size the workload. The workload manager 432 may also provide the performance indicators to the provisioning model based on a user request relating to the workload, such as a request to move the workload to an alternate computing environment managed by the manager 434.
The provisioning model 436 and/or the bursting model 438 may be used by the manager 434 and/or workload manager 432 to generate a fingerprint of the workload based on the performance indicators. A fingerprint for a workload may be a general representation of the workload and may, in some instances, incorporate performance indicators for the workload in multiple time increments and across various computing environments. Fingerprints may dimensionally reduce performance indicators to facilitate comparison between different workloads, which may be executing in different computing environments. In some implementations, the provisioning model 436 may be used to provide workload fingerprints to the workload manager 432 for use by the workload manager 432 and/or the manager 434.
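One way to picture the dimensionality reduction is sketched below using principal component analysis (PCA); the summary statistics, the fingerprint size, and the synthetic data are assumptions chosen only for the illustration.

```python
# Hypothetical fingerprinting sketch: summarize each workload's indicator
# history, then reduce the summaries to short, comparable fingerprints.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)

# For each of 20 workloads: 24 hourly samples of 6 performance indicators.
histories = rng.uniform(0, 1, size=(20, 24, 6))

# Summarize each history (mean and standard deviation per indicator),
# giving one 12-dimensional summary vector per workload.
summaries = np.concatenate(
    [histories.mean(axis=1), histories.std(axis=1)], axis=1
)

# Reduce the summaries to 3-dimensional fingerprints for fast comparison.
fingerprints = PCA(n_components=3).fit_transform(summaries)

# Similar workloads sit close together in fingerprint space.
distance = np.linalg.norm(fingerprints[0] - fingerprints[1])
print(distance)
```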
At block 504, the provisioning model 436 may be used (e.g., by the manager 434 and/or workload manager 432) to identify one or more comparable workloads based on the performance indicators in the first computing environment. For example, the workload manager 432 may utilize the provisioning model 436 to classify the workload. In some examples, the workload manager 432 may use a neural network to identify similar workloads. Other methods, such as clustering or transfer learning, may also be used by the workload manager 432 and/or the manager 434 to identify similar workloads using the provisioning model 436. A similar workload may generally use a similar amount of resources, have a similar volume or number of I/O requests, have a similar usage pattern, and/or have other similar characteristics. In some implementations, the workload manager 432 may utilize the provisioning model 436 to attempt to identify similar workloads that have had instances in both the first computing environment and additional computing environments. In some instances, the provisioning model 436 may be used to identify similar workloads based on the fingerprints of the workloads, without reference to previously or currently executing instances of the workloads.
At block 506, the workload manager 432 and/or the manager 434 generate a suggested resource allocation for the workload in a second architecture based on characteristics of the one or more comparable workloads. In some examples, the workload manager 432 and/or the manager 434 may utilize the provisioning model 436 to generate the suggested resource allocation. Resource allocation may include, in some examples, memory allocation, number and types of processors available, and size of the instance. The second architecture may be defined as a specific computing environment. For example, an administrative user may specifically request an instance of the workload at a specific three-tiered cloud environment. Where the suggested instance is constrained, the workload manager 432 may identify, within the provisioning model 436, instances of comparable workloads executing at the specific three-tiered cloud environment, or another three-tiered cloud environment with similar characteristics, as a baseline for the suggested resource allocation for the workload. Additionally or instead, the workload manager 432 may use the computing environment data 442 or a subset of the computing environment data 442 and the provisioning model 436 to select among available pre-configured instances in the second computing environment or to acquire additional data about available compute resources at the second computing environment.
Where the suggested instance is not constrained by a specific computing environment, the workload manager 432 may utilize the provisioning model 436 to identify particular (e.g., the most efficient) instances of comparable workloads in generating a suggested resource allocation. An efficient instance may be defined by the user but may be, for example, an instance allocated enough computing resources to meet performance parameters of the workload while costing the least and/or using the least amount of computing resources. Other factors may be used by the workload manager 432, either explicitly or implicitly, in generating a suggested resource allocation. For example, the provisioning model 436 may incorporate, either implicitly or explicitly, information about instances of the workload already executing, load balancing, available reserved instances, location of data used by the workload, security parameters, etc.
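As a toy illustration of baselining against comparable workloads, the percentile rule below is an assumption; embodiments may instead derive the suggestion from the provisioning model 436 in other ways.

```python
# Hypothetical baselining sketch: suggest an allocation from the resource
# usage of comparable workloads already running on the target architecture.
import numpy as np

# Peak CPU usage (in vCPUs) observed for comparable workloads at the
# target three-tier environment; values are illustrative.
comparable_peak_cpus = np.array([3.1, 3.8, 4.2, 3.5, 4.0])

# Suggest enough capacity to cover the 90th percentile of comparable
# peaks, plus a small headroom factor, rounded up to whole vCPUs.
suggested_vcpus = int(np.ceil(np.percentile(comparable_peak_cpus, 90) * 1.1))
print(suggested_vcpus)  # -> 5
```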
The suggested resource allocation may include, for example, a suggested computing environment, a suggested pre-configured instance (e.g., a vendor defined “large” instance with a defined allocated amount of compute resources), number of instances, a suggested custom instance, duration of the instance (e.g., whether the instance is configured as a spot instance), instance type (e.g., VM or container), etc. In some implementations, the workload manager 432 may further automatically create an instance with the suggested resource allocation at the second computing environment. When creating an instance with the suggested resource allocation, the workload manager 432 may terminate the workload at the first computing environment (e.g., to right size an instance at the first computing environment) or may create the instance at the second computing environment in addition to the workload at the first computing environment.
At block 604, the workload manager 432 detects a subset of workloads of the plurality of workloads for reconfiguration based on the computing resource usage of the plurality of workloads. The subset of workloads may be, for example, inefficient workloads identified by comparing the computing resource usage of the plurality of workloads to computing resources allocated to the plurality of workloads. Inefficient workloads may be, for example, overprovisioned workloads allocated more processing and memory resources than used by the workload. The workload manager 432 may also detect underprovisioned workloads that could benefit from additional processing and memory resources, constrained workloads, or bully workloads. The workload manager 432 may identify inefficient workloads based on a comparison between resource usage of the workload and resources allocated to the workload. For example, a workload may be overprovisioned when the difference between the resources allocated to the workload and the resource usage of the workload remains above a threshold value for a period of time reflected by the performance indicators. The workload manager 432 may also identify workloads that could be executed at alternative computing environments to improve performance and/or save resources. In some implementations, the workload manager 432 may also identify workloads that are likely to become inefficient based on, for example, usage or demand patterns for the application of the workload. Accordingly, the workload manager 432 may search for and configure spot instances of the workload to be used at some future time to keep the workload from becoming inefficient. This may be referred to as bursting the workload.
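The overprovisioning rule described above might be expressed as in the following sketch; the threshold value, the observation window, and the data layout are assumptions for the illustration.

```python
# Hypothetical overprovisioning check: flag a workload when allocated
# resources exceed usage by a threshold for an entire observation window.
def is_overprovisioned(
    allocated_cpus: float,
    cpu_usage_samples: list[float],
    threshold_cpus: float = 2.0,
) -> bool:
    # True only if the allocation-to-usage gap stays above the threshold
    # for every sample in the window covered by the performance indicators.
    return all(
        allocated_cpus - used > threshold_cpus for used in cpu_usage_samples
    )

# Example: 8 vCPUs allocated, but usage never exceeds 3 vCPUs.
print(is_overprovisioned(8.0, [2.5, 3.0, 2.8, 2.2]))  # -> True
```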
At block 606, the workload manager 432 generates, for each of the subset of workloads, a suggested instance for the workload based on the computing resource usage. The workload manager 432 may generate the suggested instance for each of the inefficient workloads using some or all of the steps described above with respect to blocks 502 through 506.
The suggested instance generated at block 606 may be suggested to replace the currently executing inefficient workload or may be suggested in addition to the currently executing workload. For example, the workload manager 432 may identify an instance that is overprovisioned. In this case, the workload manager 432 may suggest reconfiguring the current instance or replacing the current instance with the suggested smaller instance. In other examples, such as when the workload manager 432 generates suggested spot instances for bursting the workload, the suggested instances may be in addition to an instance that could become underprovisioned or otherwise inefficient without the additional instances.
When generating a suggested spot instance, the workload manager 432 may utilize the bursting model 438, the provisioning model 436, the workload data 440, and/or the computing environment data 442. For example, the workload manager 432 may access the workload data 440 to determine the usage pattern for the workload and the computing environment data 442 to determine which computing environments have available spot instances as well as pricing for the spot instances. The workload manager 432 may further use the bursting model 438 to determine resource allocation for the spot instances, where the bursting model 438 may, in some instances, utilize the provisioning model 436.
At block 608, the workload manager 432 right-sizes each of the detected inefficient workloads by creating an instance of each workload of the subset of workloads using the suggested instance or instances generated at block 606. In some implementations, the workload manager 432 may continue to monitor workloads, and the steps described above may be repeated to keep workloads right-sized over time.
While certain components are shown in the figures and described throughout the specification, other additional, fewer, and/or alternative components may be included in the multi-cloud computing system 100 or other computing systems. Such additional, fewer, and/or alternative components are contemplated to be within the scope of this disclosure.
This application claims priority to Provisional Application No. 63/115,447, filed on Nov. 18, 2020. The aforementioned application is incorporated herein by reference, in its entirety, for any purpose.