Cloud computing refers to the access and/or delivery of computing services and resources, including servers, storage, databases, networking, software, analytics, and intelligence, over the Internet (“the cloud”). For instance, a database in a cloud computing environment can include clusters of servers for hosting virtual machines. A cloud computing platform may make such a database available for users and/or applications. The cloud computing platform can allocate virtual machines to clusters of a datacenter to perform various tasks on behalf of the users and/or applications. In order to allocate a new virtual machine to the clusters, an allocation system selects a server from a cluster and allocates the virtual machine to the selected server. Cloud providers allow their users to consume cloud computing resources through various infrastructure as a service (IaaS), software as a service (SaaS), and platform as a service (PaaS) offerings. With the increase in such cloud computing resources, the number of servers in clusters and clusters in the cloud computing inventory is scaled to support users of the cloud.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Embodiments are described herein for search space pruning for virtual machine allocation. In an aspect of the present disclosure, an allocation request for allocating a virtual machine to a plurality of clusters is received. A valid set of clusters is generated. The valid set of clusters includes clusters of the plurality of clusters that satisfy the allocation request. An attribute associated with the allocation request is identified. A trained search space classification model is utilized to determine a truncation parameter based at least on the identified attribute. The valid set of clusters is filtered based on the truncation parameter. A server is selected from the filtered valid set of clusters. The virtual machine is allocated to the selected server.
In a further aspect, the trained search space classification model is a rule-based model that sets a rule for inferring a truncation parameter based on the identified attribute.
In a further aspect, the trained search space classification model is a machine learning model that infers truncation parameters in near-real time based on the identified attribute.
In another aspect, a search space pruner trains the search space classification model. The search space pruner receives telemetry data and generates an analysis summary based on an analysis of the telemetry data. The search space pruner trains the classification model to determine truncation parameters based on the analysis summary. The search space pruner provides the trained search space classification model to the allocator.
In a further aspect, the search space pruner determines an expected allocation time using the trained search space classification model. The search space pruner determines the expected allocation time has a predetermined relationship with a threshold. The search space pruner retrains the trained search space classification model.
The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate embodiments and, together with the description, further serve to explain the principles of the embodiments and to enable a person skilled in the pertinent art to make and use the embodiments.
The subject matter of the present application will now be described with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Additionally, the left-most digit(s) of a reference number identifies the drawing in which the reference number first appears.
The following detailed description discloses numerous example embodiments. The scope of the present patent application is not limited to the disclosed embodiments, but also encompasses combinations of the disclosed embodiments, as well as modifications to the disclosed embodiments. It is noted that any section/subsection headings provided herein are not intended to be limiting. Embodiments are described throughout this document, and any type of embodiment may be included under any section/subsection. Furthermore, embodiments disclosed in any section/subsection may be combined with any other embodiments described in the same section/subsection and/or a different section/subsection in any manner.
Databases in cloud computing environments can include clusters of servers. Virtual machines are allocated to the clusters to perform tasks in workloads. In order to allocate a new virtual machine to a cluster, an allocation system (e.g., an “allocator”) selects a server and allocates the virtual machine to the selected server. Depending on the implementation, the allocator may determine the (e.g., best) server for hosting the virtual machine. With the increase in use of cloud databases, the number of servers in clusters and clusters in a database is scaled accordingly, and therefore the number of servers an allocator may consider when allocating a virtual machine also increases. The clusters and servers considered by the allocator are referred to as a “search space.” As the search space increases in size, the time and amount of compute resources used to allocate a virtual machine also increases.
In order to manage an increased search space, allocators may implement search space pruning techniques to selectively reduce the number of clusters and/or servers considered when allocating a virtual machine to a server. For example, an allocator in accordance with an implementation utilizes techniques for reducing the number of clusters to consider (also referred to as “early search space pruning”) and techniques for selecting a server from the reduced set of clusters (also referred to as “late search space pruning”). Early search space pruning enables an allocator to reduce many candidate servers at a time by filtering out clusters (which comprise candidate servers) from consideration.
Embodiments of the present disclosure are directed to allocating virtual machines to servers in cloud computing environments. In particular, techniques described herein implement early search space pruning for virtual machine allocation. For example, an allocator in accordance with an embodiment receives a request to allocate a virtual machine to clusters in a cloud computing environment. The allocator generates a valid set of clusters that includes clusters of the cloud computing environment that satisfy the allocation request (e.g., by pruning clusters ineligible for the allocation request (e.g., clusters that do not have resources available to fulfill the request, clusters without capacity to fulfill the request, a specialized cluster that is not allowed to host a general purpose virtual machine, and/or a cluster that is otherwise ineligible to satisfy the allocation request)). The allocator identifies an attribute associated with the request. The allocator uses a trained search space classification model to determine a truncation parameter based on the identified attribute. The allocator filters the valid clusters in the cloud computing environment based on the determined truncation parameter. In this context, the allocator truncates (i.e., reduces) the number of eligible clusters considered for allocation of the virtual machine. Thus, the time taken and/or the amount of compute resources used to select a server from servers in the eligible clusters to allocate the virtual machine to is reduced.
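The early search space pruning flow described above may be illustrated with the following sketch. This is an illustrative example only, not the claimed implementation: the `Cluster` fields, the eligibility checks, and the preference score used for ordering are assumptions introduced for this example.

```python
# Illustrative sketch of early search space pruning: build the valid
# set of clusters, then truncate it to a number of clusters given by
# a truncation parameter. Field names and checks are hypothetical.
from dataclasses import dataclass

@dataclass
class Cluster:
    name: str
    free_cores: int
    specialized: bool  # e.g., a specialized cluster ineligible for general VMs
    score: float       # preference score used to rank clusters

def prune_search_space(
    clusters: list[Cluster],
    requested_cores: int,
    truncation_parameter: int,
) -> list[Cluster]:
    # Step 1: generate the valid set -- drop clusters that cannot
    # satisfy the allocation request (no capacity, specialized-only).
    valid = [
        c for c in clusters
        if c.free_cores >= requested_cores and not c.specialized
    ]
    # Step 2: truncate the valid set to the top-N clusters, where N is
    # the truncation parameter inferred from attributes of the request.
    valid.sort(key=lambda c: c.score, reverse=True)
    return valid[:truncation_parameter]

clusters = [
    Cluster("c1", free_cores=64, specialized=False, score=0.9),
    Cluster("c2", free_cores=2, specialized=False, score=0.8),   # too small
    Cluster("c3", free_cores=32, specialized=True, score=0.7),   # specialized
    Cluster("c4", free_cores=16, specialized=False, score=0.6),
    Cluster("c5", free_cores=48, specialized=False, score=0.5),
]
pruned = prune_search_space(clusters, requested_cores=8, truncation_parameter=2)
print([c.name for c in pruned])  # ['c1', 'c4']
```

In this sketch, a later server-selection step need only examine servers in the two surviving clusters rather than all five, which is the scalability benefit described above.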
Furthermore, as discussed further herein, the search space classification model in accordance with one or more embodiments is trained to improve scalability (e.g., by increasing the number of requests an allocator can handle, by decreasing the time taken to allocate a virtual machine, by decreasing the amount of compute resources used to allocate a virtual machine, and/or by otherwise improving an allocator in a manner that enables the allocator to allocate virtual machines for an increased number of users) and/or allocation quality (e.g., how well a virtual machine is allocated within a server (also referred to as “virtual machine packing” quality)). Therefore, by using the search space classification model to determine a truncation parameter, some embodiments of allocators described herein are able to allocate virtual machines for many users and/or applications while improving allocation quality. Moreover, improving scalability, allocation quality, or both scalability and allocation quality enables embodiments described herein to reduce allocation failures, to improve customer experience, to increase useable (e.g., sellable) capacity in a cloud computing platform, and/or to otherwise improve the utilization of clusters in a cloud computing environment to host virtual machines.
Virtual machines may be allocated to clusters in a cloud computing environment by implementing search space pruning techniques in various ways, in embodiments. For instance,
Server infrastructure 104 may be a network-accessible server set (e.g., a cloud-based environment or platform). As shown in
In accordance with an embodiment, one or more of clusters 112A-112n are co-located (e.g., housed in one or more nearby buildings with associated components such as backup power supplies, redundant data communications, environmental controls, etc.) to form a datacenter. In accordance with another embodiment, clusters of clusters 112A-112n are located in multiple datacenters in a distributed collection of datacenters. However, clusters 112A-112n may be arranged in other manners, as would be understood by a person of ordinary skill in the relevant art(s) having benefit of this disclosure. For example, clusters 112A-112n in accordance with an embodiment are an “inventory” of a cloud database. In this context, the inventory is arranged in a hierarchy of regions, availability zones, and datacenters. Each region comprises one or more availability zones, each availability zone comprises one or more datacenters, and each datacenter comprises one or more clusters of clusters 112A-112n.
Each of server(s) 114A-114n and 116A-116n may comprise one or more server computers and/or server systems. Each of server(s) 114A-114n and 116A-116n may be configured to execute one or more software applications (or “applications”) and/or services and/or manage hardware resources (e.g., processors, memory, etc.), which may be utilized by users (e.g., customers) of the network-accessible server set. For example, server(s) 114A-114n and/or 116A-116n in accordance with an embodiment are configured to host virtual machines. In accordance with another embodiment, server(s) of servers 114A-114n and/or 116A-116n are configured for specific uses. For example, any of servers 114A-114n and/or 116A-116n may be configured to execute services of allocator 106, search space pruning system 108, and/or tracking system 110 (or one or more components and/or subservices thereof). It is noted that allocator 106, search space pruning system 108, and/or tracking system 110 may be incorporated as service(s) on a computing device external to clusters 112A-112n and/or server infrastructure 104.
In accordance with an embodiment wherein clusters 112A-112n are an inventory of a cloud database, servers within an availability zone are heterogeneous (e.g., servers across multiple hardware generations and stock keeping unit (SKU) configurations, including special servers for high performance computing (HPC) applications, graphics processing unit (GPU) applications, etc.). Servers in a particular availability zone may include servers from multiple generations (e.g., servers being decommissioned, servers in regular operation, servers in early-stage deployment, etc.). Continuing the example embodiment, suppose a first availability zone includes clusters 112A and 112n. In this example, servers in respective clusters 112A and 112n are homogeneous. In other words, each server of servers 114A-114n has the same respective SKU and configuration (i.e., a first SKU and a first configuration) and each server of servers 116A-116n has the same respective SKU and configuration (i.e., a second SKU and a second configuration).
As shown in
As shown in
Computing device 102 may be any type of stationary or mobile processing device, including, but not limited to, a desktop computer, a server, a mobile or handheld device (e.g., a tablet, a personal data assistant (PDA), a smart phone, a laptop, etc.), an Internet-of-Things (IoT) device, etc. In accordance with an embodiment, computing device 102 is associated with a user (e.g., an individual user, a group of users, an organization, a family user, a customer user, an employee user, an admin user (e.g., a service team user, a developer user, a management user, etc.), etc.). Computing device 102 may access server(s) of servers 114A-114n and/or 116A-116n over network 128. Computing device 102 stores data and executes computer programs, applications, and/or services.
For example, as shown in
As noted above and in accordance with an embodiment, users are enabled to issue requests to allocate virtual machines to servers of servers 114A-114n or 116A-116n. For example, a user may interact with a user interface of computing device 102 (not shown in
Cluster selector 118 receives the allocation request and selects a subset of (or all of) clusters 112A-112n. In embodiments, the subset of clusters are valid clusters for satisfying the allocation request. As discussed further with respect to
As noted above, in embodiments, cluster selector 118 uses a search space classification model trained by search space pruner 122 to generate a subset of clusters. Search space pruner 122 trains the search space classification model based on telemetry data generated by tracking system 110. Tracking system 110 generates the telemetry data in various ways (e.g., by tracking allocations made by allocator 106, by tracking allocation requests received from applications (e.g., application 126), by monitoring servers and/or clusters of server infrastructure 104, and/or as otherwise described elsewhere herein). In accordance with an embodiment, search space pruner 122 trains the search space classification model to co-optimize scalability and virtual machine allocation quality of allocator 106. In accordance with an embodiment, search space pruner 122 automatically trains the model in the background (e.g., alongside regular operation of allocator 106). Additional details regarding training search space classification models are discussed with respect to
Deployer 124 deploys the search space classification model trained by search space pruner 122 to allocator 106. By deploying the search space classification model in this manner, search space pruner 122 is able to (e.g., continuously) update and/or otherwise modify the search space classification model (e.g., through retraining) simultaneous to allocator 106 using the deployed version of the search space classification model to filter clusters. As discussed further with respect to
Allocator 106 of system 100 may be configured to allocate a virtual machine to a server in various ways, in embodiments. For instance,
For illustrative purposes, allocator 106 of
Flowchart 300 begins with step 302. In step 302, an allocation request for allocating a virtual machine to a plurality of clusters is received. For instance, cluster validator 202 receives an (e.g., virtual machine) allocation request 218 for allocating a virtual machine to clusters 112A-112n. In accordance with an embodiment, allocation request 218 is received as a service request to allocate virtual machines for performing a workload. In accordance with a further embodiment, the service request comprises multiple allocation requests (including allocation request 218) for allocating respective virtual machines. In this context, cluster validator 202 (and other subservices of allocator 106) may sequentially process each allocation request included in the received service request, as described elsewhere herein. In accordance with an embodiment, allocation request 218 is a simulated allocation request (e.g., a test allocation request, a troubleshooting allocation request, and/or the like).
In step 304, a valid set of clusters is generated. The valid set of clusters includes clusters of the plurality of clusters that satisfy the allocation request. For instance, cluster validator 202 of
In step 306, an attribute associated with the allocation request is identified. For instance, attribute identifier 210 of
In step 308, a trained search space classification model is utilized to determine a truncation parameter based at least on the identified attribute. For example, search space classifier 212 of
In accordance with an alternative embodiment, search space classifier 212 uses trained model 216 to determine the truncation parameter from a predetermined set of values. For instance, as a non-limiting example, search space classifier 212 uses trained model 216 to determine a first truncation parameter with a low value if the determined search space falls in a first range of number of clusters (e.g., from one cluster to a first predetermined number of clusters), a second truncation parameter with a medium value if the determined search space falls in a second range (e.g., from a second predetermined number of clusters larger than the first predetermined number of clusters to a third predetermined number of clusters), and a third truncation parameter with a high value if the determined search space falls in a third range (e.g., from a fourth predetermined number of clusters larger than the third predetermined number of clusters to a maximum number of clusters (e.g., all valid clusters)).
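The predetermined-set scheme described above may be sketched as a simple piecewise mapping from the size of the valid search space to a truncation parameter. The specific cutoffs and parameter values below are invented for illustration and are not the values of trained model 216.

```python
# Illustrative sketch: select a truncation parameter from a
# predetermined set of values ("low", "medium", "high") according to
# which range the size of the valid search space falls into. The
# range boundaries and parameter values are assumptions.
def truncation_parameter(search_space_size: int,
                         low: int = 8, medium: int = 32) -> int:
    if search_space_size <= 50:        # first range: small search spaces
        return low
    if search_space_size <= 500:       # second range: mid-sized spaces
        return medium
    # third range: a high value, up to all valid clusters
    return search_space_size

print(truncation_parameter(10))    # 8
print(truncation_parameter(100))   # 32
print(truncation_parameter(1000))  # 1000
```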
As noted herein, trained model 216 is a trained search space classification model usable by search space classifier 212 to determine a truncation parameter for an allocation request based on attributes identified by attribute identifier 210. By utilizing a trained search space classification model to determine a truncation parameter for an allocation request, embodiments of allocators (and components thereof, e.g., search space classifiers) leverage a model trained on training data to dynamically and automatically adjust truncation parameters to improve search space pruning and virtual machine allocation (e.g., by co-optimizing scalability and allocation quality).
In accordance with an embodiment, trained model 216 is a rule-based model that sets a rule for inferring truncation parameters based on identified attributes. In this context, search space classifier 212 uses rules set by trained model 216 to infer the truncation parameter based on attribute(s) identified by attribute identifier 210 (e.g., included in attribute signal 222). Moreover, by using a rule-based model, trained model 216 may be trained “offline” (e.g., separate from the current operation of allocator 106).
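One way to read the rule-based variant is as an ordered list of (condition, parameter) rules evaluated against the identified attributes, as sketched below. The attribute names, rule ordering, and parameter values are assumptions for illustration (loosely mirroring the non-limiting examples discussed later, e.g., constrained requests, small common VMs, and legacy VMs); they are not the rules of trained model 216.

```python
# Illustrative sketch of a rule-based search space classification
# model: each rule pairs a predicate over identified attributes with
# a truncation parameter to infer. A parameter of None means "do not
# truncate the valid set at all". All names/values are hypothetical.
RULES = [
    # Allocation constraint present: prune less (higher parameter).
    (lambda a: a.get("has_allocation_constraint"), 100),
    # Small, commonly used VM type: prune aggressively (low parameter).
    (lambda a: a.get("vm_size") == "small" and a.get("common_type"), 10),
    # Legacy VM: keep every valid cluster under consideration.
    (lambda a: a.get("legacy_vm"), None),
]
DEFAULT_PARAMETER = 25

def infer_truncation_parameter(attributes: dict):
    # First matching rule wins; fall back to a default otherwise.
    for condition, parameter in RULES:
        if condition(attributes):
            return parameter
    return DEFAULT_PARAMETER
```

Because the rules are a static table, a model of this form can be produced entirely offline and shipped to the allocator as data, consistent with the “offline” training noted above.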
In accordance with another embodiment, trained model 216 is an ML model that infers truncation parameters in near-real time based on identified attributes. In this context, search space classifier 212 provides attribute(s) included in attribute signal 222 to trained model 216 (e.g., as ML features). Trained model 216 receives the attribute(s) as input, determines the truncation parameter based on the inputted attribute(s), and provides the truncation parameter as a result to search space classifier 212.
Additional details regarding determining truncation parameters using trained model 216 are discussed further with respect to
In step 310, the valid set of clusters is filtered based on the truncation parameter. For example, truncator 214 of
As described elsewhere herein, trained model 216 in accordance with an embodiment is trained to determine truncation parameters in a manner that improves scalability and/or improves allocation quality. Thus, by using the truncation parameter determined by search space classifier 212 using trained model 216, truncator 214 filters the valid set of clusters in a manner that provides a reduced number of eligible clusters for satisfying allocation request 218 (i.e., filtered valid set of clusters 228) that co-optimizes scalability and allocation quality. As a result, the amount of compute resources used and/or the amount of time taken to select a server (e.g., as discussed further with respect to step 312) from filtered valid set of clusters 228 is reduced (i.e., as compared to the amount of compute resources that would have been used and/or the amount of time that would have been taken to select a server from all of valid set of clusters 226).
In step 312, a server is selected from the filtered valid set of clusters. For instance, server selector 120 of
As noted above, server validator 206 and server filter 208 perform a late search space pruning process to generate a filtered valid set of servers from filtered valid set of clusters 228. The filtered valid set of servers may be generated in various ways, in embodiments. For instance, server validator 206 in accordance with an embodiment filters servers from filtered valid set of clusters 228 that are ineligible for fulfilling allocation request 218 (e.g., a server whose compute resources are exhausted by virtual machines already hosted by the server). As a non-limiting example, suppose cluster 112A is included in filtered valid set of clusters 228 and server 114A already has virtual machines installed thereon. Further suppose server 114A does not have enough storage space (or other compute resources) to host the virtual machine requested in allocation request 218. In this context, server validator 206 filters server 114A from servers in filtered valid set of clusters 228 to generate valid set of servers 230 (i.e., wherein valid set of servers 230 does not include server 114A).
As stated above, server filter 208 further filters valid set of servers 230 to generate a filtered valid set of servers. Server filter 208 may filter valid set of servers 230 in various ways, in embodiments. For instance, server filter 208 in accordance with an embodiment is configured to filter valid set of servers 230 according to preferential server selection objectives. Examples of preferential server selection objectives include, but are not limited to, an objective to tightly allocate a virtual machine, an objective to consume a server other than a healthy empty server (HES), an objective to prioritize preserving compute capacity in a particular generation (e.g., the latest generation) of hardware, and/or any other objective that server filter 208 may use to determine which servers to include in the filtered valid set of servers and/or which servers to remove from (e.g., filter from) valid set of servers 230.
Continuing the non-limiting example described with respect to server validator 206, suppose server 114n, server 116A, and server 116n are included in valid set of servers 230. Further suppose server 116A is an HES and server 114n is a latest generation server. In this non-limiting example, server filter 208 filters servers from valid set of servers 230 according to a first objective to prioritize preserving compute capacity in the latest generation of servers and avoid consuming an HES. Accordingly, server filter 208 filters server 114n and server 116A from valid set of servers 230, and server 116n is included in the filtered valid set of servers (as well as any other servers that server filter 208 does not filter from valid set of servers 230).
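The late-pruning pass described above may be sketched as two successive filters: a validation filter that drops servers that cannot host the requested virtual machine, and a preferential filter that drops servers conflicting with selection objectives such as avoiding an HES and preserving latest-generation capacity. The `Server` fields and the fallback behavior below are assumptions for illustration, not the claimed implementation.

```python
# Illustrative sketch of late search space pruning: validate servers
# for capacity, then apply preferential selection objectives. All
# field names and the fallback policy are hypothetical.
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    free_cores: int
    is_healthy_empty: bool   # HES: a healthy server hosting no VMs
    latest_generation: bool

def late_prune(servers: list[Server], requested_cores: int) -> list[Server]:
    # Validation: a server with exhausted resources is ineligible.
    valid = [s for s in servers if s.free_cores >= requested_cores]
    # Preferential filtering: avoid consuming an HES and preserve
    # capacity on latest-generation hardware, when alternatives exist.
    preferred = [s for s in valid
                 if not s.is_healthy_empty and not s.latest_generation]
    return preferred or valid  # fall back rather than fail the request

servers = [
    Server("114A", free_cores=1, is_healthy_empty=False, latest_generation=False),
    Server("114n", free_cores=16, is_healthy_empty=False, latest_generation=True),
    Server("116A", free_cores=32, is_healthy_empty=True, latest_generation=False),
    Server("116n", free_cores=8, is_healthy_empty=False, latest_generation=False),
]
print([s.name for s in late_prune(servers, requested_cores=4)])  # ['116n']
```

Note that the outcome matches the worked example above: server 114A fails validation, servers 114n and 116A are filtered by the objectives, and server 116n survives.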
In step 314, the virtual machine is allocated to the selected server. For instance, server filter 208 of
As described elsewhere herein and in accordance with some embodiments, allocation request 218 is received in a service request comprising multiple allocation requests. In this context, commitment request 232 comprises instructions that cause server infrastructure 104 to fulfill each of the allocation requests included in the service request by allocating the respective virtual machines to the respective selected servers. Alternatively, a separate commitment request is transmitted for each allocation request included in the service request.
As described herein, search space classifier 212 uses trained model 216 to determine truncation parameters based on identified attributes. In a first non-limiting example, suppose the identified attribute is an allocation constraint that specifies the requested virtual machine is a B-series virtual machine that cannot be located on a server that hosts a non-B-series virtual machine. In this context, search space classifier 212 determines, by using trained model 216, a truncation parameter with a higher value than if there was not an allocation constraint. Thus, truncator 214 will filter fewer clusters from valid set of clusters 220 to generate filtered valid set of clusters 228. In this example, the likelihood that a server in filtered valid set of clusters 228 is a valid (and preferred) server for allocating the requested B-series virtual machine to is higher than if truncator 214 had aggressively filtered clusters from valid set of clusters 220.
In a second non-limiting example, suppose allocation request 218 is a request to allocate a virtual machine small in size. Further suppose attribute identifier 210 identifies a first attribute indicating the virtual machine is small in size and a second attribute indicating the virtual machine is a commonly used type of virtual machine. In this context, search space classifier 212 determines, by using trained model 216, a truncation parameter with a (e.g., relatively) low value. Thus, truncator 214 will filter more clusters from valid set of clusters 220 to generate filtered valid set of clusters 228. Thus, the time to allocate the requested virtual machine to a server is reduced (i.e., in comparison to not using a trained model to determine a truncation parameter).
In a third non-limiting example, suppose attribute identifier 210 identifies an attribute of valid set of clusters 220 that indicates the available servers are from an older server generation (e.g., a less performant server generation). In this context, search space classifier 212 determines, by using trained model 216, a truncation parameter with a higher value than if the available servers were from a newer (e.g., higher performance) server generation.
In a fourth non-limiting example, suppose attribute identifier 210 identifies an attribute of allocation request 218 that indicates the requested virtual machine is a legacy virtual machine. In this context, search space classifier 212 determines, by using trained model 216, a truncation parameter with a value such that truncator 214 does not filter clusters from valid set of clusters 220. In this context, the candidate server pool is kept at its maximum size (i.e., all valid servers in the valid clusters), thereby minimizing the probability of legacy virtual machines consuming a healthy empty server and spreading to multiple servers. Thus, the probability of regressing servers in the inventory to undesirable configurations is reduced.
In a fifth non-limiting example, suppose attribute identifier 210 identifies an attribute of allocation request 218 that indicates the requested virtual machine is an actual virtual machine that is to be converted from a PPS virtual machine. In this context, the number of servers that may host the virtual machine may be relatively small (e.g., servers that have space for the virtual machine and include properties in a moniker of the PPS virtual machine). In this example, search space classifier 212 determines, by using trained model 216, a truncation parameter with a relatively high value. Thus, the likelihood of locating a suitable (e.g., a valid and preferred) server in filtered valid set of clusters 228 is increased.
As described herein, allocator 106 utilizes a trained search space classification model to determine truncation parameters for filtering a valid set of clusters (also referred to as “early search space pruning” herein). Depending on the implementation, a search space classification model may be trained to determine a truncation parameter in a manner that prioritizes one or more metrics, co-optimizes two or more metrics, and/or otherwise improves virtual machine allocation. In accordance with one or more embodiments, the search space classification model is trained using a search space pruning system that leverages analysis of various metrics. Examples of such metrics include, but are not limited to, performance indicators (also referred to as key performance indicators (KPIs)) (e.g., a service allocation time KPI indicating a performance objective for the time to allocate a virtual machine, an allocation quality KPI indicating how well a virtual machine fits the server it was allocated to (e.g., how optimized a server is for hosting a particular virtual machine), a spot virtual machine eviction rate KPI indicating how often spot virtual machines are evicted from an inventory, a mission critical customer KPI indicating how often virtual machines are allocated on healthier servers for mission critical customers (e.g., key customers or prioritized customers), and/or any other metric suitable for indicating the performance of allocator 106 in allocating virtual machines), metrics associated with (e.g., previously submitted or currently submitted) allocation requests (e.g., region the allocation request originated from, the computing device or application that issued the allocation request, a service request associated with the allocation request, a workload associated with the allocation request, the type of the requested virtual machine, the size of the requested virtual machine, etc.), metrics associated with an inventory (e.g., clusters 112A-112n) (e.g., cluster 
shape, fragmentation, server attributes (e.g., server health, server hardware, operating configurations, etc.), regions and/or zones of the inventory, etc.), and/or any other metrics associated with system 100 that may be utilized by search space pruning system 108 to train a search space classification model.
Search space pruning system 108 of system 100, as described with respect to
For illustrative purposes, search space pruning system 108 of
Flowchart 500 begins with step 502. In step 502, telemetry data is received. For example, attribute analyzer 402 of
In step 504, an analysis summary is generated based on an analysis of the telemetry data. For example, attribute analyzer 402 of
In accordance with an embodiment, analysis summary 408 is summarized at a global level (i.e., comprising the entire inventory (e.g., clusters 112A-112n)), at a regional level, at an availability zone level, at a cluster level, and/or at another level of granularity for summarizing an analysis of allocation requests. In this context, analysis summary 408 may comprise multiple summaries at different respective granularities. For instance, suppose the inventory of clusters 112A-112n is separated into two regions (“Region A” and “Region B”), and Region A is separated into three availability zones (“AZ1”, “AZ2”, and “AZ3”). Depending on the implementation, analysis summary 408 comprises a summary for the entire inventory, a summary for Region A, a summary for Region B, a summary for AZ1, a summary for AZ2, and/or a summary for AZ3. By analyzing telemetry data at different levels of granularity, respective patterns and characteristics for those levels of granularity can be identified and used for training trained model 216 with respect to that particular granularity.
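Summarizing telemetry at multiple granularities, as described above, can be sketched as grouping telemetry records by region and availability zone and computing a statistic per group. The telemetry fields and the chosen statistic (mean allocation time) below are assumptions for illustration.

```python
# Illustrative sketch of a multi-granularity analysis summary:
# aggregate per-allocation telemetry globally, per region, and per
# availability zone. Field names and the statistic are hypothetical.
from collections import defaultdict
from statistics import mean

def summarize(telemetry: list[dict]) -> dict:
    by_region, by_zone = defaultdict(list), defaultdict(list)
    for record in telemetry:
        by_region[record["region"]].append(record["alloc_ms"])
        by_zone[(record["region"], record["zone"])].append(record["alloc_ms"])
    return {
        "global": mean(r["alloc_ms"] for r in telemetry),
        "regions": {k: mean(v) for k, v in by_region.items()},
        "zones": {k: mean(v) for k, v in by_zone.items()},
    }

telemetry = [
    {"region": "A", "zone": "AZ1", "alloc_ms": 120},
    {"region": "A", "zone": "AZ2", "alloc_ms": 80},
    {"region": "B", "zone": "BZ1", "alloc_ms": 200},
]
summary = summarize(telemetry)
print(summary["regions"]["A"])  # 100
```

A model trainer could then fit or tune truncation rules per region or per zone using the summary at the matching granularity.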
In step 506, the search space classification model is trained to determine truncation parameters based on the analysis summary. For example, classification model trainer 404 of
Classification model trainer 404 may train search space classification models to determine truncation parameters in various ways, as described elsewhere herein. For instance, as a non-limiting example, suppose analysis summary 408 comprises an analysis of the healthy empty server consumption rate for previous allocation requests. Further suppose classification model trainer 404 is configured to train model 216 to reduce the probability of a virtual machine being allocated to a healthy empty server. In this context, classification model trainer 404 determines the number of available servers required to reduce the probability of consuming a healthy empty server to below a certain threshold (e.g., a near-zero threshold) and trains the search space classification model to determine truncation parameters that satisfy this requirement.
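As a rough illustration of the analysis above, the following sketch assumes the allocator picks uniformly at random among candidate servers, so the probability of consuming a healthy empty server is the count of such servers divided by the candidate-pool size; this simplifying model and the constants are assumptions, not the trained model itself.

```python
import math

def min_candidate_servers(healthy_empty_count: int, threshold: float) -> int:
    """Smallest candidate-pool size n such that healthy_empty_count / n,
    the probability of consuming a healthy empty server under uniform
    random selection (a simplifying assumption), falls below threshold."""
    return math.floor(healthy_empty_count / threshold) + 1

# e.g., 2 healthy empty servers in the pool, probability kept below 5%:
# 2 / 41 is about 0.049 < 0.05, while 2 / 40 = 0.05 is not below it.
pool_size = min_candidate_servers(2, 0.05)
```

A truncation parameter satisfying this requirement would then be any value that leaves at least `pool_size` available servers after pruning.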
In another non-limiting example, suppose analysis summary 408 comprises a server compression factor (SCF) for previous allocations made by allocator 106. Further suppose analysis summary 408 includes information regarding the size of the virtual machines allocated by the previous allocations. In this example, the smaller the size of a virtual machine, the higher the corresponding SCF. Therefore, classification model trainer 404 trains the search space classification model to determine truncation parameters such that the size of the virtual machine is inversely proportional to how aggressively a valid set of clusters is truncated (e.g., determine an aggressive truncation parameter for a small virtual machine and a less aggressive truncation parameter for a large virtual machine). Further details regarding determining truncation parameters based on SCFs are described with respect to
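The inverse relationship just described might be expressed as a simple rule; the function, the linear form, and the bound constants are illustrative assumptions, not the trained model.

```python
def truncation_parameter(vm_size_cores: int,
                         min_keep: int = 4,
                         max_keep: int = 64) -> int:
    """Keep fewer clusters (more aggressive truncation) for small virtual
    machines, which historically exhibit high SCFs, and more clusters for
    large ones. The linear form and bounds are illustrative assumptions."""
    return max(min_keep, min(max_keep, min_keep * vm_size_cores))

# A 1-core VM searches far fewer clusters than a 16-core VM.
small, large = truncation_parameter(1), truncation_parameter(16)
```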
In step 508, the trained search space classification model is provided to the allocator. For example, classification model trainer 404 of
Search space pruner 122 of
Flowchart 600 begins with step 602. In step 602, an expected allocation time is determined using trained model 216. For example, classification model trainer 404 of
In step 604, a determination that the expected allocation time has a predetermined relationship with a threshold is made. For example, classification model trainer 404 of
In step 606, the trained search space classification model is retrained. For example, classification model trainer 404 of
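Steps 602-606 can be sketched as a simple monitoring loop; the toy model, the threshold value, and the request field names are assumptions made for illustration.

```python
ALLOC_TIME_THRESHOLD_MS = 250  # assumed performance objective

class ToyModel:
    """Stand-in for trained model 216; predicts allocation time trivially."""
    def __init__(self, base_ms):
        self.base_ms = base_ms
    def predict_alloc_time_ms(self, request):
        return self.base_ms + request["vm_cores"]

def maybe_retrain(model, validation_requests, retrain_fn):
    """Steps 602-606: determine the expected allocation time, compare it
    against a threshold (here, the relationship is 'exceeds'), and
    retrain the model when it does."""
    expected = sum(model.predict_alloc_time_ms(r)
                   for r in validation_requests) / len(validation_requests)
    return retrain_fn(model) if expected > ALLOC_TIME_THRESHOLD_MS else model
```

The "predetermined relationship" need not be "exceeds"; a deployment could equally retrain when the expected time drifts outside a band around the objective.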
Search space pruner 122 may provide the trained search space classification model to allocator 106 in various ways, in embodiments. For instance, as shown in
Flowchart 700 begins with step 702. In step 702, the trained search space classification model is received from the search space pruner. For example, deployer 124 of
In step 704, display of data representative of the trained search space classification model in a user interface of a computing device is caused. For example, deployer 124 of
In this context, application 126 enables a user (e.g., a customer determining truncation parameters for their subscription to an inventory, a developer of allocator 106 determining truncation parameters for all or a subset of allocator 106, and/or the like) to view the data representative of trained model 216 and approve, disapprove, set, or select rules for inferring truncation parameters. In accordance with an embodiment, application 126 presents a set of rules based on trained model 216. In accordance with another embodiment, application 126 enables the user to specify or modify rules.
In step 706, a rule for inferring truncation parameters based on the identified attribute is received from the computing device. For example, deployer 124 of
In step 708, the rule is transmitted to the allocator. For example, deployer 124 of
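One plausible shape for the rules exchanged in steps 702-708 is a list of attribute predicates, each mapped to a truncation parameter; the rule format and field names here are assumptions, not a format defined by this disclosure.

```python
# Hypothetical rule set a user of application 126 might approve:
# each rule maps an attribute predicate to a truncation parameter.
rules = [
    {"attribute": "vm_size", "op": "<=", "value": 4, "truncation_parameter": 8},
    {"attribute": "vm_size", "op": ">",  "value": 4, "truncation_parameter": 32},
]

def infer_truncation(rules, request):
    """Return the truncation parameter of the first matching rule."""
    ops = {"<=": lambda a, b: a <= b, ">": lambda a, b: a > b}
    for rule in rules:
        if ops[rule["op"]](request[rule["attribute"]], rule["value"]):
            return rule["truncation_parameter"]
    return None  # caller falls back to a default
```

A rule set of this kind is easy to display, approve, or modify in a user interface, which is the role application 126 plays above.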
As described herein, trained model 216 of
As noted above, search space classifier 212 of
Flowchart 800A begins with step 802. In step 802, a server compression factor (SCF) is determined based on the identified attribute. For example, search space classifier 212 determines the SCF based on the attribute identified by attribute identifier 210. In accordance with an embodiment, the identified attribute is the SCF. In accordance with another embodiment, the identified attribute is a type of virtual machine, a size of a virtual machine, and/or another attribute of allocation request 218, as described elsewhere herein. In this latter context, search space classifier 212 determines the SCF based on the identified attribute (e.g., according to previous allocation requests fulfilled by allocator 106 with attributes similar to the identified attribute of allocation request 218). In accordance with an embodiment, search space classifier 212 uses trained model 216 to determine the SCF. In accordance with an embodiment, the SCF is determined as a sub-step of determining the truncation parameter.
In accordance with an embodiment, the SCF is determined according to the following equation:

SCF = SCPostValidation / SCPreValidation
wherein SCPostValidation is the number of servers after server validator 206 filters servers from servers of a filtered valid set of clusters 228 provided thereto and SCPreValidation is the number of servers in the provided filtered valid set of clusters. As described herein, trained model 216 is trained based on previously fulfilled allocation requests. In this context, the determined SCF is an expected SCF based on the data trained model 216 was trained on.
In step 804, the truncation parameter is determined based on the determined server compression factor. For example, search space classifier 212 utilizes trained model 216 to determine the truncation parameter based on the SCF determined in step 802. In embodiments, a small SCF corresponds to server validation rules that are highly discriminating, thus leaving few servers eligible for fulfilling a given allocation request. This could be due to various factors (e.g., an allocation constraint for the requested virtual machine) described elsewhere herein. A large SCF corresponds to server validation rules that are not highly discriminating, thus leaving many eligible servers for fulfilling a given allocation request. In other words, the SCF may be considered a measure of how aggressive server validation rules are in pruning candidate servers for a particular allocation request. By determining a truncation parameter based on the expected SCF for a particular allocation request, trained model 216 (or search space classifier 212 using trained model 216) determines how aggressive late search space pruning is typically expected to be for the allocation request. For instance, if the SCF is high (i.e., many valid servers remain after filtering by server validator 206), the determined truncation parameter causes truncator 214 to aggressively filter clusters from valid set of clusters 220, thereby improving scalability with relatively low (or negligible) impact on allocation quality.
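Combining the SCF definition with the behavior just described, a minimal sketch might look like the following; the specific mapping from SCF to a keep-fraction of the cluster set is an assumption.

```python
def server_compression_factor(post_validation: int, pre_validation: int) -> float:
    """SCF: servers surviving validation divided by servers entering it."""
    return post_validation / pre_validation

def truncation_from_scf(scf: float, total_clusters: int) -> int:
    """A high SCF (validation prunes little) permits aggressive early
    truncation; a low SCF requires a wide search space. The keep-fraction
    curve below is an illustrative assumption."""
    keep_fraction = max(0.05, 1.0 - scf)
    return max(1, round(keep_fraction * total_clusters))
```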
As noted above, search space classifier 212 of
Flowchart 800B begins with step 812. Depending on the implementation, step 812 may be a further embodiment of step 306 or step 308 of flowchart 300. In step 812, a level of priority of the virtual machine is determined to have a predetermined relationship with a threshold based on the identified attribute. For example, attribute identifier 210 identifies an attribute corresponding with a level of priority of the requested virtual machine. The level of priority indicates whether or not the allocation of the requested virtual machine should be prioritized above other virtual machines (e.g., other requested virtual machines (e.g., requested in a service request) or virtual machines already allocated to servers of valid set of clusters 220). Depending on the implementation, the identified attribute may be the level of priority (e.g., a level of priority included in allocation request 218, a level of priority assigned to a particular type of virtual machine (e.g., by a user or customer that issued the request, by a customer that subscribes to services provided by a cloud provider associated with allocator 106, by a developer of allocator 106, and/or the like), and/or the like) or attributes that, when provided to trained model 216 (or search space classifier 212), enable trained model 216 (or search space classifier 212 using trained model 216) to determine the level of priority of the requested virtual machine.
Flowchart 800B continues with step 814. Depending on the implementation, step 814 may be a further embodiment of step 306 or step 308 of flowchart 300. In step 814, a Spot virtual machine eviction rate is determined based on the identified attribute. For example, attribute identifier 210 identifies an attribute corresponding to a Spot virtual machine eviction rate (a “Spot eviction rate” herein). In accordance with an embodiment, attribute identifier 210 identifies the Spot eviction rate based on an eviction rate of Spot virtual machines hosted by servers of valid set of clusters 220 over a period of time (e.g., in the last hour, in the last day, in the last week, since a predetermined time, in a particular time window). In accordance with an embodiment, attribute identifier 210 identifies the Spot eviction rate based on an eviction rate of Spot virtual machines hosted by servers of all clusters in an inventory (e.g., clusters 112A-112n).
Flowchart 800B continues with step 816. In accordance with an embodiment, step 816 is a further embodiment of step 308 of flowchart 300. In step 816, the truncation parameter is determined based on the determined level of priority and the determined Spot eviction rate. For example, search space classifier 212 of
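Step 816 could be sketched as a rule that widens the retained search space for high-priority virtual machines and when the recent Spot eviction rate is elevated; the thresholds and keep-fractions below are illustrative assumptions.

```python
def truncation_for_priority(high_priority: bool,
                            spot_eviction_rate: float,
                            total_clusters: int) -> int:
    """Widen the retained search space for high-priority virtual machines
    and when recent Spot evictions are elevated, so healthier and less
    contended servers stay in the candidate pool. Thresholds and
    keep-fractions are illustrative assumptions."""
    keep_fraction = 0.25                   # default: aggressive truncation
    if high_priority:
        keep_fraction = max(keep_fraction, 0.75)
    if spot_eviction_rate > 0.10:          # >10% evictions in the window
        keep_fraction = max(keep_fraction, 0.50)
    return max(1, round(keep_fraction * total_clusters))
```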
As noted above, search space classifier 212 of
Flowchart 800C begins with step 822. In step 822, a determination that a first performance indicator should be prioritized over a second performance indicator is made. For example, search space classifier 212 of
In step 824, a weight value is applied to the first performance indicator that increases a weight of the first performance indicator in determining the truncation parameter. For example, search space classifier 212 of
In step 826, the truncation parameter is determined based on the weighted identified attribute. For example, search space classifier 212 of
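Steps 822-826 can be sketched as a weighted combination in which prioritizing a performance indicator means raising its weight; the KPI names, the keep-fraction encoding, and the combining rule are assumptions made for illustration.

```python
def weighted_truncation(kpi_votes: dict, weights: dict, total_clusters: int) -> int:
    """Each KPI 'votes' for a keep-fraction of the search space; prioritizing
    a KPI (step 824) means raising its weight before combining (step 826)."""
    total_weight = sum(weights.values())
    keep = sum(weights[k] * v for k, v in kpi_votes.items()) / total_weight
    return max(1, round(keep * total_clusters))

# Allocation quality (wants a wide space) is prioritized over allocation
# time (wants a narrow one) via a larger weight.
kpi_votes = {"allocation_quality": 0.8, "allocation_time": 0.2}
weights = {"allocation_quality": 3.0, "allocation_time": 1.0}
retained = weighted_truncation(kpi_votes, weights, 100)
```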
While flowchart 800C of
Moreover, while flowchart 800C of
As noted herein, the embodiments described, along with any circuits, components and/or subcomponents thereof, as well as the flowcharts/flow diagrams described herein, including portions thereof, and/or other embodiments, may be implemented in hardware, or hardware with any combination of software and/or firmware, including being implemented as computer program code configured to be executed in one or more processors and stored in a computer readable storage medium, or being implemented as hardware logic/electrical circuitry, such as being implemented together in a system-on-chip (SoC), a field programmable gate array (FPGA), and/or an application specific integrated circuit (ASIC). A SoC may include an integrated circuit chip that includes one or more of a processor (e.g., a microcontroller, microprocessor, digital signal processor (DSP), etc.), memory, one or more communication interfaces, and/or further circuits and/or embedded firmware to perform its functions.
Embodiments disclosed herein may be implemented in one or more computing devices that may be mobile (a mobile device) and/or stationary (a stationary device) and may include any combination of the features of such mobile and stationary computing devices. Examples of computing devices in which embodiments may be implemented are described as follows with respect to
Computing device 902 can be any of a variety of types of computing devices. For example, computing device 902 may be a mobile computing device such as a handheld computer (e.g., a personal digital assistant (PDA)), a laptop computer, a tablet computer (such as an Apple iPad™), a hybrid device, a notebook computer (e.g., a Google Chromebook™ by Google LLC), a netbook, a mobile phone (e.g., a cell phone, a smart phone such as an Apple® iPhone® by Apple Inc., a phone implementing the Google® Android™ operating system, etc.), a wearable computing device (e.g., a head-mounted augmented reality and/or virtual reality device including smart glasses such as Google® Glass™, Oculus Rift® of Facebook Technologies, LLC, etc.), or other type of mobile computing device. Computing device 902 may alternatively be a stationary computing device such as a desktop computer, a personal computer (PC), a stationary server device, a minicomputer, a mainframe, a supercomputer, etc.
As shown in
A single processor 910 (e.g., central processing unit (CPU), microcontroller, a microprocessor, signal processor, ASIC (application specific integrated circuit), and/or other physical hardware processor circuit) or multiple processors 910 may be present in computing device 902 for performing such tasks as program execution, signal coding, data processing, input/output processing, power control, and/or other functions. Processor 910 may be a single-core or multi-core processor, and each processor core may be single-threaded or multithreaded (to provide multiple threads of execution concurrently). Processor 910 is configured to execute program code stored in a computer readable medium, such as program code of operating system 912 and application programs 914 stored in storage 920. Operating system 912 controls the allocation and usage of the components of computing device 902 and provides support for one or more application programs 914 (also referred to as “applications” or “apps”). Application programs 914 may include common computing applications (e.g., e-mail applications, calendars, contact managers, web browsers, messaging applications), further computing applications (e.g., word processing applications, mapping applications, media player applications, productivity suite applications), one or more machine learning (ML) models, as well as applications related to the embodiments disclosed elsewhere herein.
Any component in computing device 902 can communicate with any other component according to function, although not all connections are shown for ease of illustration. For instance, as shown in
Storage 920 is physical storage that includes one or both of memory 956 and storage device 990, which store operating system 912, application programs 914, and application data 916 according to any distribution. Non-removable memory 922 includes one or more of RAM (random access memory), ROM (read only memory), flash memory, a solid-state drive (SSD), a hard disk drive (e.g., a disk drive for reading from and writing to a hard disk), and/or other physical memory device type. Non-removable memory 922 may include main memory and may be separate from or fabricated in a same integrated circuit as processor 910. As shown in
One or more programs may be stored in storage 920. Such programs include operating system 912, one or more application programs 914, and other program modules and program data. Examples of such application programs may include, for example, computer program logic (e.g., computer program code/instructions) for implementing one or more of allocator 106, search space pruning system 108, tracking system 110, cluster selector 118, server selector 120, search space pruner 122, deployer 124, application 126, cluster validator 202, cluster filter 204, server validator 206, server filter 208, attribute identifier 210, search space classifier 212, truncator 214, trained model 216, attribute analyzer 402, and/or classification model trainer 404, along with any components and/or subcomponents thereof, as well as the flowcharts/flow diagrams (e.g., flowcharts 300, 500, 600, 700, 800A, 800B, and/or 800C) described herein, including portions thereof, and/or further examples described herein.
Storage 920 also stores data used and/or generated by operating system 912 and application programs 914 as application data 916. Examples of application data 916 include web pages, text, images, tables, sound files, video data, and other data, which may also be sent to and/or received from one or more network servers or other devices via one or more wired or wireless networks. Storage 920 can be used to store further data including a subscriber identifier, such as an International Mobile Subscriber Identity (IMSI), and an equipment identifier, such as an International Mobile Equipment Identifier (IMEI). Such identifiers can be transmitted to a network server to identify users and equipment.
A user may enter commands and information into computing device 902 through one or more input devices 930 and may receive information from computing device 902 through one or more output devices 950. Input device(s) 930 may include one or more of touch screen 932, microphone 934, camera 936, physical keyboard 938 and/or trackball 940 and output device(s) 950 may include one or more of speaker 952 and display 954. Each of input device(s) 930 and output device(s) 950 may be integral to computing device 902 (e.g., built into a housing of computing device 902) or external to computing device 902 (e.g., communicatively coupled wired or wirelessly to computing device 902 via wired interface(s) 980 and/or wireless modem(s) 960). Further input devices 930 (not shown) can include a Natural User Interface (NUI), a pointing device (computer mouse), a joystick, a video game controller, a scanner, a touch pad, a stylus pen, a voice recognition system to receive voice input, a gesture recognition system to receive gesture input, or the like. Other possible output devices (not shown) can include piezoelectric or other haptic output devices. Some devices can serve more than one input/output function. For instance, display 954 may display information, as well as operating as touch screen 932 by receiving user commands and/or other information (e.g., by touch, finger gestures, virtual keyboard, etc.) as a user interface. Any number of each type of input device(s) 930 and output device(s) 950 may be present, including multiple microphones 934, multiple cameras 936, multiple speakers 952, and/or multiple displays 954.
One or more wireless modems 960 can be coupled to antenna(s) (not shown) of computing device 902 and can support two-way communications between processor 910 and devices external to computing device 902 through network 904, as would be understood to persons skilled in the relevant art(s). Wireless modem 960 is shown generically and can include a cellular modem 966 for communicating with one or more cellular networks, such as a GSM network for data and voice communications within a single cellular network, between cellular networks, or between the mobile device and a public switched telephone network (PSTN). Wireless modem 960 may also or alternatively include other radio-based modem types, such as a Bluetooth modem 964 (also referred to as a “Bluetooth device”) and/or Wi-Fi modem 962 (also referred to as a “wireless adapter”). Wi-Fi modem 962 is configured to communicate with an access point or other remote Wi-Fi-capable device according to one or more of the wireless network protocols based on the IEEE (Institute of Electrical and Electronics Engineers) 802.11 family of standards, commonly used for local area networking of devices and Internet access. Bluetooth modem 964 is configured to communicate with another Bluetooth-capable device according to the Bluetooth short-range wireless technology standard(s) such as IEEE 802.15.1 and/or managed by the Bluetooth Special Interest Group (SIG).
Computing device 902 can further include power supply 982, LI receiver 984, accelerometer 986, and/or one or more wired interfaces 980. Example wired interfaces 980 include a USB port, IEEE 1394 (FireWire) port, an RS-232 port, an HDMI (High-Definition Multimedia Interface) port (e.g., for connection to an external display), a DisplayPort port (e.g., for connection to an external display), an audio port, an Ethernet port, and/or an Apple® Lightning® port, the purposes and functions of each of which are well known to persons skilled in the relevant art(s). Wired interface(s) 980 of computing device 902 provide for wired connections between computing device 902 and network 904, or between computing device 902 and one or more devices/peripherals when such devices/peripherals are external to computing device 902 (e.g., a pointing device, display 954, speaker 952, camera 936, physical keyboard 938, etc.). Power supply 982 is configured to supply power to each of the components of computing device 902 and may receive power from a battery internal to computing device 902, and/or from a power cord plugged into a power port of computing device 902 (e.g., a USB port, an A/C power port). LI receiver 984 may be used for location determination of computing device 902 and may include a satellite navigation receiver such as a Global Positioning System (GPS) receiver or may include other type of location determiner configured to determine location of computing device 902 based on received information (e.g., using cell tower triangulation, etc.). Accelerometer 986 may be present to determine an orientation of computing device 902.
Note that the illustrated components of computing device 902 are not required or all-inclusive, and fewer or greater numbers of components may be present as would be recognized by one skilled in the art. For example, computing device 902 may also include one or more of a gyroscope, barometer, proximity sensor, ambient light sensor, digital compass, etc. Processor 910 and memory 956 may be co-located in a same semiconductor device package, such as being included together in an integrated circuit chip, FPGA, or system-on-chip (SOC), optionally along with further components of computing device 902.
In embodiments, computing device 902 is configured to implement any of the above-described features of flowcharts herein. Computer program logic for performing any of the operations, steps, and/or functions described herein may be stored in storage 920 and executed by processor 910.
In some embodiments, server infrastructure 970 may be present in computing environment 900 and may be communicatively coupled with computing device 902 via network 904. Server infrastructure 970, when present, may be a network-accessible server set (e.g., a cloud computing platform). As shown in
Each of nodes 974 may, as a compute node, comprise one or more server computers, server systems, and/or computing devices. For instance, a node 974 may include one or more of the components of computing device 902 disclosed herein. Each of nodes 974 may be configured to execute one or more software applications (or “applications”) and/or services and/or manage hardware resources (e.g., processors, memory, etc.), which may be utilized by users (e.g., customers) of the network-accessible server set. For example, as shown in
In an embodiment, one or more of clusters 972 may be co-located (e.g., housed in one or more nearby buildings with associated components such as backup power supplies, redundant data communications, environmental controls, etc.) to form a datacenter, or may be arranged in other manners. Accordingly, in an embodiment, one or more of clusters 972 may be a datacenter in a distributed collection of datacenters. In embodiments, exemplary computing environment 900 comprises part of a cloud-based platform such as Amazon Web Services® of Amazon Web Services, Inc. or Google Cloud Platform™ of Google LLC, although these are only examples and are not intended to be limiting.
In an embodiment, computing device 902 may access application programs 976 for execution in any manner, such as by a client application and/or a browser at computing device 902. Example browsers include Microsoft Edge® by Microsoft Corp. of Redmond, Washington, Mozilla Firefox®, by Mozilla Corp. of Mountain View, California, Safari®, by Apple Inc. of Cupertino, California, and Google® Chrome by Google LLC of Mountain View, California.
For purposes of network (e.g., cloud) backup and data security, computing device 902 may additionally and/or alternatively synchronize copies of application programs 914 and/or application data 916 to be stored at network-based server infrastructure 970 as application programs 976 and/or application data 978. For instance, operating system 912 and/or application programs 914 may include a file hosting service client, such as Microsoft® OneDrive® by Microsoft Corporation, Amazon Simple Storage Service (Amazon S3)® by Amazon Web Services, Inc., Dropbox® by Dropbox, Inc., Google Drive™ by Google LLC, etc., configured to synchronize applications and/or data stored in storage 920 at network-based server infrastructure 970.
In some embodiments, on-premises servers 992 may be present in computing environment 900 and may be communicatively coupled with computing device 902 via network 904. On-premises servers 992, when present, are hosted within an organization's infrastructure and, in many cases, physically onsite of a facility of that organization. On-premises servers 992 are controlled, administered, and maintained by IT (Information Technology) personnel of the organization or an IT partner to the organization. Application data 998 may be shared by on-premises servers 992 between computing devices of the organization, including computing device 902 (when part of an organization) through a local network of the organization, and/or through further networks accessible to the organization (including the Internet). Furthermore, on-premises servers 992 may serve applications such as application programs 996 to the computing devices of the organization, including computing device 902. Accordingly, on-premises servers 992 may include storage 994 (which includes one or more physical storage devices such as storage disks and/or SSDs) for storage of application programs 996 and application data 998 and may include one or more processors for execution of application programs 996. Still further, computing device 902 may be configured to synchronize copies of application programs 914 and/or application data 916 for backup storage at on-premises servers 992 as application programs 996 and/or application data 998.
Embodiments described herein may be implemented in one or more of computing device 902, network-based server infrastructure 970, and on-premises servers 992. For example, in some embodiments, computing device 902 may be used to implement systems, clients, or devices, or components/subcomponents thereof, disclosed elsewhere herein. In other embodiments, a combination of computing device 902, network-based server infrastructure 970, and/or on-premises servers 992 may be used to implement the systems, clients, or devices, or components/subcomponents thereof, disclosed elsewhere herein.
As used herein, the terms “computer program medium,” “computer-readable medium,” and “computer-readable storage medium,” etc., are used to refer to physical hardware media. Examples of such physical hardware media include any hard disk, optical disk, SSD, other physical hardware media such as RAMs, ROMs, flash memory, digital video disks, zip disks, MEMS (microelectromechanical systems) memory, nanotechnology-based storage devices, and further types of physical/tangible hardware storage media of storage 920. Such computer-readable media and/or storage media are distinguished from and non-overlapping with communication media and propagating signals (do not include communication media and propagating signals). Communication media embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wireless media such as acoustic, RF, infrared and other wireless media, as well as wired media. Embodiments are also directed to such communication media that are separate and non-overlapping with embodiments directed to computer-readable storage media.
As noted above, computer programs and modules (including application programs 914) may be stored in storage 920. Such computer programs may also be received via wired interface(s) 980 and/or wireless modem(s) 960 over network 904. Such computer programs, when executed or loaded by an application, enable computing device 902 to implement features of embodiments discussed herein. Accordingly, such computer programs represent controllers of the computing device 902.
Embodiments are also directed to computer program products comprising computer code or instructions stored on any computer-readable medium or computer-readable storage medium. Such computer program products include the physical storage of storage 920 as well as further physical storage types.
A system in a cloud computing environment is described herein. The system comprises a plurality of clusters and an allocator. Each cluster of the plurality of clusters comprises respective servers. The allocator is configured to: receive an allocation request for allocating a virtual machine to the plurality of clusters; generate a valid set of clusters that includes clusters of the plurality of clusters that satisfy the allocation request; identify an attribute associated with the allocation request; utilize a trained search space classification model to determine a truncation parameter based at least on the identified attribute; filter the valid set of clusters based on the truncation parameter; select a server from the filtered valid set of clusters; and allocate the virtual machine to the selected server.
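The allocator pipeline recited above can be sketched end to end with toy types; the class and helper names, and the best-fit server selection, are illustrative assumptions rather than elements of this disclosure.

```python
from dataclasses import dataclass

@dataclass
class Server:
    capacity: int
    used: int = 0
    def can_host(self, vm_cores: int) -> bool:
        return self.capacity - self.used >= vm_cores
    def fit(self, vm_cores: int) -> int:
        # Best fit: prefer the server with the least leftover capacity.
        return -(self.capacity - self.used - vm_cores)

@dataclass
class Cluster:
    servers: list
    def valid_for(self, vm_cores: int) -> bool:
        return any(s.can_host(vm_cores) for s in self.servers)

def allocate(vm_cores: int, clusters: list, truncation_parameter: int) -> Server:
    """Generate the valid set, truncate it (early search space pruning),
    then select and fill a server, mirroring the recited steps."""
    valid = [c for c in clusters if c.valid_for(vm_cores)]
    truncated = valid[:truncation_parameter]
    candidates = [s for c in truncated for s in c.servers if s.can_host(vm_cores)]
    server = max(candidates, key=lambda s: s.fit(vm_cores))
    server.used += vm_cores
    return server
```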
In one implementation of the foregoing system, the identified attribute comprises at least one of: a region a computing device is located in, the computing device having transmitted the allocation request to the allocator; a type of the virtual machine to be allocated; a number of virtual machines to be allocated; a user account associated with the allocation request; a type of a cluster in the valid set of clusters; or an age of the cluster in the valid set of clusters.
In one implementation of the foregoing system, the system further comprises a search space pruner configured to: receive telemetry data; generate an analysis summary based on an analysis of the telemetry data; train the search space classification model to determine truncation parameters based on the analysis summary; and provide the trained search space classification model to the allocator.
In one implementation of the foregoing system, the telemetry data comprises at least one of: a previous allocation request received by the allocator; region information associated with the previous allocation request; a previous allocation made by the allocator; an attribute of a cluster in the plurality of clusters; or an attribute of a respective server of the plurality of clusters.
In one implementation of the foregoing system, the search space pruner is configured to: determine an expected allocation time using the trained search space classification model; determine the expected allocation time has a predetermined relationship with a threshold; and retrain the trained search space classification model.
In one implementation of the foregoing system, the system further comprises a deployer configured to: receive the trained search space classification model from the search space pruner; cause display of data representative of the trained search space classification model in a user interface of a computing device; receive, from the computing device, a rule for inferring truncation parameters based on the identified attribute; and transmit the rule to the allocator.
In one implementation of the foregoing system, the trained search space classification model is a rule-based model that sets a rule for inferring the truncation parameter based on the identified attribute.
In one implementation of the foregoing system, the trained search space classification model is a machine learning model that infers the truncation parameter in near-real time based on the identified attribute.
In one implementation of the foregoing system, the allocator is configured to utilize the trained search space classification model to determine a truncation parameter by: determining a server compression factor based on the identified attribute; and determining the truncation parameter based on the determined server compression factor.
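The compression-factor path can be sketched as below. The mapping from attribute to compression factor, and the rule that a higher compression factor yields a smaller search space, are assumptions made for this sketch only.

```python
# Hypothetical sketch: derive a truncation parameter from a server
# "compression factor" (how densely servers are packed for this attribute).
# The factor table and the linear mapping are invented for illustration.

def compression_factor(attribute):
    # Stand-in: a trained model would estimate this from the attribute.
    factors = {"batch": 0.9, "interactive": 0.4}
    return factors.get(attribute, 0.6)

def truncation_from_compression(attribute, max_clusters=100):
    """Higher compression -> more densely packed -> smaller search space."""
    factor = compression_factor(attribute)
    return max(1, round(max_clusters * (1.0 - factor)))

print(truncation_from_compression("batch"))        # 10
print(truncation_from_compression("interactive"))  # 60
```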
In one implementation of the foregoing system, the allocator is configured to utilize the trained search space classification model to determine a truncation parameter by: determining, based on the identified attribute, a level of priority of the virtual machine has a predetermined relationship with a threshold; determining, based on the identified attribute, a Spot virtual machine eviction rate; and determining the truncation parameter based on the determined level of priority and the determined Spot virtual machine eviction rate.
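One way to combine a priority check with a Spot eviction rate is sketched below. The threshold, the widening factors, and the combining rule are all invented for illustration; the disclosure does not specify them.

```python
# Hypothetical sketch: size the search space from a VM priority check and a
# Spot eviction rate. Thresholds and the combining rule are assumptions.

PRIORITY_THRESHOLD = 2  # priorities at or below this count as "high priority"

def truncation_from_priority(priority, spot_eviction_rate, base=100):
    """High-priority VMs and high eviction rates warrant a wider search."""
    high_priority = priority <= PRIORITY_THRESHOLD  # predetermined relationship
    widen = 2.0 if high_priority else 1.0
    widen += spot_eviction_rate  # e.g. 0.3 -> 30% more candidates
    return round(base * widen)

print(truncation_from_priority(priority=1, spot_eviction_rate=0.3))  # 230
print(truncation_from_priority(priority=5, spot_eviction_rate=0.0))  # 100
```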
In one implementation of the foregoing system, the allocator is configured to utilize the trained search space classification model to determine a truncation parameter by: determining a first performance indicator should be prioritized over a second performance indicator; applying a weight value to the first performance indicator that increases a weight of the first performance indicator in determining the truncation parameter; and determining the truncation parameter based on the weighted first performance indicator.
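Weighting one indicator over another can be sketched as a weighted sum. The indicator names (`allocation_latency`, `packing_density`), the weight values, and the score-to-size mapping are illustrative assumptions only.

```python
# Hypothetical sketch: prioritize one performance indicator over another by
# weighting it more heavily before deriving the truncation parameter.

def truncation_from_indicators(allocation_latency, packing_density,
                               prioritize_latency=True):
    """Weight the prioritized indicator more heavily, then map to a size."""
    w_latency, w_density = (0.8, 0.2) if prioritize_latency else (0.2, 0.8)
    score = w_latency * allocation_latency + w_density * packing_density
    # Higher combined score -> prune more aggressively (smaller search space).
    return max(10, round(100 - score))

print(truncation_from_indicators(allocation_latency=50, packing_density=20))  # 56
```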
A method for allocating a virtual machine to a plurality of clusters in a cloud computing environment is described herein. The method comprises: receiving an allocation request for allocating the virtual machine to the plurality of clusters; generating a valid set of clusters that includes clusters of the plurality of clusters that satisfy the allocation request; identifying an attribute associated with the allocation request; utilizing a trained search space classification model to determine a truncation parameter based at least on the identified attribute; filtering the valid set of clusters based on the truncation parameter; selecting a server from the filtered valid set of clusters; and allocating the virtual machine to the selected server.
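The method recited above can be sketched end to end on a toy inventory. The cluster schema, the "free cores" validity test, the ranking heuristic, and the fixed truncation parameter are all invented for this sketch; the disclosure covers many ways to realize each step.

```python
# A minimal sketch of the claimed method, assuming a toy inventory:
# generate the valid set, truncate it, then select a server and allocate.

def allocate(clusters, request, truncation_parameter):
    """Filter valid clusters, truncate the search space, pick a server."""
    # 1. Valid set: clusters that satisfy the request (enough free cores).
    valid = [c for c in clusters if c["free_cores"] >= request["cores"]]
    # 2. Filter the valid set based on the truncation parameter
    #    (keep only the best-fitting few candidates).
    valid.sort(key=lambda c: c["free_cores"], reverse=True)
    pruned = valid[:truncation_parameter]
    # 3. Select a server from the pruned set and allocate the VM to it.
    if not pruned:
        return None
    cluster = pruned[0]
    return cluster["name"], cluster["servers"][0]

inventory = [
    {"name": "c1", "free_cores": 8,  "servers": ["c1-s0", "c1-s1"]},
    {"name": "c2", "free_cores": 32, "servers": ["c2-s0"]},
    {"name": "c3", "free_cores": 2,  "servers": ["c3-s0"]},
]
print(allocate(inventory, {"cores": 4}, truncation_parameter=2))
# ('c2', 'c2-s0')
```

In practice the truncation parameter would come from the trained search space classification model rather than being passed in as a constant.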
In one implementation of the foregoing method, the identified attribute comprises at least one of: a region a computing device is located in, the computing device having transmitted the allocation request; a type of the virtual machine to be allocated; a number of virtual machines to be allocated; a user account associated with the allocation request; a type of a cluster in the valid set of clusters; or an age of the cluster in the valid set of clusters.
In one implementation of the foregoing method, the method further comprises: receiving telemetry data; generating an analysis summary based on an analysis of the telemetry data; and training the search space classification model to determine truncation parameters based on the analysis summary.
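The telemetry-to-model pipeline can be sketched as below, assuming telemetry records of past allocations. The summary statistic (mean clusters examined per VM type) and the fitting rule (mean plus a margin) are invented for illustration.

```python
# Hypothetical sketch: summarize telemetry from past allocations, then fit a
# per-attribute truncation table from the summary. Statistics and the margin
# are illustrative assumptions.
from collections import defaultdict

def analyze(telemetry):
    """Summarize telemetry: average clusters examined per VM type."""
    sums = defaultdict(lambda: [0, 0])
    for record in telemetry:
        s = sums[record["vm_type"]]
        s[0] += record["clusters_examined"]
        s[1] += 1
    return {attr: total / n for attr, (total, n) in sums.items()}

def train(summary, margin=1.2):
    """Set each truncation parameter slightly above the observed average."""
    return {attr: round(avg * margin) for attr, avg in summary.items()}

telemetry = [
    {"vm_type": "D2", "clusters_examined": 10},
    {"vm_type": "D2", "clusters_examined": 14},
    {"vm_type": "GPU", "clusters_examined": 4},
]
model = train(analyze(telemetry))
print(model)  # {'D2': 14, 'GPU': 5}
```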
In one implementation of the foregoing method, the method further comprises providing the trained search space classification model to an allocator.
In one implementation of the foregoing method, the telemetry data comprises at least one of: a previous allocation request received by the allocator, region information associated with the previous allocation request, a previous allocation made by the allocator, an attribute of a cluster in the plurality of clusters, or an attribute of a respective server of the plurality of clusters.
In one implementation of the foregoing method, the method further comprises: determining an expected allocation time using the trained search space classification model; determining the expected allocation time has a predetermined relationship with a threshold; and retraining the trained search space classification model.
In one implementation of the foregoing method, the method further comprises: receiving the trained search space classification model from a search space pruner; causing display of data representative of the trained search space classification model in a user interface of a computing device; receiving, from the computing device, a rule for inferring truncation parameters based on the identified attribute; and transmitting the rule to an allocator.
In one implementation of the foregoing method, the trained search space classification model is a rule-based model and said utilizing a trained search space classification model to determine the truncation parameter comprises: inferring the truncation parameter based on the identified attribute and a rule of the rule-based model.
In one implementation of the foregoing method, the trained search space classification model is a machine learning model and said utilizing a trained search space classification model to determine the truncation parameter comprises: providing the identified attribute to the trained search space classification model, and receiving, from the trained search space classification model in response to said providing the identified attribute, the truncation parameter.
In one implementation of the foregoing method, said utilizing a trained search space classification model to determine the truncation parameter comprises: determining a server compression factor based on the identified attribute; and determining the truncation parameter based on the determined server compression factor.
In one implementation of the foregoing method, said utilizing a trained search space classification model to determine the truncation parameter comprises: determining, based on the identified attribute, a level of priority of the virtual machine has a predetermined relationship with a threshold; determining, based on the identified attribute, a Spot virtual machine eviction rate; and determining the truncation parameter based on the determined level of priority and the determined Spot virtual machine eviction rate.
In one implementation of the foregoing method, said utilizing a trained search space classification model to determine the truncation parameter comprises: determining a first performance indicator should be prioritized over a second performance indicator; applying a weight value to the first performance indicator that increases a weight of the first performance indicator in determining the truncation parameter; and determining the truncation parameter based on the weighted first performance indicator.
An allocator is described herein. The allocator is coupled to a plurality of clusters in a cloud computing environment. The allocator comprises a processor circuit and a memory that stores program code. The program code is executable by the processor circuit to perform operations for allocating a virtual machine to the plurality of clusters. The operations comprise: receiving an allocation request for allocating the virtual machine to the plurality of clusters; generating a valid set of clusters that includes clusters of the plurality of clusters that satisfy the allocation request; identifying an attribute associated with the allocation request; utilizing a trained search space classification model to determine a truncation parameter based at least on the identified attribute; filtering the valid set of clusters based on the truncation parameter; selecting a server from the filtered valid set of clusters; and allocating the virtual machine to the selected server.
In one implementation of the foregoing allocator, the identified attribute comprises at least one of: a region a computing device is located in, the computing device having transmitted the allocation request to the allocator; a type of the virtual machine to be allocated; a number of virtual machines to be allocated; a user account associated with the allocation request; a type of a cluster in the valid set of clusters; or an age of the cluster in the valid set of clusters.
In one implementation of the foregoing allocator, the operations further comprise: receiving telemetry data; generating an analysis summary based on an analysis of the telemetry data; and training the search space classification model to determine truncation parameters based on the analysis summary.
In one implementation of the foregoing allocator, the allocator comprises a search space pruner that trains the search space classification model.
In one implementation of the foregoing allocator, the search space pruner is implemented in a software layer of the allocator.
In one implementation of the foregoing allocator, the telemetry data comprises at least one of: a previous allocation request received by the allocator, region information associated with the previous allocation request, a previous allocation made by the allocator, an attribute of a cluster in the plurality of clusters, or an attribute of a respective server of the plurality of clusters.
In one implementation of the foregoing allocator, the operations further comprise: determining an expected allocation time using the trained search space classification model; determining the expected allocation time has a predetermined relationship with a threshold; and retraining the trained search space classification model.
In one implementation of the foregoing allocator, the operations further comprise: causing display of data representative of the trained search space classification model in a user interface of a computing device; and receiving, from the computing device, a rule for inferring truncation parameters based on the identified attribute.
In one implementation of the foregoing allocator, the trained search space classification model is a rule-based model and said utilizing a trained search space classification model to determine a truncation parameter comprises: inferring the truncation parameter based on the identified attribute and a rule of the rule-based model.
In one implementation of the foregoing allocator, the trained search space classification model is a machine learning model and said utilizing a trained search space classification model to determine a truncation parameter comprises: providing the identified attribute to the trained search space classification model, and receiving, from the trained search space classification model in response to said providing the identified attribute, the truncation parameter.
In one implementation of the foregoing allocator, said utilizing a trained search space classification model to determine a truncation parameter comprises: determining a server compression factor based on the identified attribute; and determining the truncation parameter based on the determined server compression factor.
In one implementation of the foregoing allocator, said utilizing a trained search space classification model to determine a truncation parameter comprises: determining, based on the identified attribute, a level of priority of the virtual machine has a predetermined relationship with a threshold; determining, based on the identified attribute, a Spot virtual machine eviction rate; and determining the truncation parameter based on the determined level of priority and the determined Spot virtual machine eviction rate.
In one implementation of the foregoing allocator, said utilizing a trained search space classification model to determine a truncation parameter comprises: determining a first performance indicator should be prioritized over a second performance indicator; applying a weight value to the first performance indicator that increases a weight of the first performance indicator in determining the truncation parameter; and determining the truncation parameter based on the weighted first performance indicator.
References in the specification to “one embodiment,” “an embodiment,” “an example embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
In the discussion, unless otherwise stated, adjectives modifying a condition or relationship characteristic of a feature or features of an implementation of the disclosure should be understood to mean that the condition or characteristic is defined to within tolerances that are acceptable for operation of the implementation for an application for which it is intended. Furthermore, if the performance of an operation is described herein as being “in response to” one or more factors, it is to be understood that the one or more factors may be regarded as a sole contributing factor for causing the operation to occur or a contributing factor along with one or more additional factors for causing the operation to occur, and that the operation may occur at any time upon or after establishment of the one or more factors. Still further, where “based on” is used to indicate an effect being a result of an indicated cause, it is to be understood that the effect is not required to only result from the indicated cause, but that any number of possible additional causes may also contribute to the effect. Thus, as used herein, the term “based on” should be understood to be equivalent to the term “based at least on.”
Numerous example embodiments have been described above. Any section/subsection headings provided herein are not intended to be limiting. Embodiments are described throughout this document, and any type of embodiment may be included under any section/subsection. Furthermore, embodiments disclosed in any section/subsection may be combined with any other embodiments described in the same section/subsection and/or a different section/subsection in any manner.
Furthermore, example embodiments have been described above with respect to one or more running examples. Such running examples describe one or more particular implementations of the example embodiments; however, embodiments described herein are not limited to these particular implementations.
Further still, in some example embodiments, example truncation parameters have been described with a high value corresponding to a large search space and a low value corresponding to a small search space. However, it is also contemplated herein that some embodiments may use the reverse convention (i.e., a high value truncation parameter corresponds to a small search space and a low value truncation parameter corresponds to a large search space).
Further still, several example embodiments have been described with respect to cloud database applications. However, it is also contemplated herein that embodiments of search space pruning for virtual machine allocation may be used in other applications (e.g., IAAS applications, PAAS applications, SAAS applications, and/or the like).
Moreover, according to the described embodiments and techniques, any components of systems, computing devices, servers, allocators, search space pruning systems, search space classification models, tracking systems, and/or deployers and their functions may be caused to be activated for operation/performance thereof based on other operations, functions, actions, and/or the like, including initialization, completion, and/or performance of the operations, functions, actions, and/or the like.
In some example embodiments, one or more of the operations of the flowcharts described herein may not be performed. Moreover, operations in addition to or in lieu of the operations of the flowcharts described herein may be performed. Further, in some example embodiments, one or more of the operations of the flowcharts described herein may be performed out of order, in an alternate sequence, or partially (or completely) concurrently with each other or with other operations.
The embodiments described herein and/or any further systems, sub-systems, devices and/or components disclosed herein may be implemented in hardware (e.g., hardware logic/electrical circuitry), or any combination of hardware with software (computer program code configured to be executed in one or more processors or processing devices) and/or firmware.
While various embodiments have been described above, it should be understood that they have been presented by way of example only, and not limitation. It will be apparent to persons skilled in the relevant art that various changes in form and detail can be made therein without departing from the spirit and scope of the embodiments. Thus, the breadth and scope of the embodiments should not be limited by any of the above-described example embodiments, but should be defined only in accordance with the following claims and their equivalents.