Benefit is claimed under 35 U.S.C. 119(a)-(d) to Foreign Application Serial No. 201941002031 filed in India entitled “CONSOLIDATION OF IDENTICAL VIRTUAL MACHINES ON HOST COMPUTING SYSTEMS TO ENABLE PAGE SHARING”, on Jan. 17, 2019, by VMware, Inc., which is herein incorporated in its entirety by reference for all purposes.
The present disclosure relates to computing environments, and more particularly to methods, techniques, and systems for consolidating identical virtual machines on host computing systems to enable page sharing.
Computer virtualization may be a technique that involves encapsulating a representation of a physical computing machine platform into a virtual machine (VM) that may be executed under the control of virtualization software running on hardware computing platforms. The hardware computing platforms may also be referred to as host computing systems or servers. In such a computing environment, multiple host computing systems may execute different types of virtual machines. An example host computing system may be a physical computer system. A virtual machine can be a software-based abstraction of the physical computer system. Each virtual machine may be configured to execute an operating system (OS), referred to as a guest OS, and applications. Further, two or more virtual machines running on a host computing system may share memory associated with the host computing system to execute applications.
The drawings described herein are for illustration purposes only and are not intended to limit the scope of the present subject matter in any way.
Examples described herein may provide an enhanced computer-based and network-based method, technique, and system for consolidating identical virtual machines on host computing systems to enable page sharing in a data center. The data center may be a virtual data center (e.g., a cloud computing environment, a virtualized environment, and the like). The virtual data center may be a pool or collection of cloud infrastructure resources designed for enterprise needs. The resources may be a processor (e.g., central processing unit (CPU)), memory (e.g., random-access memory (RAM)), storage (e.g., disk space), and networking (e.g., bandwidth). Further, the virtual data center may be a virtual representation of a physical data center, complete with servers, storage clusters, and networking components, all of which may reside in virtual space being hosted by one or more physical data centers.
Further, the data center may include multiple host computing systems executing corresponding virtual machines. An example host computing system may be a physical computer. The virtual machines, in some examples, may operate with their own guest operating systems on a host computing system using resources of the host computing system virtualized by virtualization software (e.g., a hypervisor, virtual machine monitor, and the like). In one example, the host computing systems may include hardware memory (e.g., also referred to as physical memory or host physical memory). Further, the virtual machines running on the corresponding host computing systems may share the hardware memory for their operations.
In some examples, the virtualization software may enable sharing memory pages of the hardware memory across virtual machines. For example, multiple virtual machines, running instances of the same guest operating system, may have the same applications or components loaded, and/or contain common data. In such cases, a memory page sharing technique may be used to securely eliminate redundant copies of memory pages in the hardware memory. However, the virtual machines may be deployed on different host computing systems by different personas from different organizations, which may lead to suboptimal distribution of virtual machines (e.g., having different types of applications or operating systems) across multiple host computing systems in the data center. In this case, the memory page sharing mechanism may become ineffective.
Examples described herein may identify virtual machines with similar or identical configurations and provide a recommendation to distribute the identical virtual machines across multiple host computing systems in a data center. Examples described herein may maximize memory page sharing and create an opportunity to deploy additional virtual machines on the same infrastructure without any impact on applications' performance. For example, consider that there are 30 virtual machines having 6 different types of configurations in a data center. Also, consider that each type of configuration has 5 identical virtual machines. Examples described herein may determine the identical virtual machines and place the identical virtual machines on corresponding ones of host computing systems in the data center. In this example, each group of 5 identical virtual machines may be placed on a corresponding host computing system to enhance memory page sharing.
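As an illustrative, non-limiting sketch of the 30-virtual-machine example above, the following Python fragment groups virtual machines by configuration and assigns each group to its own host. The inventory, the configuration tuples, and the host names are hypothetical placeholders, and exact matching of configuration tuples stands in for the cluster analysis described later.

```python
from collections import defaultdict


def group_by_configuration(vms):
    """Group VM names by their configuration tuple (exact match only)."""
    groups = defaultdict(list)
    for name, config in vms:
        groups[config].append(name)
    return dict(groups)


# Hypothetical inventory: 30 VMs, 6 configuration types, 5 VMs per type.
config_types = [("os-%d" % t, "app-%d" % t) for t in range(6)]
inventory = [("vm-%02d" % i, config_types[i % 6]) for i in range(30)]

groups = group_by_configuration(inventory)

# Assign each group of identical VMs to its own (hypothetical) host.
placement = {"host-%d" % h: names for h, names in enumerate(groups.values())}

for host, names in placement.items():
    print(host, names)
```

In this sketch each of the 6 hosts receives one group of 5 identical virtual machines, which is the arrangement the example describes as favorable for memory page sharing.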
System Overview and Examples of Operation
For example, clusters of host computing systems 102A-102N may be used to support clients for executing various applications. Each cluster can include any number of host computing systems ranging from one to several hundred or more. Each client may be associated with a resource reservation to support application operations. An example client may be a customer, a business group, a tenant, an enterprise, and the like. In cloud computing environments, a number of virtual machines can be created for each client and resources (e.g., CPU, memory, storage, and the like) may be allocated for each virtual machine to support application operations.
As shown in
Further, management node 104 may include a virtual machine classification unit 106 to retrieve configuration data and resource utilization data associated with virtual machines VM 1 to VM N in data center 100. In one example, the configuration data may include at least one parameter such as virtual machine inventory information (e.g., virtual machine identifier), a guest operating system type and version, memory information, central processing unit (CPU) information, disk drive information, network adapter information, a type and version of an application, host computing system information, host cluster information, configuration settings associated with each of virtual machines VM 1 to VM N, and/or the like. The resource utilization data may include performance metric parameters such as processor utilization data, memory utilization data, network utilization data, storage utilization data, and/or the like.
Furthermore, virtual machine classification unit 106 may perform a cluster analysis on the configuration data and the resource utilization data to generate clusters. In one example, each cluster may include identical virtual machines from virtual machines VM 1 to VM N. An example cluster analysis may include a Gaussian-means (G-means) cluster, a support vector cluster, or the like. In one example, virtual machine classification unit 106 may encode categorical variables associated with the configuration data and the resource utilization data. Further, virtual machine classification unit 106 may perform the cluster analysis on the encoded categorical variables to generate the clusters. For example, each cluster may include identical virtual machines with similar configurations based on at least one parameter selected from the configuration data and the resource utilization data.
As shown in
In one example, virtual machine migration unit 108 may generate a virtual machine migration plan for the identical virtual machines in the clusters based on resources availability associated with host computing systems 102A-102N. Example resources availability may include a processing resource availability, a memory resource availability, a network resource availability, a storage resource availability, or any combination thereof. Further, virtual machine migration unit 108 may recommend (e.g., to a data center administrator) the virtual machine migration plan to consolidate the identical virtual machines in each cluster to execute in a corresponding one of host computing systems 102A-102N. Furthermore, based on an instruction from the data center administrator, virtual machine migration unit 108 may migrate the identical virtual machines to consolidate the identical virtual machines in each cluster to execute in the corresponding one of host computing systems 102A-102N in accordance with the recommended virtual machine migration plan.
In another example, virtual machine migration unit 108 may sequentially place the clusters of identical virtual machines on host computing systems 102A-102N during hardware upgrade in data center 100. For example, virtual machine migration unit 108 may place the identical virtual machines in a first cluster of the clusters on a first host computing system (e.g., 102A) during the hardware upgrade in data center 100 such that the physical memory pages are shared by the placed identical virtual machines within first host computing system 102A. Further, virtual machine migration unit 108 may repeat the step of placing the identical virtual machines in a next cluster until the identical virtual machines in all the clusters are placed on corresponding ones of host computing systems (e.g., 102B-102N) in data center 100.
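The sequential placement described above may be sketched, purely for illustration, as follows. The function name, the host names, and the one-cluster-per-host assumption are hypothetical simplifications; host capacity is not checked in this sketch.

```python
def place_clusters_sequentially(clusters, hosts):
    """Place the i-th cluster of identical VMs on the i-th empty host.

    Assumes (as a simplification) that each host can hold one whole cluster.
    Returns a mapping {host: [vm, ...]}.
    """
    if len(clusters) > len(hosts):
        raise ValueError("not enough hosts for one cluster per host")
    placement = {}
    for host, vms in zip(hosts, clusters):
        # All VMs of one cluster land on the same host so that
        # identical memory pages can be shared within that host.
        placement[host] = list(vms)
    return placement


# Hypothetical clusters produced by the cluster analysis.
clusters = [["VM 1", "VM 4", "VM 7"],
            ["VM 2", "VM 5", "VM 8"],
            ["VM 3", "VM 6", "VM 9"]]
hosts = ["host-1", "host-2", "host-3"]

placement = place_clusters_sequentially(clusters, hosts)
print(placement)
```

A production placement would additionally account for the resources available on each host before assigning a cluster to it.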
In some examples, the functionalities described herein, in relation to instructions to implement functions of virtual machine classification unit 106, virtual machine migration unit 108, and any additional instructions described herein in relation to the storage medium, may be implemented as engines or modules comprising any combination of hardware and programming to implement the functionalities of the modules or engines described herein. The functions of virtual machine classification unit 106 and virtual machine migration unit 108 may also be implemented by a respective processor. In examples described herein, the processor may include, for example, one processor or multiple processors included in a single device or distributed across multiple devices. In some examples, virtual machine classification unit 106 and virtual machine migration unit 108 can be a part of management software (e.g., vSphere virtual center that is offered by VMware®) residing in management node 104.
As shown in
In the example shown in
Further, consider that virtual machines VM 1 to VM 9 may be deployed with different operating systems such as SuSE® Linux™, Windows®, Ubuntu® Linux™, and the like. Furthermore, consider that VM 1, VM 4, and VM 7 are deployed with the SuSE® Linux™ operating system, VM 2, VM 5, and VM 8 are deployed with the Windows® operating system, and VM 3, VM 6, and VM 9 are deployed with the Ubuntu® Linux™ operating system. In this example, heterogeneous types of operating systems and applications are running on virtual machines VM 1 to VM 3 of the same host computing system 202A. Thus, there may not be any common files between virtual machines VM 1 to VM 3 and hence, none of the physical memory pages PPN 1, PPN 2, and PPN 3 of hardware memory 204A may be shared between virtual machines VM 1 to VM 3.
As shown in
In one example, virtual machine migration unit 108 may consolidate the identical virtual machines in each cluster by generating the virtual machine migration plan based on resources availability associated with host computing systems 202A-202C. For example, in the virtual machine migration plan, the identical virtual machines VM 1, VM 4, and VM 7 in the first cluster may be consolidated to execute on a same host computing system such that physical memory pages are shared by the consolidated identical virtual machines VM 1, VM 4, and VM 7. The migration of the identical virtual machines in accordance with the virtual machine migration plan is described in
In one example, physical memory pages associated with hardware memory 204A-204C may be shared by the identical virtual machines as shown in
At 306, a cluster analysis may be performed on the configuration data and the resource utilization data of Table 1 to generate clusters. Each cluster may include identical virtual machines from virtual machines VM 1 to VM N. In one example, categorical variables associated with parameters of the configuration data and the resource utilization data may be encoded and the cluster analysis may be performed on the encoded categorical variables to generate the clusters. For example, the categorical variables such as software name, applications, and the like may be encoded as discrete integer values. In the example, each of the 4 types of software names (e.g., Ubuntu, Windows, CoreOS, and SuSE) may be assigned one integer value. Further, different parameters may be selectively considered. For example, parameters such as host and host cluster may be used when cluster affinity is required.
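The integer encoding of categorical variables described above may be illustrated with the following minimal Python sketch. The function name and the sample values are hypothetical, and the particular integers assigned (order of first appearance) are an assumption rather than a mapping taken from the description.

```python
def encode_categorical(values):
    """Map each distinct category to a discrete integer.

    Integers are assigned in order of first appearance, so the mapping
    is deterministic for a given input order.
    """
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes)
        encoded.append(codes[value])
    return encoded, codes


# Hypothetical "software name" column for six virtual machines.
software = ["Ubuntu", "Windows", "CoreOS", "SuSE", "Windows", "Ubuntu"]
encoded, mapping = encode_categorical(software)

print(encoded)   # [0, 1, 2, 3, 1, 0]
print(mapping)   # {'Ubuntu': 0, 'Windows': 1, 'CoreOS': 2, 'SuSE': 3}
```

Note that an integer encoding implies an ordering between categories that may not be meaningful; this sketch follows the discrete-integer approach stated above without addressing that effect.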
In one example, the cluster analysis may include one of a Gaussian-means (G-means) cluster, a support vector cluster, and the like. The G-means clustering algorithm may be an extension of the k-means clustering algorithm that determines an appropriate number of clusters (e.g., k). The G-means algorithm may begin with a small number of k-means centers and increase the number of centers. In each iteration, the algorithm may split into two those centers whose data appear not to come from a Gaussian distribution. Further, between each round of splitting, k-means clustering may be applied on the entire dataset. In one example, the value of k may be initialized as 1, or another value may be specified for k if an administrator is aware of the range of k.
Further, G-means may repeatedly make decisions based on a statistical test for the configuration data and the resource utilization data assigned to each center. If the data currently assigned to a k-means center appear to be Gaussian, then the data may be represented with only one center. However, if the same data do not appear to be Gaussian, then multiple centers may be used to model the data. In other words, the G-means algorithm may run k-means multiple times. For example, consider a data set ‘X’ with ‘d’ dimensions belonging to center ‘c’. Also, assume that the confidence level for deciding the cluster is ‘α’. The steps for G-means (X, α) may be depicted in Table 2.
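The Gaussianity decision made for the data assigned to one center may be sketched as follows. This is an illustrative sketch only and is not a reproduction of Table 2. It assumes, as G-means is commonly described, that the data of a center have already been projected onto one dimension and that an Anderson-Darling statistic is compared against a critical value chosen for the confidence level α. The function names, the small-sample correction factor, and the critical value used below are assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling(values):
    """Anderson-Darling normality statistic for 1-D data.

    Mean and standard deviation are estimated from the data, and a
    commonly cited small-sample correction is applied (an assumption here).
    """
    n = len(values)
    mu, sigma = mean(values), stdev(values)
    std_normal = NormalDist()
    # Standardize, sort, and map through the standard normal CDF.
    cdf = [std_normal.cdf((v - mu) / sigma) for v in sorted(values)]
    # Clamp away from 0 and 1 so the logarithms stay finite.
    eps = 1e-12
    cdf = [min(max(p, eps), 1.0 - eps) for p in cdf]
    total = sum((2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1.0 - cdf[n - i]))
                for i in range(1, n + 1))
    a2 = -n - total / n
    return a2 * (1.0 + 4.0 / n - 25.0 / (n * n))


def should_split(projected, critical_value):
    """Return True when the data look non-Gaussian, so the center is split."""
    return anderson_darling(projected) > critical_value


# Assumed critical value; substitute the value matching the chosen alpha.
CRITICAL = 1.8692

# Hypothetical projected data: one bell-shaped group and two separated groups.
gaussian_like = [NormalDist().inv_cdf((i + 0.5) / 40) for i in range(40)]
two_groups = ([-1.0 + 0.001 * i for i in range(20)]
              + [1.0 + 0.001 * i for i in range(20)])

print(should_split(gaussian_like, CRITICAL))  # False: keep one center
print(should_split(two_groups, CRITICAL))     # True: split into two centers
```

In a full G-means loop, a center for which `should_split` returns True would be replaced by two child centers, and k-means would then be rerun on the entire dataset before the next round of tests.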
At 308, the identical virtual machines in each of the clusters may be consolidated. In one example, there can be one of two possibilities to consolidate the identical virtual machines such as when all the host computing systems in the data center are empty (e.g., during hardware upgrade) and when the host computing systems are already executing corresponding virtual machines.
In one example, the host computing systems in the data center may be empty at the time of data center modernization (e.g., hardware upgrade), where existing host computing systems are being replaced by new ones. In this example, the clusters of identical virtual machines may be sequentially placed on the host computing systems. For example, the identical virtual machines in each of k clusters may be consolidated as follows:
In another example, when the host computing systems in the data center are already executing corresponding virtual machines, a virtual machine migration plan may be generated for the identical virtual machines in the clusters based on resources availability associated with the host computing systems. For example, the administrator may instruct execution of the virtual machine migration plan to migrate the identical virtual machines in each cluster. Further, the administrator may place the maximum possible number of identical virtual machines in a cluster on a same host computing system. Thus, examples described herein may generate the virtual machine migration plan and execute the virtual machine migration plan to maximize memory page sharing.
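One possible way to derive such a migration plan is sketched below as a greedy heuristic. The heuristic, the function name, the memory figures, and the cluster labels are all hypothetical; the description does not prescribe a particular planning algorithm. The sketch considers only memory as the resource and assumes every virtual machine belongs to a cluster.

```python
from collections import Counter


def plan_migrations(current_host, cluster_of, vm_mem_gb, host_capacity_gb):
    """Return a list of (vm, source_host, target_host) moves.

    Greedy heuristic (an assumption): each cluster targets an unclaimed
    host that has enough memory for the whole cluster and already runs
    the most members of that cluster, which keeps the move count low.
    """
    clusters = {}
    for vm, cluster in cluster_of.items():
        clusters.setdefault(cluster, []).append(vm)

    moves, claimed = [], set()
    # Larger clusters choose first; ties are broken by cluster label.
    for cluster, vms in sorted(clusters.items(),
                               key=lambda kv: (-len(kv[1]), kv[0])):
        needed = sum(vm_mem_gb[vm] for vm in vms)
        counts = Counter(current_host[vm] for vm in vms)
        candidates = [h for h in host_capacity_gb
                      if h not in claimed and host_capacity_gb[h] >= needed]
        if not candidates:
            continue  # no host fits the whole cluster; leave it in place
        target = max(candidates, key=lambda h: counts[h])
        claimed.add(target)
        moves.extend((vm, current_host[vm], target)
                     for vm in vms if current_host[vm] != target)
    return moves


# Hypothetical starting state: three hosts, each running a mix of VM types.
current_host = {"VM 1": "202A", "VM 2": "202A", "VM 3": "202A",
                "VM 4": "202B", "VM 5": "202B", "VM 6": "202B",
                "VM 7": "202C", "VM 8": "202C", "VM 9": "202C"}
cluster_of = {"VM 1": "suse", "VM 4": "suse", "VM 7": "suse",
              "VM 2": "windows", "VM 5": "windows", "VM 8": "windows",
              "VM 3": "ubuntu", "VM 6": "ubuntu", "VM 9": "ubuntu"}
vm_mem_gb = {vm: 4 for vm in current_host}
host_capacity_gb = {"202A": 16, "202B": 16, "202C": 16}

plan = plan_migrations(current_host, cluster_of, vm_mem_gb, host_capacity_gb)
for move in plan:
    print(move)
```

The resulting list could be presented to an administrator as a recommendation before any virtual machine is moved. The sketch does not model the temporary headroom a host may need while migrations are in flight.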
In graphical representation 400, each ellipse (e.g., 402, 404, 406, 408, 410, and 412) represents a cluster including one type of virtual machine, where the identical virtual machines are represented by the same symbol. In the example, there are 6 ellipses (e.g., 402, 404, 406, 408, 410, and 412) and hence there can be 6 different types of virtual machines in the lab infrastructure as depicted in Table 3.
Further, in graphical representation 400, it can be observed that some of the data points overlap between two clusters, indicating that the corresponding virtual machines have properties in common with both clusters. In this example, such virtual machines can be placed on any host computing system that executes either of the two clusters. Thus, page sharing may be optimized and a significant reduction in memory consumption may be achieved.
Examples described herein may use machine learning to optimize memory page sharing of virtual machines, which may help private cloud administrators to optimize resources and reduce the cost of running the virtual machines. Examples described herein may be implemented in software solutions related to automatic delivery of new applications and updates like VMware® vRealize Operation as an optimization recommendation flow, where examples described herein may recommend an optimal plan for virtual machine movement across host computing systems in order to maximize memory page sharing. Also, examples described herein may be implemented in software solutions to automate optimization of complete infrastructure, e.g., VMware® vRealize Automation (vRA), where vMotion may be used to migrate virtual machines across the host computing systems without impacting any application.
At 502, configuration data and resource utilization data associated with a plurality of virtual machines in a data center may be retrieved. For example, the configuration data comprises at least one parameter selected from a group consisting of virtual machine inventory information, a guest operating system type and version, memory information, central processing unit (CPU) information, disk drive information, network adapter information, a type and version of an application, host computing system information, host cluster information, and/or configuration settings associated with each of the plurality of virtual machines. The resource utilization data comprises performance metric parameters selected from a group consisting of processor utilization data, memory utilization data, network utilization data, and/or storage utilization data.
At 504, a cluster analysis may be performed on the configuration data and the resource utilization data to generate a plurality of clusters, each cluster comprising identical virtual machines from the plurality of virtual machines. In one example, performing the cluster analysis on the configuration data and the resource utilization data may include encoding categorical variables associated with the configuration data and the resource utilization data and performing the cluster analysis on the encoded categorical variables to generate the plurality of clusters. For example, each cluster may include identical virtual machines with similar configurations based on at least one parameter selected from the configuration data and the resource utilization data. Further, the cluster analysis may include one of a G-means cluster, a support vector cluster, and the like.
At 506, for each cluster, the identical virtual machines in a cluster may be consolidated to execute in a host computing system such that physical memory pages are shared by the consolidated identical virtual machines in the cluster. In one example, the physical memory pages including identical content may be shared by the identical virtual machines using a page sharing mechanism.
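The effect of consolidation on page sharing may be illustrated with the following toy model, which simply counts how many physical pages a host needs when pages with identical content are stored once. The function name and the page contents are hypothetical, and the model is not a description of how any particular hypervisor implements page sharing.

```python
def host_pages_needed(vm_pages):
    """Count physical pages on one host without and with page sharing.

    vm_pages maps a VM name to a list of page contents (bytes).
    With sharing, pages with identical content are stored once.
    """
    without_sharing = sum(len(pages) for pages in vm_pages.values())
    with_sharing = len({page for pages in vm_pages.values() for page in pages})
    return without_sharing, with_sharing


# Hypothetical host running three different guest operating systems.
mixed_host = {
    "VM 1": [b"suse-kernel", b"vm1-data"],
    "VM 2": [b"windows-kernel", b"vm2-data"],
    "VM 3": [b"ubuntu-kernel", b"vm3-data"],
}

# Hypothetical host after consolidating three identical VMs.
consolidated_host = {
    "VM 1": [b"suse-kernel", b"vm1-data"],
    "VM 4": [b"suse-kernel", b"vm4-data"],
    "VM 7": [b"suse-kernel", b"vm7-data"],
}

print(host_pages_needed(mixed_host))         # (6, 6): nothing to share
print(host_pages_needed(consolidated_host))  # (6, 4): kernel page stored once
```

In this toy model the mixed host gains nothing from sharing, while the consolidated host stores the common page once, which mirrors the motivation for placing identical virtual machines together.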
In one example, consolidating the identical virtual machines in each cluster may include generating a virtual machine migration plan for the identical virtual machines in the plurality of clusters based on resources availability associated with a plurality of host computing systems in the data center. For example, the resources availability may include a processing resource availability, a memory resource availability, a network resource availability, a storage resource availability, or any combination thereof. Further, the virtual machine migration plan may be recommended to consolidate the identical virtual machines in each cluster to execute in a corresponding one of the host computing systems. Furthermore, the identical virtual machines may be migrated to consolidate the identical virtual machines in each cluster to execute in the corresponding one of the host computing systems in accordance with the recommended virtual machine migration plan.
In another example, consolidating the identical virtual machines in each cluster may include sequentially placing the clusters of identical virtual machines on a plurality of host computing systems during hardware upgrade in the data center. For example, sequentially placing the clusters of identical virtual machines on the plurality of host computing systems during the hardware upgrade may include placing the identical virtual machines in a first cluster of the plurality of clusters on a first host computing system during the hardware upgrade in the data center such that the physical memory pages are shared by the placed identical virtual machines within the first host computing system. Further, the step of placing the identical virtual machines in a next cluster may be repeated until the identical virtual machines in all the clusters are placed on corresponding host computing systems in the data center.
Machine-readable storage medium 604 may store instructions 606-610. In an example, instructions 606-610 may be executed by processor 602 to consolidate identical virtual machines on host computing systems to enable page sharing. Instructions 606 may be executed by processor 602 to retrieve configuration data and resource utilization data associated with a plurality of virtual machines in a data center. Instructions 608 may be executed by processor 602 to perform a cluster analysis on the configuration data and the resource utilization data to generate a plurality of clusters, each cluster comprising identical virtual machines from the plurality of virtual machines. Further, instructions 610 may be executed by processor 602 to consolidate, for each cluster, the identical virtual machines in a cluster to execute in a host computing system such that physical memory pages are shared by the consolidated identical virtual machines in the cluster.
Some or all of the system components and/or data structures may also be stored as contents (e.g., as executable or other machine-readable software instructions or structured data) on a non-transitory computer-readable medium (e.g., as a hard disk; a computer memory; a computer network or cellular wireless network or other data transmission medium; or a portable media article to be read by an appropriate drive or via an appropriate connection, such as a DVD or flash memory device) so as to enable or configure the computer-readable medium and/or one or more host computing systems or devices to execute or otherwise use or provide the contents to perform at least some of the described techniques.
It may be noted that the above-described examples of the present solution are for the purpose of illustration only. Although the solution has been described in conjunction with a specific embodiment thereof, numerous modifications may be possible without materially departing from the teachings and advantages of the subject matter described herein. Other substitutions, modifications and changes may be made without departing from the spirit of the present solution. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive.
The terms “include,” “have,” and variations thereof, as used herein, have the same meaning as the term “comprise” or appropriate variation thereof. Furthermore, the term “based on”, as used herein, means “based at least in part on.” Thus, a feature that is described as based on some stimulus can be based on the stimulus or a combination of stimuli including the stimulus.
The present description has been shown and described with reference to the foregoing examples. It is understood, however, that other forms, details, and examples can be made without departing from the spirit and scope of the present subject matter that is defined in the following claims.
Number | Date | Country | Kind |
---|---|---|---|
201941002031 | Jan 2019 | IN | national |