The present disclosure is generally directed to cloud-computing, and more specifically to routing virtual machines to data centers, as well as placement of virtual machines into physical machines in a data center, within a cloud-based network.
One of the challenges for network service providers is effective resource management. The resources of any computing environment (e.g., processing power, memory, data storage) are finite and constrained. While certain resources may be generally available, other resources may only be available to certain components of a network. For example, in a cloud-computing environment, processors and memory are typically confined to individual physical machines and can be shared only locally, while data storage is often provided as a pooled service where multiple physical machines can access and share the storage capacity. As such, an on-going challenge for network service providers is to determine how to efficiently allocate and utilize resources in an inherently dynamic, complex, and heterogeneous cloud-computing environment.
A virtual machine is an instance of an operating system along with one or more applications running in an isolated partition within a computer. For the purposes of the description herein, a virtual machine can be viewed as a processing job requiring certain amounts of computing resources of different types. Virtual machines may be employed in a cloud-computing environment to enable resource sharing and reconfigurations of cloud-computing systems and networks. Virtual machines can share processor and memory resources by residing on a common physical machine, and can be resized (e.g., change the amounts of resources that they require) and migrated (e.g., to other physical machines) based on load-balancing and/or other requirements. As such, the flexibility of virtual machines can allow communication service providers to offer customers processing and storage services in a pay-as-you-go manner while allocating resources more efficiently. Moreover, methods to optimize the deployment of virtual machines in a cloud-computing environment may contribute further to meeting network service provider load-balancing and/or other requirements.
Methods and apparatuses for real-time adaptive placement of virtual machines within a cloud-based network are provided. In accordance with an embodiment, a method for real-time adaptive placement of a virtual machine comprises receiving a virtual machine request at a routing component. The virtual machine request is routed to a target data center determined from a plurality of data centers based on a data center index calculation, wherein the data center index calculation is based on a current state of virtual queues associated with the plurality of data centers. In response to routing the virtual machine to the target data center, virtual queues and configuration usage fractions associated with the plurality of data centers may be updated.
In accordance with an embodiment, each of the plurality of data centers may include one or more physical machines to host one or more virtual machines. A designated configuration for physical machines may be determined based on information from the virtual machine and a configuration index, and the virtual machine may be routed within the target data center to a physical machine associated with the designated configuration.
In accordance with an embodiment, a maximum average fraction of physical machines in use may be minimized for the plurality of data centers.
In accordance with an embodiment, a maximum average utilization may be minimized for the plurality of data centers, wherein utilization of a data center is the maximum of an average fraction of physical machines in use and an average utilization of one or more resource pools shared across the data center.
These and other advantages of the invention will be apparent to those of ordinary skill in the art by reference to the following detailed description and the accompanying drawings.
Real-time adaptive placement of virtual machines can minimize maximum resource utilization across a plurality of data centers, such as in a cloud-based network. Real-time adaptive placement (also referred to herein as virtual machine placement) is a combined virtual machine to data center routing and virtual machine to physical machine assignment technique, and can dynamically account for constraints on the allocation of virtual machines to host physical machines within a data center.
Real-time adaptive placement of virtual machines allows for virtual machine routing decisions to dynamically adjust based on changes in virtual machine demand rates, changes in system parameters and other factors. In one embodiment, virtual machine placement includes routing a virtual machine to one of a plurality of data centers. A data center can include one or more physical machines that potentially can host multiple virtual machines simultaneously up to a limit determined by physical machine resources. For example, the total resource requirements of all virtual machines assigned to a physical machine cannot exceed the resource amounts at the physical machine.
Data center 106 includes data center routing component 114 to receive virtual machines and to route virtual machines to physical machines 116, 118 and 120. Physical machines 116, 118 and 120 may then host (i.e., provide resources to) virtual machines, such as virtual machine 102.
Particular data centers may include particular types of physical machines, e.g., physical machines that include one or more particular resources. For example, a physical machine resource may be associated with one of processing, memory and disk storage space. Moreover, a resource may be associated with one of an individual physical machine and a data center shared resource pool. For example, a particular resource, such as disk storage, may exist as a pooled resource generally associated with a data center and accessible by one or more physical machines within the data center, such as shared resource 122.
In cloud-based network 100, virtual machine 102 is received by routing component 104. In one embodiment, virtual machine placement generally includes two determinations: (i) a network routing determination to direct a virtual machine received at a network component to a particular target data center; and (ii) a data center routing determination to assign the received virtual machine to a particular physical machine within the target data center. In the first determination, a target data center (e.g., data center 106) is determined from a plurality of data centers based on a data center index calculation, as described in detail below. In the second determination, a designated configuration for physical machines is determined, e.g., by data center routing component 114, based on information from the received virtual machine and a configuration index calculation, also described in detail below. The virtual machine is then routed (e.g., by data center routing component 114) within the target data center to a physical machine associated with the designated configuration, such as one of physical machines 116, 118 and 120.
After the virtual machine is assigned to a particular physical machine, the virtual machine is processed utilizing the resources of the physical machine and, if necessary, shared resources (e.g., shared resource 122). The virtual machine leaves the physical machine after service is completed and releases the allocated resources, allowing the physical machine to process other virtual machines. In particular, there may be several classes of virtual machines (i.e., processing jobs), indexed by i∈I={1, . . . , I}. Class i virtual machines may arrive at a rate λi. Each class i virtual machine requires computing resources of different types when it is served, e.g., an amount aik>0 of resources k=1, . . . , K. When a class i virtual machine is placed for service (i.e., allocated a required amount of resources), its average service (i.e., mean processing) time is 1/μi. After the service is completed, resources allocated to the virtual machine are released and the virtual machine leaves the physical machine.
Cloud-based network 100 includes a plurality of data centers (i.e., data centers 106, 108, 110 and 112), which can be represented mathematically by j (e.g., DC j). As described above, data center resources (also referred to herein as resources) k=1, . . . , K, may include pooled (i.e., shared) resources k∈Kp={1, . . . , K′}, and resources localized to particular physical machines, k∈Kl={K′+1, . . . , K}, within a data center. For example, DC j may include a total amount βjk>0 of a pooled resource k∈Kp, and β*j physical machines, each of which has an amount Ajk>0 of a localized resource k∈Kl.
A class i virtual machine routed to DC j can be further routed to one of the physical machines within DC j where aik localized resources are allocated (if they are still available at that particular physical machine), and aik pooled resources are allocated (if they are still available at DC j). Therefore, a physical machine in DC j can simultaneously serve a number of different virtual machines given by a configuration vector s=(si, i=1, . . . , I) if Σi si aik≤Ajk, for all k∈Kl. Such configuration vectors will hereafter be referred to as feasible configurations for DC j. A feasible configuration s is called maximal if there is no other feasible configuration s′ such that si≤si′ for all VM types i and si<si′ for at least one i. The set of all maximal feasible configurations for DC j will be denoted by Sj. A subset Ŝj of Sj is called a reduced set of maximal feasible configurations if, for any s∈Sj, there are s1, s2, . . . , sn∈Ŝj and a set of non-negative numbers w1, w2, . . . , wn such that w1+w2+ . . . +wn=1 and s≤s1w1+s2w2+ . . . +snwn.
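As an illustration only, the feasibility condition and the definition of a maximal configuration can be sketched in Python by brute-force enumeration for a single data center (so the index j is dropped). The function names, the list-based data layout and the enumeration strategy are assumptions made for this sketch and are not part of the disclosure.

```python
from itertools import product


def is_feasible(s, a, A):
    """Check sum_i s[i] * a[i][k] <= A[k] for every localized resource k.

    s[i]    -- number of class-i virtual machines in the configuration
    a[i][k] -- amount of localized resource k needed by one class-i VM (> 0)
    A[k]    -- amount of localized resource k at one physical machine
    """
    return all(
        sum(s[i] * a[i][k] for i in range(len(s))) <= A[k]
        for k in range(len(A))
    )


def maximal_feasible_configurations(a, A):
    """Enumerate all feasible configurations and keep the maximal ones."""
    num_classes = len(a)
    # Largest count of class-i VMs that fits when no other class is present.
    bounds = [
        min(int(A[k] // a[i][k]) for k in range(len(A)))
        for i in range(num_classes)
    ]
    feasible = [
        s for s in product(*(range(b + 1) for b in bounds))
        if is_feasible(s, a, A)
    ]

    def is_dominated(s):
        # s is dominated if another feasible t is >= s in every component.
        return any(
            t != s and all(ti >= si for ti, si in zip(t, s))
            for t in feasible
        )

    return [s for s in feasible if not is_dominated(s)]
```

For instance, with two classes requiring (1, 2) and (2, 1) units of two localized resources and a physical machine offering 4 units of each, the sketch returns the configurations (0, 2), (1, 1) and (2, 0). Because (1, 1) equals the equal-weight combination of (0, 2) and (2, 0), the pair {(0, 2), (2, 0)} satisfies the definition of a reduced set in this example.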
In one embodiment, a virtual machine may require resources including disk storage, processing power (i.e., CPU) and memory, which can be indexed by k=1, 2 and 3, respectively. For example, disk storage may be a pooled resource, while processing and memory resources may be localized resources (i.e., K=3, K′=1, Kp={1}, Kl={2, 3}). As such, for DC j, physical machine utilization may be the fraction or percentage of physical machines that are non-idle within DC j, and resource utilization for each pooled resource k may be the fraction or percentage of the resource that is in use within DC j.
Therefore, by determining physical machine and resource utilization variables, virtual machine placement can be implemented with an objective of load balancing of various utilizations at a plurality of data centers within a cloud-based network. For example, a maximum of all average physical machine utilizations and all average resource utilizations across a plurality of data centers can be minimized.
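The utilization quantities described above can be sketched as follows. This is a minimal illustration that evaluates a single snapshot rather than the time averages referred to in the text; the function names and the dictionary layout are assumptions introduced for the sketch.

```python
def data_center_utilization(pm_in_use, pm_total, pooled_used, pooled_total):
    """Utilization of one data center for a single snapshot.

    pm_in_use / pm_total       -- non-idle and total physical machines
    pooled_used / pooled_total -- per-pooled-resource amounts in use and total
    """
    pm_fraction = pm_in_use / pm_total
    pooled_fractions = [u / t for u, t in zip(pooled_used, pooled_total)]
    # Utilization is the largest of the PM fraction and all pooled fractions.
    return max([pm_fraction] + pooled_fractions)


def max_utilization(data_centers):
    """Quantity to be kept small: the largest utilization over all data centers."""
    return max(data_center_utilization(**dc) for dc in data_centers)
```

A placement policy aimed at load balancing would try to keep the value returned by `max_utilization` as low as possible over time.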
In one embodiment, virtual queues may be maintained and updated, and virtual machine placement may be based on a status of virtual queues. For example, DC j may have associated virtual queues, (j, k), k∈Kp, and (j, i), i∈I, where queue length is denoted by Qjk and Qji, respectively. When a virtual machine is received, a virtual machine class (i.e., type) is determined (e.g., a class i virtual machine), and the virtual machine is routed to a particular data center, e.g., DC m. As such, an amount of virtual work aik/(βmkμi) is placed into virtual queue (m, k), such that Qmk:=Qmk+aik/(βmkμi), and an amount of virtual work 1/(β*mμi) is placed into virtual queue (m, i).
In determining a target data center for a received virtual machine, Ri may denote a subset of data centers where at least one class i virtual machine can fit into a physical machine, i.e., aik≤Ajk for all k∈Kl. In one embodiment, for each received virtual machine (e.g., a class i virtual machine), a target data center DC m is determined by a data center index,
The virtual machine is then routed (e.g., by a routing component) to DC m, and queues for DC m are updated such that Qmk:=Qmk+aik/(βmkμi), for all k∈Kp, and Qmi:=Qmi+1/(β*mμi).
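The queue update performed when a class i virtual machine is routed to DC m can be sketched as below. The function name and the nested-dictionary layout are assumptions made for illustration; the choice of m itself is taken as an input because it comes from the data center index calculation.

```python
def enqueue_virtual_work(Q_pooled, Q_class, m, i, a, beta, beta_star, mu, pooled):
    """Add virtual work to the queues of the chosen data center m.

    Q_pooled[j][k] -- virtual queue length for pooled resource k at DC j
    Q_class[j][i]  -- virtual queue length for VM class i at DC j
    a[i][k]        -- amount of resource k required by a class-i VM
    beta[j][k]     -- total amount of pooled resource k at DC j
    beta_star[j]   -- number of physical machines at DC j
    mu[i]          -- service rate of class i (mean service time is 1 / mu[i])
    pooled         -- iterable of pooled resource indices (the set Kp)
    """
    for k in pooled:
        # Q_mk := Q_mk + a_ik / (beta_mk * mu_i)
        Q_pooled[m][k] += a[i][k] / (beta[m][k] * mu[i])
    # Q_mi := Q_mi + 1 / (beta*_m * mu_i)
    Q_class[m][i] += 1.0 / (beta_star[m] * mu[i])
```

Only the queues of the selected data center change on an arrival; the queues of all other data centers are left untouched by this step.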
Next, for each DC j, a configuration
is determined, wherein Ŝj is a reduced set of feasible configurations for DC j defined earlier. If condition,
holds, then queues Qjk and Qji′ are updated such that Qjk:=max{Qjk−c, 0}, for all j and k∈Kp, and Qji′:=max{Qji′−cσji′, 0}, for all j and i∈I. Parameter c is such that c>maxi,j maxk∈Kp aik/(βjkμi) and c>maxi,j 1/(β*jμi). Here the parameter η is some sufficiently small positive number, related to other parameters. In one embodiment, the value of η can be chosen as η=g/(cJ(K′+I)), where g can be chosen as 2, 5 or 10.
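The two parameter rules stated above can be sketched as small helper functions. The function names and data layout are assumptions for illustration, and J is taken here to be the number of data centers, which the text does not define explicitly.

```python
def lower_bound_for_c(a, beta, beta_star, mu, pooled):
    """Return the value that parameter c must strictly exceed.

    Mirrors the two stated conditions:
      c > max over i, j, k in Kp of a_ik / (beta_jk * mu_i)
      c > max over i, j of 1 / (beta*_j * mu_i)
    """
    bound_pooled = max(
        (a[i][k] / (beta[j][k] * mu[i])
         for i in mu for j in beta_star for k in pooled),
        default=0.0,  # no pooled resources: only the second bound applies
    )
    bound_pm = max(1.0 / (beta_star[j] * mu[i]) for i in mu for j in beta_star)
    return max(bound_pooled, bound_pm)


def choose_eta(g, c, J, K_pooled, I):
    """eta = g / (c * J * (K' + I)); J assumed to be the number of data centers."""
    return g / (c * J * (K_pooled + I))
```

Any c strictly above the returned bound satisfies both inequalities; for example, the bound multiplied by a factor slightly greater than one could be used.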
Moreover, for each DC j, configuration usage fractions, φ̂sj, are updated such that φ̂sj:=θI(s, σj)+(1−θ) φ̂sj, for all j and s∈Ŝj, where I(s, σj)=1 if s was the configuration σj determined above and condition η
holds, and I(s, σj)=0 otherwise. In one embodiment, the configuration usage fractions φ̂sj may be utilized to assign a virtual machine to a physical machine within a target data center. Here parameter θ is some sufficiently small positive number. In one embodiment, θ=0.01.
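The usage-fraction update is an exponentially weighted moving average of the indicator I(s, σj). A minimal sketch for one data center follows; the function name and the dictionary keyed by configuration tuples are assumptions, and whether the condition holds is passed in as a boolean because it is evaluated elsewhere.

```python
def update_usage_fractions(phi, sigma, condition_holds, theta=0.01):
    """Apply phi_s := theta * I(s, sigma) + (1 - theta) * phi_s for every s.

    phi             -- dict mapping configuration tuple s to its usage fraction
    sigma           -- configuration selected for this data center
    condition_holds -- True if the update condition was satisfied
    theta           -- small positive smoothing parameter
    """
    for s in phi:
        indicator = 1.0 if (condition_holds and s == sigma) else 0.0
        phi[s] = theta * indicator + (1.0 - theta) * phi[s]
    return phi
```

A smaller θ makes the fractions react more slowly to changes in which configuration is selected, and a larger θ makes them track recent selections more closely.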
In one embodiment, a virtual machine routed to a target data center DC j is assigned to a physical machine within the target data center. Each non-empty physical machine within DC j at any given time has a designated configuration s∈Ŝj, wherein a designation s=(s1, . . . , sI) means that no more than si class i virtual machines may be placed into the physical machine. A physical machine with a designation s can be referred to as an s-physical machine (s-PM). Empty physical machines do not have a designation. A physical machine designation, once assigned, is maintained until the physical machine is empty (i.e., does not currently host any virtual machines). The total quantity zji(s) of class i virtual machines in s-PMs (within DC j) is maintained for each s∈Ŝj. In addition, the quantities φ̂sj (for DC j only) need to be known.
When DC j receives a class i virtual machine, a configuration index s′ ∈ arg min {s∈Ŝj: si>0} zji(s)/[si φ̂sj] is determined, and the virtual machine is placed into an s′-PM. For example, among s′-PMs, a physical machine with the maximal number of existing virtual machines is selected, but only such that its existing number of class i virtual machines is less than s′i (e.g., such that the new class i virtual machine can be accommodated). The class i virtual machine is then assigned to the selected physical machine. If no such s′-PM is available, the class i virtual machine is placed into an empty physical machine, which is then designated as an s′-PM.
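The assignment rule within one data center can be sketched as two steps: choosing the configuration index, then choosing a physical machine. The function names, the machine records and the small floor `eps` (used to avoid dividing by a usage fraction of zero) are assumptions of this sketch; the text does not say how a zero fraction or the absence of an empty machine is handled.

```python
def choose_configuration(i, z, phi, eps=1e-9):
    """Pick s' minimizing z(s) / (s_i * phi_s) over configurations with s_i > 0.

    z[s]   -- total number of class-i VMs currently hosted in s-PMs
    phi[s] -- configuration usage fraction for s
    eps    -- assumed floor on phi[s] to avoid division by zero
    """
    candidates = [s for s in phi if s[i] > 0]
    return min(
        candidates,
        key=lambda s: z.get(s, 0) / (s[i] * max(phi[s], eps)),
    )


def place_vm(i, s_prime, machines):
    """Place a class-i VM into an s'-PM, or designate an empty PM as s'.

    machines -- list of dicts with keys 'designation' (tuple or None when
                empty) and 'counts' (per-class number of hosted VMs)
    """
    eligible = [
        pm for pm in machines
        if pm["designation"] == s_prime and pm["counts"][i] < s_prime[i]
    ]
    if eligible:
        # Prefer the fullest eligible machine.
        target = max(eligible, key=lambda pm: sum(pm["counts"]))
    else:
        # Raises StopIteration if no empty machine exists (not handled here).
        target = next(pm for pm in machines if pm["designation"] is None)
        target["designation"] = s_prime
    target["counts"][i] += 1
    return target
```

Releasing a virtual machine and clearing the designation of a machine that becomes empty are not modeled in this sketch.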
In an alternative embodiment, information from a target data center routing determination may not be required for a physical machine assignment within the target data center. For example, each localized resource k at DC j may be assumed to be a pooled resource whose total amount βjk is equal to the total amount of the resource in all of the physical machines, e.g., βjk=β*jAjk. As such, K′=K, i.e. Kp={1, . . . , K} containing all resource types, and Kl is empty. A virtual machine then may be routed to a target data center by determining a data center index,
The virtual machine is then routed to DC m, and queues for DC m are updated such that Qmk:=Qmk+aik/(βmk μi), for all k∈Kp. Next, if condition
holds, then queues Qjk are updated such that Qjk:=max{Qjk−c, 0}, for all j and k∈Kp.
Next, at each DC j, where Kl is the set of localized resources, and given a (small) parameter η>0, a parameter c>0 and a (small) parameter θ>0, a queue Qji is updated for each class i virtual machine received at DC j such that Qji:=Qji+1/(β*jμi). A configuration,
is then determined. If condition ηΣi σjiQji≥1 is true, then Qji and φ̂sj are updated such that Qji:=max{Qji−c σji, 0}, for each i, and φ̂sj:=θI(s, σj)+(1−θ) φ̂sj, for all s∈Ŝj, where I(s, σj)=1 if s was the configuration σj determined above and condition ηΣi σjiQji≥1 is true, and I(s, σj)=0 otherwise.
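The condition-gated queue update of this alternative embodiment can be sketched for a single data center as follows. The function name and dictionary layout are assumptions, and the configuration σj is taken as an input because the rule that selects it is not reproduced in this sketch.

```python
def conditional_queue_update(Q, sigma, eta, c):
    """Decrement per-class queues if eta * sum_i sigma_i * Q_i >= 1.

    Q     -- dict mapping class i to queue length Q_ji for one data center
    sigma -- dict mapping class i to the count sigma_ji in the chosen
             configuration
    eta   -- small positive parameter
    c     -- positive decrement parameter

    Returns True if the condition held and the queues were updated.
    """
    weighted = sum(sigma[i] * Q[i] for i in Q)
    if eta * weighted >= 1.0:
        for i in Q:
            # Q_ji := max(Q_ji - c * sigma_ji, 0)
            Q[i] = max(Q[i] - c * sigma[i], 0.0)
        return True
    return False
```

The returned flag corresponds to the condition that the indicator I(s, σj) depends on, so it can be used directly when the configuration usage fractions are updated.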
Configuration index s′
is then determined. Among s′-PMs, a physical machine having a maximum number of existing virtual machines is selected, but such that the existing number of class i virtual machines is less than si (e.g., such that the received class i virtual machine can be accommodated). The class i virtual machine is then assigned to the selected physical machine. If no such s′-PM is available, the class i virtual machine is routed to an empty physical machine, which is then designated as an s′-PM.
At 204, a target data center is determined from the plurality of data centers based on a data center index calculation. For example, the data center index may be calculated as discussed above at routing component 104. The virtual machine is then routed to the target data center by routing component 104 at 206.
In response to routing the virtual machine to the target data center, a virtual queue, representing one or more virtual machines received at routing component 104, may be updated at 208. Configuration usage fractions associated with the data centers (e.g., data centers 106, 108, 110 and 112) also may be updated at 208.
At 210, a designated configuration for physical machines is determined, e.g., by data center routing component 114, based on information from the received virtual machine and a configuration index. For example, the target data center (e.g., data center 106) may include a plurality of physical machines to host one or more virtual machines. The configuration index may be calculated as discussed above externally at routing component 104 or, alternatively within a data center (e.g., data center 106) by data center routing component 114.
At 212, the virtual machine is routed (e.g., by data center routing component 114) within the target data center to a physical machine associated with the designated configuration, and the virtual machine is serviced at 214, i.e., allocated required localized resources at a physical machine and pooled resources shared within the data center.
In the various embodiments there is no need to know a priori, or explicitly measure, virtual machine receive rates, as the method can adapt automatically to changes in virtual machine receive rates. Virtual machine placement as described herein allows for the average maximum utilization to be minimized for the plurality of data centers, wherein data center utilization includes the maximum of the fraction of physical machines in use and the used fractions of all pooled resources.
Systems, apparatus, and methods described herein may be implemented using digital circuitry, or using one or more computers using well-known computer processors, memory units, storage devices, computer software, and other components. Typically, a computer includes a processor for executing instructions and one or more memories for storing instructions and data. A computer may also include, or be coupled to, one or more mass storage devices, such as one or more magnetic disks, internal hard disks and removable disks, magneto-optical disks, optical disks, etc.
Systems, apparatus, and methods described herein may be implemented using a computer program product tangibly embodied in an information carrier, e.g., in a non-transitory machine-readable storage device, for execution by a programmable processor; and the method steps described herein, including one or more of the steps of
A high-level block diagram of an exemplary computer that may be used to implement systems, apparatus and methods described herein is illustrated in
Processor 310 may include both general and special purpose microprocessors, and may be the sole processor or one of multiple processors of computer 300. Processor 310 may comprise one or more central processing units (CPUs), for example. Processor 310, data storage device 320, and/or memory 330 may include, be supplemented by, or incorporated in, one or more application-specific integrated circuits (ASICs) and/or one or more field programmable gate arrays (FPGAs).
Data storage device 320 and memory 330 each comprise a tangible non-transitory computer readable storage medium. Data storage device 320, and memory 330, may each include high-speed random access memory, such as dynamic random access memory (DRAM), static random access memory (SRAM), double data rate synchronous dynamic random access memory (DDR RAM), or other random access solid state memory devices, and may include non-volatile memory, such as one or more magnetic disk storage devices such as internal hard disks and removable disks, magneto-optical disk storage devices, optical disk storage devices, flash memory devices, semiconductor memory devices, such as erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM), digital versatile disc read-only memory (DVD-ROM) disks, or other non-volatile solid state storage devices.
Input/output devices 350 may include peripherals, such as a printer, scanner, display screen, etc. For example, input/output devices 350 may include a display device such as a cathode ray tube (CRT), plasma or liquid crystal display (LCD) monitor for displaying information to the user, a keyboard, and a pointing device such as a mouse or a trackball by which the user can provide input to computer 300.
Any or all of the systems and apparatus discussed herein, including routing component 104, data center routing component 114, and physical machines 116, 118 and 120 may be implemented using a computer such as computer 300.
One skilled in the art will recognize that an implementation of an actual computer or computer system may have other structures and may contain other components as well, and that
The foregoing Detailed Description is to be understood as being in every respect illustrative and exemplary, but not restrictive, and the scope of the invention disclosed herein is not to be determined from the Detailed Description, but rather from the claims as interpreted according to the full breadth permitted by the patent laws. It is to be understood that the embodiments shown and described herein are only illustrative of the principles of the present invention and that various modifications may be implemented by those skilled in the art without departing from the scope and spirit of the invention. Those skilled in the art could implement various other feature combinations without departing from the scope and spirit of the invention.
Number | Date | Country |
---|---|---|
20140189707 A1 | Jul 2014 | US |