This disclosure relates generally to systems and methods for mapping virtualized network elements to physical resources in a data center.
Cloud computing has become a rapidly growing industry that plays a crucial role in the Information and Communications Technology (ICT) sector. Modern data centers deploy virtualization techniques to increase operational efficiency and enable dynamic resource provisioning in response to changing application needs. A cloud computing environment provides computation capacity, networking, and storage on demand, typically through virtual networks and/or virtual machines (VMs). Multiple VMs can be hosted by a single physical server, thus increasing the utilization rate and energy efficiency of cloud computing services. Cloud service customers may lease virtual compute, network, and storage resources distributed among one or more physical infrastructure resources in data centers.
A Telco Cloud is an example of a cloud environment hosting telecommunications applications, such as IP Multimedia Subsystem (IMS), Push To Talk (PTT), Internet Protocol Television (IPTV), etc. A Telco Cloud often has a set of unique requirements in terms of Quality of Service (QoS), availability and reliability. While conventional Internet-based cloud hosting systems, such as those of Google, Amazon and Microsoft, are server-centric, a Telco Cloud is more network-centric. It contains many networking devices, and its networking architecture is often complex, with various layers and protocols. The Telco Cloud infrastructure provider may allow multiple Virtual Telecom Operators (VTOs) to share, purchase or rent physical network and compute resources of the Telco Cloud to provide telecommunications services to end-users. This business model allows the VTOs to provide their services without the costs and issues associated with owning the physical infrastructure.
Conventional networking systems utilize a distributed control plane that requires each device and every interface to be managed independently, device by device. They also rely on a complex array of network protocols. Such an architecture does not scale to operate efficiently in a Cloud, which can contain huge numbers of attached devices, isolated independent subnetworks, multiple tenants, and VMs. From a broader perspective, in order to support a larger base of consumers from around the world, infrastructure providers have recently established data centers in multiple geographical locations to equally distribute loads, provide redundancy and ensure reliability in case of site failures.
These trends suggest a different approach to the network architecture, in which the control plane logic is handled by a centralized server and the forwarding plane consists of simplified switching elements “programmed” by the centralized controller. Software Defined Networking (SDN) is a new paradigm in network architecture that introduces programmability, centralized intelligence and abstractions from the underlying network infrastructure. A network administrator can configure how a network element behaves based on data flows that can be defined across different layers of network protocols. SDN separates the intelligence needed for controlling individual network devices (e.g., routers and switches) and offloads the control mechanism to a remote controller device (often a stand-alone server or end device). An SDN approach provides complete control and flexibility in managing data flow in the network while increasing scalability and efficiency in the Cloud.
In the context of cloud computing, a “virtual slice” is composed of a number of VMs linked by dedicated flows. This definition addresses both computing and network resources involved in a slice, providing end users with the means to program, manage, and control their cloud services in a flexible way. The issue of creating virtual slices in a data center has not been completely resolved prior to the introduction of SDN mechanisms. SDN implementations to date have made use of centralized or distributed controllers to achieve architecture isolation between different customers, but without addressing the issues surrounding optimal VM location placement, optimal virtual flow mapping, and flow aggregation.
Therefore, it would be desirable to provide a system and method that obviate or mitigate the above described problems.
It is an object of the present invention to obviate or mitigate at least one disadvantage of the prior art.
In a first aspect of the present invention, there is provided a method for assigning virtual network elements to physical resources. The method comprises the steps of receiving a resource request including a plurality of virtual machines and a set of virtual flows, each of the virtual flows connecting two virtual machines in the plurality. Each virtual machine in the plurality of virtual machines is assigned to a physical server in a plurality of physical servers in accordance with at least one allocation criterion. The set of virtual flows is modified to remove a virtual flow connecting two virtual machines assigned to a single physical server. Each of the virtual flows in the modified set is assigned to a physical link.
In an embodiment of the first aspect, the allocation criteria can include maximizing a consolidation of virtual machines into physical servers. The allocation criteria can optionally include minimizing a number of virtual flows required to be assigned to physical links. The allocation criteria can further optionally include comparing a processing requirement associated with at least one of the plurality of virtual machines to an available processing capacity of at least one of the plurality of physical servers.
In another embodiment of the first aspect, the step of assigning each virtual machine in the plurality of virtual machines to a physical server in the plurality of physical servers includes sorting the physical servers in decreasing order according to server processing capacity. A first one of the physical servers can be selected in accordance with the sorted order of physical servers. In some embodiments, the virtual machines can be sorted in increasing order according to virtual machine processing requirement. A first one of the virtual machines can be selected in accordance with the sorted order of virtual machines. The selected virtual machine can then be placed on, or assigned to, the selected physical server. In some embodiments, responsive to determining that a processing requirement of the selected virtual machine is greater than an available processing capacity of the selected physical server, a second of the physical servers can be selected in accordance with the sorted order of physical servers; and the selected virtual machine can be placed on the second physical server.
In another embodiment, the removed virtual flow is assigned an entry in a forwarding table in the single physical server.
In another embodiment, responsive to determining that a bandwidth capacity of a virtual flow is greater than an available bandwidth capacity of a physical link, the virtual flow is assigned to multiple physical links. The multiple physical links can be allocated in accordance with a source physical server, a destination physical server, and the bandwidth capacity associated with the virtual flow.
In a second aspect of the present invention, there is provided a cloud management device comprising a communication interface, a processor, and a memory, the memory containing instructions executable by the processor. The cloud management device is operative to receive a resource request, at the communication interface, including a plurality of virtual machines and a set of virtual flows, each of the virtual flows connecting two virtual machines in the plurality. Each virtual machine in the plurality of virtual machines is assigned to a physical server in a plurality of physical servers in accordance with allocation criteria. The set of virtual flows is modified to remove a virtual flow connecting two virtual machines assigned to a single physical server. Each of the virtual flows in the modified set is assigned to a physical link.
In an embodiment of the second aspect, the cloud management device can transmit, at the communication interface, a mapping of the virtual machines and the virtual flows to their assigned physical resources.
In another aspect of the present invention, there is provided a data center manager comprising a compute manager module, a network controller module and a resource planner module. The compute manager module is configured for monitoring server capacity of a plurality of physical servers. The network controller module is configured for monitoring bandwidth capacity of a plurality of physical links interconnecting the plurality of physical servers. The resource planner module is configured for receiving a resource request indicating a plurality of virtual machines and a set of virtual flows; for instructing the compute manager module to instantiate each virtual machine in the plurality of virtual machines on a physical server in the plurality of physical servers in accordance with allocation criteria; for modifying the set of virtual flows to remove a virtual flow connecting two virtual machines assigned to a single physical server; and for instructing the network controller module to assign each of the virtual flows in the modified set to a physical link in the plurality of physical links.
Other aspects and features of the present invention will become apparent to those ordinarily skilled in the art upon review of the following description of specific embodiments of the invention in conjunction with the accompanying figures.
Embodiments of the present invention will now be described, by way of example only, with reference to the attached Figures, wherein:
The present disclosure is directed to systems and methods for improving the process of resource allocation, both in terms of processing and networking resources, in a cloud computing environment. Based on SDN and cloud network planning technologies, embodiments of the present invention can optimize resource allocations with respect to power consumption and greenhouse gas emissions while taking into account Telco cloud application requirements.
Reference may be made below to specific elements, numbered in accordance with the attached figures. The discussion below should be taken to be exemplary in nature, and not as limiting of the scope of the present invention. The scope of the present invention is defined in the claims, and should not be considered as limited by the implementation details described below, which as one skilled in the art will appreciate, can be modified by replacing elements with equivalent functional elements.
Along with the widespread utilization of virtual networks and VMs in data centers or networks of geographically distributed data centers, a fundamental question for cloud operators is how to allocate/relocate a large number of virtual network slices with significant aggregate bandwidth requirements while maximizing the utilization ratio of their infrastructure. A direct result of an efficient resource allocation solution is to minimize the number of idle servers and unused network links, thus optimizing the power consumption and greenhouse gas emissions of data centers.
In addition to the scalability in terms of the number of resources, a key challenge of the overall resource planning problem is to develop a component which is able to efficiently interact with the existing cloud management modules to collect information and to send commands to achieve the desired resource allocation plan. This process is preferably performed automatically, in a short interval of time, with respect to a large number of cloud customers. An efficient method for mapping virtual resources can help cloud operators increase their revenue while reducing resource and power consumption.
Embodiments of the present invention provide methods for allocating both processing and networking resources for user requests, taking into account infrastructure constraints, quality of service, and the architecture of the underlying infrastructure, as well as unique features of cloud computing environments such as resource consolidation and multipath connections.
Conventional solutions in the area of resource allocation in data centers only partially consider optimizing VM locations, virtual flow mapping and flow aggregation. Existing solutions have failed to address the problems associated with combining mapping and consolidation. Additionally, the concept of multipath forwarding has not been considered. Conventional IP routing schemes have been aimed at the "fastest path", "shortest path" or "best route". Server consolidation is a substantial factor in achieving energy efficiency in cloud computing, and multipath forwarding is a key element for increasing the scalability of a data center network.
Embodiments of the present invention will be discussed with respect to a Telco Cloud, though it will be appreciated by those skilled in the art that these may be implemented in any variety of data centers and network of data centers including, but not limited to public cloud, private cloud and hybrid cloud.
The example virtual slice 102, as can be specified and requested by a user, includes three VMs 100a-100c (each requiring 2 CPUs processing power) and two virtual flows 101a and 101b (each having a bandwidth capacity of 2 Gbps). The virtual flows 101a and 101b represent communication links that are required between the requested VMs. Virtual flow 101a is shown linking VM 100a to VM 100c and virtual flow 101b links VMs 100b and 100c.
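The structure of such a request can be illustrated with a short sketch. The class and field names below are illustrative only (they do not appear in the disclosure); the sketch simply models a virtual slice as a set of VMs plus a set of flows, populated with the example values above:

```python
from dataclasses import dataclass


@dataclass
class VirtualMachine:
    name: str
    cpus: int  # required processing capacity


@dataclass
class VirtualFlow:
    src: str   # name of the source VM
    dst: str   # name of the destination VM
    bandwidth_gbps: float


@dataclass
class VirtualSlice:
    vms: list
    flows: list


# The example slice 102: three VMs of 2 CPUs each,
# linked by two 2 Gbps virtual flows.
slice_102 = VirtualSlice(
    vms=[VirtualMachine("100a", 2),
         VirtualMachine("100b", 2),
         VirtualMachine("100c", 2)],
    flows=[VirtualFlow("100a", "100c", 2.0),
           VirtualFlow("100b", "100c", 2.0)],
)
```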
It should be noted that in the example of
Telecommunication applications are often composed of multiple components with a high degree of interdependence between these components. For example, an IP Multimedia Subsystem (IMS) involves Call Session Control Function (CSCF) proxies, Home Subscriber Server (HSS) databases, and several gateways. Continuous interactions among these components are established to provide end-to-end services to users, such as peer messaging, voice, and video streaming. When such an IMS system is deployed in a virtualized data center, a set of VMs and flows between those VMs (defined as a virtual slice) is required.
The Telco Cloud is managed and controlled by a middleware providing networking and computing functions, such as virtual network definition, VM creation, and removal. For example, OpenStack can be deployed to control the Telco Cloud.
A Cloud Resource Planner module 301 is a virtual resource planning entity that interfaces with the Network Controller 302 and the Compute Manager 303 in the data center to collect data of the Cloud network and compute resources. Taking into account multipath connection and consolidation features of server virtualization, the Cloud Resource Planner 301 can compute optimized resource allocation plans with respect to dynamic user requests in terms of network flows and virtual machine capacity, helping a cloud operator improve performance, scalability and energy efficiency. The Cloud Resource Planner 301 module can be implemented and executed as a pluggable component to the data center middleware.
Using the network report 304 and the server report 306, sent respectively by the Network Controller 302 and Compute Manager 303 modules, the Cloud Resource Planner module 301 can compute an optimized resource allocation plan, and then send commands 305 and 307 back to the Network Controller 302 and Compute Manager 303 in order to allocate physical resources for VMs and virtual flows.
When a plan for server consolidation is found, the process moves to block 356, where a flow assignment algorithm is run. The flow assignment algorithm aims to build an optimal plan for link allocation between the VMs assigned to servers in block 353. In block 357 it is determined if all flows have been mapped to physical links. If no, the user request is determined to be unresolvable (block 355). If yes, an optimized mapping plan has been determined and can be output (block 358).
In block 506 it is determined if server i has enough capacity to host the VM j. This can be determined by comparing the available capacity of Server i to the required capacity of VM j. If yes, a mapping of VM j to Server i will be defined (block 507), and counter j will be incremented. Otherwise, counter i will be incremented and the next server (e.g. Server i+1) in the list will be used (block 508) when the process returns to block 504. The process ends in block 509 when it is determined that all VMs are mapped (e.g. counter j=M) to a physical server. The process can also end in block 510 if no suitable mapping plan can be determined (e.g. if there is insufficient available server capacity to host all requested VMs).
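The counter-based mapping loop of blocks 504 through 510 can be sketched as follows. This is a simplified illustration, not the claimed method itself: servers are sorted in decreasing order of capacity, VMs in increasing order of requirement, and the server counter only ever advances (block 508), which is consistent with the sorted orders. The function name and dict-based data model are assumptions for illustration:

```python
def assign_vms(servers, vms):
    """Map each VM to a physical server.

    `servers` maps a server name to its free CPU capacity; `vms` maps a
    VM name to its required CPU capacity.  Returns {vm: server}, or None
    when the request is unresolvable (block 510).
    """
    server_order = sorted(servers, key=lambda s: servers[s], reverse=True)
    vm_order = sorted(vms, key=lambda v: vms[v])
    free = dict(servers)
    mapping = {}
    i = 0  # server counter; never decremented (block 508)
    for vm in vm_order:
        # Advance to the first server with enough remaining capacity
        # (the comparison of block 506).
        while i < len(server_order) and free[server_order[i]] < vms[vm]:
            i += 1
        if i == len(server_order):
            return None  # insufficient aggregate server capacity
        srv = server_order[i]
        mapping[vm] = srv       # block 507: define the mapping
        free[srv] -= vms[vm]
    return mapping
```

Because the VMs are visited smallest first, a server that cannot host the current VM cannot host any later one, so advancing the server counter without backtracking still yields maximal consolidation onto the highest-capacity servers.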
Starting from the source node of the smallest flow (e.g. the flow with the lowest bandwidth requirement, i=0), a Depth First Search (DFS) algorithm will be executed to select intermediate switches (block 412). The DFS algorithm is executed starting from the source edge switch, then goes upstream (block 416). At each intermediate node, the algorithm will try to allocate physical links with the total bandwidth capacity being best-fit to the virtual flow requirement (block 417). If the sum of the bandwidth of all of the physical links does not meet the requirement (block 418), the algorithm backtracks to the previous (e.g. upstream) node (block 419). This step is looped until either the destination node (block 413) or the source node (block 414) is reached. If the algorithm returns back to the source node (in block 414), the problem is unsolvable and the user request is determined to be unresolvable (block 621). If the destination node is reached (in block 413), the counter i is incremented (block 415) and the algorithm will attempt to map the next virtual flow in the list. The process continues iteratively until it is determined that all flows have been mapped (block 411) and a mapping plan for virtual flows to physical links can be output (block 420).
Those skilled in the art will appreciate that Depth First Search is an exemplary searching algorithm starting at a root node and exploring as far as possible along each branch before backtracking. Other optimization algorithms can be used for optimally mapping virtual flows to physical links without departing from the scope of the present invention. As described above, if it is determined that a single physical path does not meet the bandwidth required for a virtual flow, a multipath solution composed of multiple physical links will be allocated for the flow.
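The backtracking search can be sketched in simplified form. The sketch below is a single-path variant only, under assumed names and a simple adjacency-dict topology model; the full algorithm described above would additionally aggregate several parallel links at each hop (blocks 417-418) when no single link satisfies the demand:

```python
def dfs_map_flow(graph, src, dst, demand, path=None):
    """Depth-first search for a physical path with at least `demand`
    residual bandwidth on every hop.

    `graph` maps a node to {neighbour: residual_bandwidth}.  Returns the
    node list of a feasible path, or None, which triggers backtracking
    at the caller as in block 419.
    """
    if path is None:
        path = [src]
    if src == dst:
        return path  # destination reached (block 413)
    for nxt, residual in graph.get(src, {}).items():
        if nxt in path or residual < demand:
            continue  # avoid loops and skip saturated links
        found = dfs_map_flow(graph, nxt, dst, demand, path + [nxt])
        if found:
            return found
    return None  # backtrack: no feasible continuation from this node
```

A flow of 2 units from edge switch "A" to "B" would, for instance, prefer an intermediate switch whose downstream link still has 2 units of residual capacity, and back off from branches that are saturated.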
In an optional embodiment, block 710 can include the steps of sorting the physical servers in decreasing order according to their respective server processing capacity, and selecting a first one of the physical servers in accordance with the sorted order of physical servers. The VMs are sorted in increasing order according to their respective processing requirement, and a first one of the virtual machines is selected in accordance with the sorted order of virtual machines. The selected virtual machine is then placed on, or assigned to, the selected physical server. If it is determined that the processing requirement of the selected virtual machine is greater than the available processing capacity of the selected physical server, a second of the physical servers is selected in accordance with the sorted order of physical servers. The selected virtual machine is then assigned to the second physical server.
Following the assignment of the VMs to physical servers, a virtual flow that connects two VMs assigned to a common, single physical server can be identified and removed from the set of virtual flows (block 720). The set of virtual flows needing to be mapped to physical resources can be modified by eliminating all flows connecting VMs assigned to the same physical server. Optionally, a virtual flow that is identified and removed from the set can be added as an entry in a forwarding table in the physical server hosting the connected VMs. A virtual switch (vSwitch) can be provided in the physical server to provide communication between VMs hosted on that server. The vSwitch can include a forwarding table to enable such communication.
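The pruning step of block 720 can be sketched as a simple partition of the requested flows; the function name and tuple-based flow representation are assumptions for illustration:

```python
def prune_local_flows(flows, placement):
    """Split requested flows into those that must be mapped to physical
    links and those internal to one server.

    `flows` is a list of (src_vm, dst_vm, bandwidth) tuples; `placement`
    maps a VM name to its assigned server.  Flows in `local` need no
    physical link: they can be handled by a forwarding-table entry in
    the hosting server's vSwitch.
    """
    remote, local = [], []
    for src, dst, bw in flows:
        if placement[src] == placement[dst]:
            local.append((src, dst, bw))   # same server: vSwitch forwarding
        else:
            remote.append((src, dst, bw))  # must be mapped to physical links
    return remote, local
```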
Each of the remaining virtual flows in the modified set can then be assigned to a physical link connecting the physical servers to which the VMs associated with the virtual flow have been assigned (block 730). A physical link can be a route composed of multiple sub-links, providing a communication path between the source physical server and destination physical server hosting the VMs.
Optionally, in block 730, it may be determined that a bandwidth requirement of a virtual flow is greater than the available bandwidth capacity of a single physical link. Such a virtual flow can be assigned to two or more physical links between the required source and destination servers in order to satisfy the requested bandwidth requirement. The physical links can encompass connection paths directly between servers, as well as connections that pass through switching elements to route communication between physical servers. A multipathing algorithm can be used to determine the two or more physical links to be assigned a virtual flow.
The modified set of virtual flows can be sorted in increasing order in accordance with their respective bandwidth capacity requirements. A first of the virtual flows can be selected in accordance with the sorted order of virtual flows. A first physical link is allocated in accordance with a source physical server and a destination physical server associated with the virtual flow. The source and destination physical servers are the servers to which the virtual machines connected by the selected virtual flow have been assigned. The first physical link can also be allocated in accordance with the bandwidth capacity requirement of the selected virtual flow. A second physical link can be allocated to meet the bandwidth capacity requirement of the selected virtual flow. Following the assignment of the first selected virtual flow to one or more physical links, a second of the virtual flows can be selected in accordance with the sorted order. The process can continue until all of the virtual flows in the modified set have been assigned to physical links.
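The loop above can be sketched as a greedy assignment. This is a simplified illustration under assumed names: physical links are modeled as direct server-to-server links with residual capacity (paths through switching elements are abstracted away), and when no single link can carry a flow, parallel links are combined to form the multipath allocation:

```python
def assign_flows(flows, links, placement):
    """Greedy sketch of the flow-assignment loop.

    `flows` is a list of (src_vm, dst_vm, bandwidth); `links` maps a
    link id to [server_a, server_b, residual_bandwidth]; `placement`
    maps a VM name to its server.  Flows are handled smallest first;
    residual capacities in `links` are decremented as links are taken.
    Returns {(src_vm, dst_vm): [link ids]}, or None when unresolvable.
    """
    assignment = {}
    for src, dst, bw in sorted(flows, key=lambda f: f[2]):
        a, b = placement[src], placement[dst]
        # Candidate links between the two servers with spare capacity.
        candidates = [l for l, (x, y, _) in links.items()
                      if {x, y} == {a, b} and links[l][2] > 0]
        # Largest residual first, so a single link is used when possible.
        candidates.sort(key=lambda l: links[l][2], reverse=True)
        chosen, remaining = [], bw
        for l in candidates:
            if remaining <= 0:
                break
            take = min(links[l][2], remaining)
            chosen.append(l)
            links[l][2] -= take
            remaining -= take
        if remaining > 0:
            return None  # not enough aggregate bandwidth between a and b
        assignment[(src, dst)] = chosen
    return assignment
```

For example, a 3 Gbps flow between two servers joined by a 2 Gbps link and a 1 Gbps link would be split across both links, consistent with the multipath behavior described above.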
The communication interface 806 is configured to send and receive messages. The communication interface 806 receives a request for virtualized resources, including a plurality of VMs and a set of virtual flows indicating a connection between two of the VMs in the plurality. The communication interface 806 can also receive a list of a plurality of physical servers and physical links connecting the physical servers which are available for hosting the virtualized resources. The processor 802 assigns each VM in the plurality to a physical server selected from the plurality of servers in accordance with an allocation criterion. The processor 802 modifies the set of virtual flows to remove any virtual flows linking two VMs which have been assigned to a single physical server. The processor 802 assigns each of the virtual flows in the modified set to a physical link. The processor 802 may determine that a bandwidth of a requested virtual flow is greater than the available bandwidth capacity of any physical link. The processor 802 can assign the virtual flow to multiple physical links to meet the bandwidth requested. When all requested virtual resources have been assigned, the communication interface 806 can transmit a mapping of the virtual resources to their assigned physical resources.
Embodiments of the invention may be represented as a software product stored in a machine-readable medium (also referred to as a computer-readable medium, a processor-readable medium, or a computer usable medium having a computer readable program code embodied therein). The machine-readable medium may be any suitable tangible medium including a magnetic, optical, or electrical storage medium including a diskette, compact disk read only memory (CD-ROM), digital versatile disc read only memory (DVD-ROM) memory device (volatile or non-volatile), or similar storage mechanism. The machine-readable medium may contain various sets of instructions, code sequences, configuration information, or other data, which, when executed, cause a processor to perform steps in a method according to an embodiment of the invention. Those of ordinary skill in the art will appreciate that other instructions and operations necessary to implement the described invention may also be stored on the machine-readable medium. Software running from the machine-readable medium may interface with circuitry to perform the described tasks.
The above-described embodiments of the present invention are intended to be examples only. Alterations, modifications and variations may be effected to the particular embodiments by those of skill in the art without departing from the scope of the invention, which is defined solely by the claims appended hereto.