The networking industry is working on solutions and technologies for network virtualization. Network virtualization allows the deployment of “virtual networks,” which are logical abstractions of physical networks. A virtual network can provide Layer 2 (L2) or Layer 3 (L3) network services to a set of “tenant systems.” (“Layer 2” and “Layer 3” here refer to layers in the well-known Open Systems Interconnection (OSI) model.)
Virtual networks, which may also be referred to as Closed User Groups, are a key enabler for “virtual data centers,” which provide virtualized computing, storage, and network services to a “tenant.” A virtual data center is associated with a single tenant, thus isolating each tenant's computing and traffic, and can contain multiple virtual networks and tenant systems connected to these virtual networks.
Multiple standardization organizations are involved in the development of solutions for network virtualization, including groups known as OpenStack, ONF (Open Networking Foundation), the Internet Engineering Task Force (IETF), etc. In the IETF, these activities are taking place in the NVO3 working group, which has defined a network virtualization overlay framework. An IETF document, “Framework for DC Network Virtualization,” (referred to hereinafter as “NVO3 Framework”), describes this framework and may be found at http://tools.ietf.org/html/draft-ietf-nvo3-framework-03 (last accessed November 2014). Another IETF document, “An Architecture for Overlay Networks (NVO3),” (referred to hereinafter as “NVO3 Architecture”), provides a high-level overview architecture for building overlay networks in NVO3, and may be found at http://tools.ietf.org/html/draft-narten-nvo3-arch-00 (last accessed November 2014). This document generally adopts the terminology used and defined in the NVO3 Framework and NVO3 Architecture documents. However, it should be appreciated that the terminology may change as solutions are developed and deployed. Thus, the use herein of terms that are particular to the NVO3 Framework as currently defined should be understood as referring more generally to the functionality, apparatus, etc., that correspond to each term. Definitions for many of these terms may be found in the NVO3 Framework and NVO3 Architecture documents. It should be further appreciated that the techniques, apparatus, and solutions described herein are not necessarily limited to systems and/or solutions that comply with present or future IETF documents, but are more generally applicable to systems and solutions that have corresponding or similar components, functionalities, and features, to the extent that those components, functionalities, and features are relevant to the techniques and solutions described below.
The NVO3 working group (WG) was created early in 2012. The goal of the WG is to develop the multi-tenancy solutions for data centers (DCs), especially in the context of data centers supporting virtualized hosts known as virtual machines (VMs). An NVO3 solution (known here as a Data Center Virtual Private Network (DCVPN)) is a virtual private network (VPN) that is viable across a scaling range of a few thousand VMs to several million VMs, running on as many as one hundred thousand or more physical servers. NVO3 solutions have good scaling properties, from relatively small networks to networks with several million DCVPN endpoints and hundreds of thousands of DCVPNs within a single administrative domain. A DCVPN also supports VM migration between physical servers in a sub-second timeframe, and supports connectivity to traditional hosts.
The NVO3 WG will consider approaches to multi-tenancy that reside at the network layer, rather than using traditional isolation mechanisms that rely on the underlying layer 2 technology (e.g., VLANs). The NVO3 WG will determine the types of connectivity services that are needed by typical DC deployments (for example, IP and/or Ethernet).
Currently, the NVO3 WG is working on the DC framework, the requirements for both control plane protocol(s) and data plane encapsulation format(s), and a gap analysis of existing candidate mechanisms. In addition to functional and architectural requirements, the NVO3 WG will develop management, operational, maintenance, troubleshooting, security and OAM protocol requirements. The NVO3 WG will investigate the interconnection of the Data Center VPNs and their tenants with non-NVO3 IP network(s) to determine if any specific work is needed.
In this document, the IETF NVO3 framework is used as a base of telecom-cloud network discussion. However, the techniques described herein may be understood more generally, i.e., without the limitation of network virtualization overlay based on layer 3.
So far, the scope of the NVO3 WG efforts is limited to documenting a problem statement, the applicability, and an architectural framework for DCVPNs within a data center environment. NVO3 WG will develop requirements for both control plane protocol(s) and data plane encapsulation format(s) for intra-DC and inter-DC connectivity, as well as management, operational, maintenance, troubleshooting, security and OAM protocol requirements.
As noted above, in the NVO3 architecture, a Network Virtualization Authority (NVA) 130 is a network entity that provides reachability and forwarding information to NVEs 120. An NVA 130 is also known as a controller. A Tenant System can be attached to a Network Virtualization Edge (NVE) 120, either locally or remotely. The NVE 120 may be capable of providing L2 and/or L3 service, where an L2 NVE 120 provides Ethernet LAN-like service and an L3 NVE 120 provides IP/VRF-like service.
The NVE 120 handles the network virtualization functions that allow for L2 and/or L3 tenant separation and for hiding tenant addressing information (MAC and IP addresses), tenant-related control plane activity and service contexts from the underlay nodes. NVE components may be used to provide different types of virtualized network services. The NVO3 architecture allows IP encapsulation or MPLS encapsulation. However, both L2 and L3 services can be supported.
According to the latest IETF discussions, it is recommended to have the NVE function embedded in a hypervisor, while co-locating the NVA with the VM orchestration. With these recommendations, it is not necessary to have NVE-NVE control signaling. The address mapping table used by the NVE 120 can be configured by the NVA 130. Goals of designing a NVA-NVE control protocol are to eliminate user plane flooding and to avoid an NVE-NVE control protocol. The NVEs 120 can use any encapsulation solution for the data plane tunneling.
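The address mapping table mentioned above can be pictured with a minimal sketch. It assumes a simple dictionary keyed by VN identifier and inner (tenant) address, mapping to the outer underlay address of the serving NVE. The names `AddressMappingTable`, `configure`, and `lookup` are hypothetical and do not come from any IETF document or from the NVA-NVE protocol itself.

```python
from typing import Dict, Optional, Tuple


class AddressMappingTable:
    """Inner-to-outer address mapping held by an NVE (illustrative only)."""

    def __init__(self) -> None:
        # (vn_id, inner tenant address) -> outer underlay IP of the serving NVE
        self._entries: Dict[Tuple[int, str], str] = {}

    def configure(self, vn_id: int, inner_addr: str, outer_nve_ip: str) -> None:
        """Install an entry on behalf of the NVA; nothing is learned by flooding."""
        self._entries[(vn_id, inner_addr)] = outer_nve_ip

    def lookup(self, vn_id: int, inner_addr: str) -> Optional[str]:
        """Return the tunnel endpoint for a tenant address, or None if unknown."""
        return self._entries.get((vn_id, inner_addr))


table = AddressMappingTable()
table.configure(vn_id=100, inner_addr="10.0.0.5", outer_nve_ip="192.0.2.10")
print(table.lookup(100, "10.0.0.5"))   # outer address used for tunneling
print(table.lookup(100, "10.0.0.99"))  # unknown destination yields None
```

Because every entry is pushed by the NVA in this sketch, the NVE needs neither user plane flooding nor an NVE-NVE control protocol to resolve a destination.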
As discussed above, an NVE 120 is the network entity that sits at the edge of an underlay network and implements L2 and/or L3 network virtualization functions. The network-facing side of the NVE 120 uses the underlying L3 network to tunnel frames to and from other NVEs 120. The tenant-facing side of the NVE sends and receives Ethernet frames to and from individual Tenant Systems 110. An NVE 120 can be implemented as part of a virtual switch within a hypervisor, a physical switch or router, a Network Service Appliance, or can be split across multiple devices.
A Virtual Network (VN) is a logical abstraction of a physical network that provides L2 or L3 network services to a set of Tenant Systems. A VN is also known as a Closed User Group (CUG). Virtual Network Instance (VNI) is a specific instance of a VN.
While progress has been made in the NVO3 WG, detailed solutions for network virtualization overlays are needed. In particular, solutions that enable load balancing are needed.
According to several of the techniques disclosed herein and detailed below, a load balancing (LB) function is integrated into an NVE function. This LB function, residing in the NVE, is configured by an NVA over a new NVA-NVE protocol. The NVA can thus enable or disable the LB function for a given VN in a specific NVE. The NVE shall be configured with an LB address, which is either an IP address or a MAC address, for LB traffic distribution. Different LB factors, LB algorithms, etc., can be applied, based on the needs.
When the LB function is enabled or disabled in the NVE, the NVA shall update the inner-outer address mapping in the remote NVEs in order to allow the LB traffic to be sent to the LB-enabled NVE. Upon VM mobility, the NVA shall disable the LB function in the old NVE and enable the LB function in the new NVE. The NVA shall also update the remote NVE to redirect LB traffic to the right NVE.
Supporting an integrated LB function in the NVO3 architecture allows the NVE to provide more flexibility when configuring a NVO3 network. When detecting a duplicated address error, the NVE will not be confused, as it has the knowledge why the duplicated addresses are configured.
Several of the methods disclosed herein are suitable for implementation in an NVA in a network virtualization overlay. According to one example method, the NVA receives Virtual Machine (VM) configuration information from a VM orchestration system. Based on this information, the NVA configures an attached NVE (a “first” NVE) to enable Load Balancing (LB), by sending an LB enable message to the NVE. The NVA subsequently receives a confirmation message from the NVE, indicating that the LB function in the NVE is enabled. The NVA then updates remote NVEs, allowing LB traffic to be sent to the first NVE.
According to another method, an NVA in a network virtualization overlay determines that the LB function should be disabled in a first NVE. The NVA configures the NVE to disable the LB function, by sending an LB disable message to the NVE. After receiving confirmation from the NVE that the LB function is disabled, the NVA updates remote NVEs to disallow sending of LB traffic to the first NVE.
According to another method, an NVA in a network virtualization overlay determines, for example, that VM mobility is needed. The NVA configures an “old” NVE, which is currently handling an LB function, to disable the LB function, by sending an LB disable message to the old NVE. After receiving confirmation from the old NVE, the NVA configures a “new” NVE to enable the LB function, by sending an LB enable message to the new NVE. After receiving confirmation from the new NVE that the LB function is enabled, the NVA updates remote NVEs to redirect LB traffic to the new NVE.
Corresponding methods are carried out in NVEs configured according to the presently disclosed techniques. In an example method, an NVE in a network virtualization overlay receives an LB enable message from an NVA. The NVE enables the LB function, and confirms this enabling by sending a confirmation message to the NVA. Subsequently, the NVE receives incoming packets with a LB address (e.g., an LB IP address). The NVE uses the LB address to find the appropriate virtual network (VN) context, from which it determines a specified LB algorithm. The NVE obtains a VM MAC address for each packet, based on the LB algorithm, and forwards the packets according to the VM MAC addresses.
Variants of these methods, as well as corresponding apparatus, are disclosed in detail in the discussion that follows.
In the following, specific details of particular embodiments of the presently disclosed techniques and apparatus are set forth for purposes of explanation and not limitation. It will be appreciated by those skilled in the art that other embodiments may be employed apart from these specific details. Furthermore, in some instances detailed descriptions of well-known methods, nodes, interfaces, circuits, and devices are omitted so as not to obscure the description with unnecessary detail. Those skilled in the art will appreciate that the functions described may be implemented in one or in several nodes.
Some or all of the functions described may be implemented using hardware circuitry, such as analog and/or discrete logic gates interconnected to perform a specialized function, ASICs, PLAs, etc. Likewise, some or all of the functions may be implemented using software programs and data in conjunction with one or more digital microprocessors or general purpose computers. Moreover, the technology can additionally be considered to be embodied entirely within any form of computer-readable memory, including non-transitory embodiments such as solid-state memory, magnetic disk, or optical disk containing an appropriate set of computer instructions that would cause a processor to carry out the techniques described herein.
Hardware implementations may include or encompass, without limitation, digital signal processor (DSP) hardware, a reduced instruction set processor, hardware (e.g., digital or analog) circuitry including but not limited to application specific integrated circuit(s) (ASIC) and/or field programmable gate array(s) (FPGA(s)), and (where appropriate) state machines capable of performing such functions.
In terms of computer implementation, a computer is generally understood to comprise one or more processors or one or more controllers, and the terms computer, processor, and controller may be employed interchangeably. When provided by a computer, processor, or controller, the functions may be provided by a single dedicated computer or processor or controller, by a single shared computer or processor or controller, or by a plurality of individual computers or processors or controllers, some of which may be shared or distributed. Moreover, the term “processor” or “controller” also refers to other hardware capable of performing such functions and/or executing software, such as the example hardware recited above.
References throughout the specification to “one embodiment” or “an embodiment” mean that a particular feature, structure, or characteristic described in connection with an embodiment is included in at least one embodiment of the present invention. Thus, the appearance of the phrases “in one embodiment” or “in an embodiment” in various places throughout the specification are not necessarily all referring to the same embodiment. Further, the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments.
With the NVO3 architecture described in the NVO3 Architecture document discussed above, a NVA-NVE control plane protocol is needed for NVE configuration and notifications. Another Hypervisor-NVE control plane protocol is also needed for notifications of VN connection and disconnection, as well as for notifications of virtual network interface card (vNIC) association and disassociation.
According to ongoing IETF discussions, it has also been identified that error handling shall be supported by the NVE, including duplicated address detection. It is possible that multiple tenant systems of a given virtual network have been misconfigured with the same address by the VM orchestration system. If the two tenant systems are located in different hypervisors under the same NVE, the hypervisors may not be able to detect this error. As a result, the vNIC association notifications will be sent to the attached NVE. When the NVE receives the vNIC association notifications, it shall verify the received information against the vNIC table of the VN context. If the same vNIC address is found, the misconfiguration can be detected.
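The check against the vNIC table can be sketched as follows. This is a minimal illustration that assumes the vNIC table of a VN context is a dictionary from address to the identifier of the vNIC holding it; the function name `check_vnic_association` and the identifiers used are hypothetical.

```python
from typing import Dict


def check_vnic_association(vnic_table: Dict[str, str], vnic_id: str, address: str) -> bool:
    """Record a vNIC association if the address is unused in this VN context.

    Returns False when a different vNIC already holds the address, which the
    NVE would treat as a suspected misconfiguration.
    """
    owner = vnic_table.get(address)
    if owner is not None and owner != vnic_id:
        return False
    vnic_table[address] = vnic_id
    return True


vn_vnic_table: Dict[str, str] = {}
print(check_vnic_association(vn_vnic_table, "vnic-a", "10.0.0.5"))  # first use of the address
print(check_vnic_association(vn_vnic_table, "vnic-b", "10.0.0.5"))  # same address, different vNIC
```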
However, there is at least one exception case that causes a problem with this approach. A load balance (LB) function may be enabled in the virtual network, where the same address is configured for multiple network devices or VMs on purpose. In a cloud network, the LB function is normally provided by the VM function, e.g., a VM running an LB function to distribute the data traffic to different network servers. Alternatively, the LB function can be supported at the data center (DC) fabric network. In either case, it is not possible for the NVE to detect whether the duplicated address results from misconfiguration or from the LB function.
Other problems may arise from providing the LB function in a VM or at the DC fabric network. For instance, the DC fabric network can only apply the LB function on the tunneled VM data traffic, or on the un-tunneled traffic using a specific network device. Performing the LB on the un-tunneled packets using a VM or network device will reduce the NVO3 network performance. Performing the LB on the tunneled packets only provides LB between NVEs. It cannot support LB on the tenant system data traffic in many cases, e.g., when data traffic is encrypted. Still further, there is not enough flexibility when configuring the network, since the LB function does not fit into the NVO3 architecture.
The techniques, apparatus, and solutions described herein allow the NVE to have a load balance function enabled in an NVO3 architecture. According to several of these techniques, the LB function is integrated into the NVE function. The LB function, residing in the NVE, shall be configured by the NVA over a new NVA-NVE protocol. The NVA can thus enable or disable the LB function for a given VN in a specific NVE. The NVE shall be configured with a LB address, which is either an IP address or a MAC address, for LB traffic distribution. Different LB factors, LB algorithms, etc., can be applied, based on the needs.
When the LB function is enabled or disabled in the NVE, the NVA shall update the inner-outer address mapping in the remote NVEs in order to allow the LB traffic to be sent to the LB enabled NVE. Upon VM mobility, the NVA shall disable the LB function in the old NVE and enable the LB function in the new NVE. The NVA shall also update the remote NVE to redirect LB traffic to the right NVE.
Supporting an integrated LB function in the NVO3 architecture allows the NVE to provide more flexibility when configuring a NVO3 network. This approach allows the LB function to be enabled or disabled by the NVA in an NVO3 architecture. The integrated LB function allows the NVE to handle the LB function more easily. Furthermore, when detecting a duplicated address error, the NVE will not be confused, as it has the knowledge as to why the duplicated addresses are configured, and can properly report misconfigured duplicated address errors to the NVA.
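One way to picture how the integrated LB function removes the ambiguity is the sketch below. It assumes the VN context keeps a set of addresses for which the NVA has enabled LB, so that a repeated address is accepted only when it is a known LB address and is otherwise queued as an error for the NVA. The class `VnContext` and its field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class VnContext:
    """Per-VN state in an NVE (illustrative only)."""

    vnic_table: Dict[str, Set[str]] = field(default_factory=dict)  # address -> vNIC ids
    lb_addresses: Set[str] = field(default_factory=set)            # LB enabled by the NVA
    errors_for_nva: List[str] = field(default_factory=list)        # pending error reports

    def associate_vnic(self, vnic_id: str, address: str) -> bool:
        owners = self.vnic_table.setdefault(address, set())
        is_duplicate = bool(owners - {vnic_id})
        if is_duplicate and address not in self.lb_addresses:
            # Duplicate without LB: treated as a misconfiguration to report.
            self.errors_for_nva.append(f"duplicate address {address} for vNIC {vnic_id}")
            return False
        # Either a new address or an intentional duplicate behind the LB function.
        owners.add(vnic_id)
        return True


ctx = VnContext(lb_addresses={"10.0.0.100"})
print(ctx.associate_vnic("vnic-a", "10.0.0.100"))  # LB address, accepted
print(ctx.associate_vnic("vnic-b", "10.0.0.100"))  # intentional duplicate, accepted
print(ctx.associate_vnic("vnic-c", "10.0.0.7"))    # ordinary address, accepted
print(ctx.associate_vnic("vnic-d", "10.0.0.7"))    # unintended duplicate, rejected
```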
Following are specific procedures for enabling and disabling an LB function in an NVE configured according to the presently disclosed techniques.
Following are assumptions and steps for enabling the LB function in a NVO3 network that includes NVEs and an NVA configured according to the presently disclosed technique. Reference is made to
It is assumed that the Hypervisor/vSwitch is always configured by the VM Orchestration System. The VM Orchestration System configures the Hypervisor with two or more VMs with the same address. Thus, the NVA receives the VMs' configuration from the VM Orchestration System, as shown at block 210.
As shown at block 220, the NVA configures the attached NVE with the new LB enable message via the NVA-NVE control plane protocol. In response, the NVE confirms to the NVA that the configuration is accepted and the LB function is enabled accordingly. Thus, as shown at block 230, the NVA receives, from the NVE, confirmation that the LB function is enabled. Subsequently, the NVA updates the remote NVEs to allow the LB traffic to be sent to the LB enabled NVE, as shown at block 240.
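The enable sequence of blocks 220 through 240 can be sketched as follows. The sketch replaces the NVA-NVE control plane with direct method calls on a stand-in object, and it takes the LB address as an input rather than deriving it from the VM configuration received at block 210. `StubNve`, `on_lb_enable`, and `nva_enable_lb` are hypothetical names.

```python
from typing import Dict, List, Set


class StubNve:
    """Stand-in for an NVE reached over the NVA-NVE control plane (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.lb_addresses: Set[str] = set()
        self.mapping: Dict[str, str] = {}  # inner address -> name of the serving NVE

    def on_lb_enable(self, lb_address: str) -> bool:
        """Enable LB for the address and return a confirmation."""
        self.lb_addresses.add(lb_address)
        return True


def nva_enable_lb(lb_address: str, attached: StubNve, remotes: List[StubNve]) -> bool:
    # Block 220: send the LB enable message to the attached NVE.
    confirmed = attached.on_lb_enable(lb_address)
    # Block 230: proceed only once the NVE has confirmed.
    if not confirmed:
        return False
    # Block 240: point the remote NVEs at the LB-enabled NVE.
    for nve in remotes:
        nve.mapping[lb_address] = attached.name
    return True


first = StubNve("nve-1")
others = [StubNve("nve-2"), StubNve("nve-3")]
print(nva_enable_lb("10.0.0.100", first, others))
print(others[0].mapping)
```

Updating the remote NVEs only after the confirmation arrives keeps LB traffic from being tunneled to an NVE that has not yet accepted the configuration.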
Following are assumptions and steps for disabling the LB function in a NVO3 network that includes NVEs and an NVA configured according to the presently disclosed techniques. Reference is made to
As a starting point for the method illustrated in
As shown at block 310, the NVA configures the attached NVE (the first NVE), using a new LB disable message via the NVA-NVE control plane protocol. As shown at block 320, the NVE confirms to the NVA that the indicated LB is disabled accordingly. As shown at block 330, the NVA updates the remote NVEs to disallow the LB traffic to be sent to the NVE.
Following are assumptions and steps for re-enabling the LB function at VM mobility in a NVO3 network that includes NVEs and an NVA configured according to the presently disclosed techniques. Reference is made to
As a starting point for the method illustrated in
As shown at block 410, the NVA configures the first NVE (the “old” NVE), using a new LB disable message via the NVA-NVE control plane protocol. As shown at block 420, the first NVE confirms to the NVA that the indicated LB is disabled accordingly.
As shown at block 430, the NVA configures the second NVE, i.e., the “new” NVE, with an LB enable message, via the NVA-NVE control plane protocol. The new NVE confirms to the NVA that the configuration is accepted and the LB function is enabled accordingly, as shown at block 440. The NVA updates the remote NVEs to redirect LB traffic to the new NVE, as shown at block 450.
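The ordering of blocks 410 through 450 can be sketched as follows: the LB function is disabled in the old NVE before it is enabled in the new one, and the remote NVEs are redirected last. Control plane messages are modeled as direct method calls, and `StubNve`, `on_lb_disable`, `on_lb_enable`, and `nva_move_lb` are hypothetical names.

```python
from typing import Dict, List, Set


class StubNve:
    """Stand-in for an NVE reached over the NVA-NVE control plane (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.lb_addresses: Set[str] = set()
        self.mapping: Dict[str, str] = {}  # inner address -> name of the serving NVE

    def on_lb_disable(self, lb_address: str) -> bool:
        self.lb_addresses.discard(lb_address)
        return True  # confirmation to the NVA

    def on_lb_enable(self, lb_address: str) -> bool:
        self.lb_addresses.add(lb_address)
        return True  # confirmation to the NVA


def nva_move_lb(lb_address: str, old: StubNve, new: StubNve, remotes: List[StubNve]) -> List[str]:
    steps: List[str] = []
    # Blocks 410-420: disable LB in the old NVE and wait for confirmation.
    if not old.on_lb_disable(lb_address):
        raise RuntimeError("old NVE did not confirm LB disable")
    steps.append(f"disabled on {old.name}")
    # Blocks 430-440: enable LB in the new NVE and wait for confirmation.
    if not new.on_lb_enable(lb_address):
        raise RuntimeError("new NVE did not confirm LB enable")
    steps.append(f"enabled on {new.name}")
    # Block 450: redirect LB traffic from the remote NVEs to the new NVE.
    for nve in remotes:
        nve.mapping[lb_address] = new.name
    steps.append(f"redirected {len(remotes)} remote NVEs")
    return steps


old_nve, new_nve = StubNve("nve-old"), StubNve("nve-new")
old_nve.lb_addresses.add("10.0.0.100")
remote = StubNve("nve-remote")
remote.mapping["10.0.0.100"] = "nve-old"
print(nva_move_lb("10.0.0.100", old_nve, new_nve, [remote]))
```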
Following are example messages that may be included in a new NVA-NVE protocol for LB enabling and disabling.
VN ID, etc. It also contains a LB ID, a LB enabling/disabling indicator, the LB address, the associated vNIC addresses for the LB function, and LB function parameters. These parameters are shown in Table 1, below:
The NVE-to-NVA confirmation message shall contain a LB enabling/disabling confirmation indicator with the associated VN name and LB ID. Alternatively, it may contain a LB enabling/disabling declare indicator with an error code. These parameters are shown in Table 2, below:
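As an aid to reading, and not as a reproduction of the tables, the fields named in the text for the two messages can be sketched as data structures. The field types, the class names `LbConfigMessage` and `LbConfirmMessage`, the `confirm` helper, the validation rule it applies, and the error code value are all assumptions made for illustration; the text does not specify an encoding.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LbConfigMessage:
    """NVA-to-NVE LB enable/disable message (field types are assumed)."""

    vn_id: int
    lb_id: int
    enable: bool                      # LB enabling/disabling indicator
    lb_address: str                   # IP or MAC address used for LB traffic
    vnic_addresses: List[str] = field(default_factory=list)
    lb_parameters: Dict[str, str] = field(default_factory=dict)


@dataclass
class LbConfirmMessage:
    """NVE-to-NVA confirmation message (field types are assumed)."""

    vn_name: str
    lb_id: int
    enable: bool
    confirmed: bool
    error_code: Optional[int] = None  # set only when the request is not confirmed


def confirm(msg: LbConfigMessage, vn_name: str) -> LbConfirmMessage:
    # Hypothetical rule: an enable request must list at least one vNIC address.
    if msg.enable and not msg.vnic_addresses:
        return LbConfirmMessage(vn_name, msg.lb_id, msg.enable, False, error_code=1)
    return LbConfirmMessage(vn_name, msg.lb_id, msg.enable, True)


request = LbConfigMessage(
    vn_id=100, lb_id=1, enable=True, lb_address="10.0.0.100",
    vnic_addresses=["02:00:00:00:00:01", "02:00:00:00:00:02"],
    lb_parameters={"algorithm": "hash", "factor": "source-ip"},
)
print(confirm(request, "tenant-a"))
```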
As shown at block 510, the illustrated method begins with the receiving, in the NVE, of an LB enable message from the NVA. A confirmation message is then sent to the NVA, as shown at block 520.
When Layer 3 service is supported in the NVE, the LB address (included in the LB enable message, in some embodiments) will be an IP address. This is the destination IP address to which the incoming traffic shall be sent. When the incoming packets with that LB IP address are received, as shown at block 530, the NVE uses the LB IP address to find out the VN context, as shown at block 540. The NVE then applies an LB algorithm based on certain LB factors, as shown at block 550. For instance, the LB factors may specify whether the LB algorithm uses the source IP address. The LB algorithm and/or the LB factors may be specified in the LB enable message, for example.
The next step is based on the output of the LB algorithm. As shown at block 560, the NVE obtains the VM MAC address where the packets shall be forwarded. The VM MAC address is configured by the NVA as the associated vNIC addresses, e.g., in the LB enable message. The last step, as shown at block 570, is to perform L2 forwarding with the VM address as the destination MAC address of the L2 packet.
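Blocks 530 through 570 for the Layer 3 case can be sketched as follows. The sketch assumes the LB factor is the source IP address and uses a hash of that address as the LB algorithm, which is one possible choice and not one prescribed by the text. `LbContext`, `select_vm_mac`, and `forward_l3` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class LbContext:
    """LB state of one VN context, keyed elsewhere by the LB IP address."""

    vn_id: int
    vnic_macs: List[str]  # associated vNIC (VM MAC) addresses configured by the NVA


def select_vm_mac(ctx: LbContext, src_ip: str) -> str:
    """Block 550-560: hash the source IP and pick one of the configured VM MACs."""
    digest = hashlib.sha256(src_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:4], "big") % len(ctx.vnic_macs)
    return ctx.vnic_macs[index]


def forward_l3(lb_contexts: Dict[str, LbContext], dst_ip: str, src_ip: str) -> Optional[str]:
    """Return the destination MAC for L2 forwarding, or None if dst_ip is not an LB address."""
    ctx = lb_contexts.get(dst_ip)  # block 540: LB IP address -> VN context
    if ctx is None:
        return None
    return select_vm_mac(ctx, src_ip)  # block 570 would forward the frame to this MAC


contexts = {"10.0.0.100": LbContext(vn_id=100, vnic_macs=["02:00:00:00:00:01", "02:00:00:00:00:02"])}
print(forward_l3(contexts, "10.0.0.100", "198.51.100.7"))
print(forward_l3(contexts, "10.0.0.200", "198.51.100.7"))  # not an LB address
```

Hashing the source address keeps packets from one source on the same VM, which is a common reason for choosing a source-based LB factor.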
The method shown in
Then, the NVE shall apply the specified LB algorithm based on the specified LB factor, as shown at block 550. For instance, the LB algorithm may use the last digit of the user ID. In that case, the NVE shall inspect the packet up to Layer 4 in order to apply the LB policies.
The next step is based on the output of the LB algorithm. As shown at block 560, the NVE obtains the VM address where the packets shall be forwarded. The VM address is configured by the NVA as the associated vNIC addresses. Before forwarding the packets to the VM, as shown at block 570, the destination address of the L2 packet header shall be replaced with the VM address.
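The destination rewrite described above can be sketched as follows, using the last digit of the user ID as the LB factor. The sketch assumes the user ID has already been extracted from the packet and ends in a decimal digit, and that the LB address is a MAC address. `Frame` and `rewrite_destination` are hypothetical names.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Frame:
    """Simplified L2 frame; user_id is assumed to be parsed from the payload already."""

    dst_mac: str
    src_mac: str
    user_id: str
    payload: bytes


def rewrite_destination(frame: Frame, lb_mac: str, vnic_macs: List[str]) -> Frame:
    """Replace the LB MAC address with the MAC of the VM chosen by the LB algorithm."""
    if frame.dst_mac != lb_mac:
        return frame  # not LB traffic, forward unchanged
    last_char = frame.user_id[-1:]
    if not last_char.isdigit():
        raise ValueError("user ID does not end in a decimal digit")
    # Block 550-560: the last digit of the user ID selects the VM address.
    vm_mac = vnic_macs[int(last_char) % len(vnic_macs)]
    # Before block 570: rewrite the destination address of the L2 header.
    return replace(frame, dst_mac=vm_mac)


vms = ["02:00:00:00:00:01", "02:00:00:00:00:02"]
incoming = Frame(dst_mac="02:00:00:00:00:ff", src_mac="02:00:00:00:01:00", user_id="1007", payload=b"")
print(rewrite_destination(incoming, "02:00:00:00:00:ff", vms).dst_mac)
```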
The various techniques and processes described above are implemented in NVEs and/or NVAs, or in their equivalents in other network virtualization overlays. It will be appreciated that NVEs and NVAs are logical entities, which may be implemented on one or more processors in one or more physical devices.
A computer program for controlling the node 1 to carry out a method embodying any of the presently disclosed techniques is stored in a program storage 30, which comprises one or several memory devices. Data used during the performance of a method embodying the present invention is stored in a data storage 20, which also comprises one or more memory devices. During performance of a method embodying the present invention, program steps are fetched from the program storage 30 and executed by a Central Processing Unit (CPU) 10, retrieving data as required from the data storage 20. Output information resulting from performance of a method embodying the present invention can be stored back in the data storage 20, or sent to an Input/Output (I/O) interface 40, which includes a network interface for sending and receiving data to and from other network nodes. The CPU 10 and its associated data storage 20 and program storage 30 may collectively be referred to as a “processing circuit.” It will be appreciated that variations of this processing circuit are possible, including circuits that include one or more of various types of programmable circuit elements, e.g., microprocessors, microcontrollers, digital signal processors, field-programmable application-specific integrated circuits, and the like, as well as processing circuits where all or part of the processing functionality described herein is performed using dedicated digital logic.
Accordingly, in various embodiments of the invention, processing circuits, such as the CPU 10, data storage 20, and program storage 30 in
Various aspects of the above-described embodiments can also be understood as being carried out by functional “modules,” or “units,” which may be program instructions executing on an appropriate processor circuit, hard-coded digital circuitry and/or analog circuitry, or appropriate combinations thereof. Thus, for example, an example NVA node adapted to provide reachability and forwarding information to one or more NVE nodes in a network employing a NVO, wherein each NVE node implements Layer 2 and/or Layer 3 network virtualization functions for one or more tenant system elements, may comprise functional modules corresponding to the methods and functionality described above, including a receiving unit for receiving VM configuration information for one or more VMs, via a network interface circuit, a configuring unit for configuring at least a first NVE to enable load balancing by sending a LB enable message to the first NVE node via the network interface circuit, and an updating unit for updating configuration information for one or more remote NVEs to allow load balancing traffic for the one or more VMs to be sent to the first NVE node.
Similarly, an example NVE node may be understood to comprise a receiving unit for receiving, via the network interface circuit, an LB enable message from an NVA node that provides reachability and forwarding information to the NVE node, an enabling unit for enabling a load balancing function in response to the LB enable message, and a forwarding unit for forwarding subsequent load balancing traffic to one or more VMs, using the enabled load balancing function.
Examples of several embodiments of the present techniques have been described in detail above, with reference to the attached illustrations of specific embodiments. Because it is not possible, of course, to describe every conceivable combination of components or techniques, those skilled in the art will appreciate that the present invention can be implemented in other ways than those specifically set forth herein, without departing from essential characteristics of the invention. The present embodiments are thus to be considered in all respects as illustrative and not restrictive.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/IB2014/065830 | 11/5/2014 | WO | 00 |

Number | Date | Country |
---|---|---|
61900732 | Nov 2013 | US |