Benefit is claimed under 35 U.S.C. 119 (a)-(d) to Foreign Application Serial No. 202341037603 filed in India entitled “ADAPTIVE TRAFFIC FORWARDING OVER MULTIPLE CONNECTIVITY SERVICES”, on May 31, 2023, by VMware, Inc., which is herein incorporated in its entirety by reference for all purposes.
Virtualization allows the abstraction and pooling of hardware resources to support virtual machines in a software-defined data center (SDDC). For example, through server virtualization, virtualized computing instances such as virtual machines (VMs) running different operating systems may be supported by the same physical machine (e.g., referred to as a “host”). Each VM is generally provisioned with virtual resources to run a guest operating system and applications. The virtual resources may include central processing unit (CPU) resources, memory resources, storage resources, network resources, etc. In practice, a user (e.g., organization) may run VMs using on-premises data center infrastructure that is under the user's private ownership and control. Additionally, the user may run VMs in the cloud using infrastructure under the ownership and control of a public cloud provider. It is desirable to improve the performance of traffic forwarding among VMs deployed in different cloud environments.
In the following detailed description, reference is made to the accompanying drawings, which form a part hereof. In the drawings, similar symbols typically identify similar components, unless context dictates otherwise. The illustrative embodiments described in the detailed description, drawings, and claims are not meant to be limiting. Other embodiments may be utilized, and other changes may be made, without departing from the spirit or scope of the subject matter presented here. It will be readily understood that the aspects of the present disclosure, as generally described herein, and illustrated in the drawings, can be arranged, substituted, combined, and designed in a wide variety of different configurations, all of which are explicitly contemplated herein. Although the terms “first” and “second” are used to describe various elements, these elements should not be limited by these terms. These terms are used to distinguish one element from another. For example, a first element may be referred to as a second element, and vice versa.
In the example in
In practice, a public cloud provider is generally an entity that offers a cloud-based platform to multiple users or tenants. This way, a user may take advantage of the scalability and flexibility provided by public cloud environment 101 for data center capacity extension, disaster recovery, etc. Throughout the present disclosure, public cloud environment 101 will be exemplified using VMware Cloud™ (VMC) on Amazon Web Services® (AWS) and Amazon Virtual Private Clouds (VPCs). Amazon VPC and Amazon AWS are registered trademarks of Amazon Technologies, Inc. It should be understood that any alternative and/or additional cloud technology may be implemented, such as Microsoft Azure®, Google Cloud Platform™, IBM Cloud™, etc.
To facilitate cross-cloud traffic forwarding, a pair of edge devices may be deployed at the respective first site and second site. In particular, a first computer system capable of acting as EDGE1 110 (“first edge device”) may be deployed at the edge of public cloud environment 101 to handle traffic to/from private cloud environment 102. A second computer system capable of acting as EDGE2 120 (“second edge device”) may be deployed at the edge of private cloud environment 102 to handle traffic to/from public cloud environment 101. Here, the term “network edge,” “edge gateway,” “edge node” or simply “edge” may refer generally to any suitable computer system that is capable of performing functionalities of a gateway, switch, router (e.g., logical service router), bridge, edge appliance, or any combination thereof.
EDGE 110/120 may be implemented using one or more virtual machines (VMs) and/or physical machines (also known as “bare metal machines”). Each EDGE node may implement a logical service router (SR) to provide networking services, such as gateway service, domain name system (DNS) forwarding, IP address assignment using dynamic host configuration protocol (DHCP), source network address translation (SNAT), destination NAT (DNAT), deep packet inspection, etc. When acting as a gateway, an EDGE node may be considered to be an exit point to an external network.
Referring to public cloud environment 101 in
Referring to private cloud environment 102 in
In the example in
A second connectivity service (denoted as SERVICE2 142) may be a route-based virtual private network (VPN) or RBVPN, which involves establishing an Internet Protocol Security (IPSec) tunnel for forwarding traffic between public cloud environment 101 and private cloud environment 102. Since the VPN service generally relies on public network infrastructure, its bandwidth and latency may fluctuate. Any suitable protocol may be implemented to discover and propagate routes as networks are added and removed, such as border gateway protocol (BGP), etc.
Referring to public cloud environment 101, all north-south traffic may be forwarded or steered via EDGE1 110. In practice, consider a scenario where SERVICE1 141 (e.g., AWS DX) has been configured as a primary service, and SERVICE2 142 (e.g., VPN) as a backup or secondary service that is only active in the event of a failure associated with SERVICE1 141. In this case, for cross-cloud traffic, EDGE1 110 may forward all traffic flows towards EDGE2 120 using SERVICE1 141 (e.g., a single 1 Gbps link or 2 Gbps link). Conventionally, once SERVICE1 141 becomes saturated and/or approaches its bandwidth limit, EDGE1 110 is unable to take advantage of the available bandwidth provided by SERVICE2 142 due to protocol limitations. This may affect the performance of various cross-cloud traffic flows, which is undesirable.
According to examples of the present disclosure, adaptive traffic forwarding may be implemented based on metric information to distribute traffic over multiple connectivity services. Using examples of the present disclosure, the bandwidth of one service (e.g., SERVICE1 141) may be scaled UP using the available bandwidth of at least one other service (e.g., SERVICE2 142), thereby reducing the likelihood of performance degradation due to a high volume of traffic over one service (e.g., SERVICE1 141). It should be understood that examples of the present disclosure may be implemented by EDGE1 110 and/or EDGE2 120 to facilitate intelligent traffic routing to improve the performance of cross-cloud traffic forwarding.
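As a purely illustrative aid, the following Python sketch outlines one way such a metric-driven control loop could be structured at an edge. The threshold, polling interval, and helper callables (collect_metrics, list_flows, select_subset, steer_flows) are hypothetical placeholders and do not reflect a specific edge implementation.

```python
# Illustrative control loop for adaptive traffic forwarding; all helper callables,
# thresholds, and intervals are hypothetical placeholders.
import time
from dataclasses import dataclass

@dataclass
class ServiceMetrics:
    capacity_bps: float      # provisioned bandwidth of the connectivity service
    throughput_bps: float    # observed cumulative throughput over the service
    available_bps: float     # spare bandwidth usable for steered flows

def adaptive_forwarding_loop(collect_metrics, list_flows, select_subset, steer_flows,
                             scale_up_utilization=0.8, poll_interval=30.0):
    """Scale SERVICE1 UP with SERVICE2 capacity when SERVICE1 nears saturation."""
    while True:
        metric1, metric2 = collect_metrics()       # metric information for both services
        if metric1.throughput_bps > scale_up_utilization * metric1.capacity_bps:
            flows = list_flows()                   # flows currently forwarded over SERVICE1
            subset = select_subset(flows, metric2.available_bps)
            steer_flows(subset)                    # e.g., install adaptive static routes
        time.sleep(poll_interval)
```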
In more detail,
At 210 in
At 220-230 in
The subset selection at block 230 may be performed based on any suitable policy, which may be a user-configurable policy (e.g., configured by a network administrator) and/or default policy. For example, selected subset 160 may include a first flow (denoted as F1) and a second flow (F2) but exclude a third flow (F3). In this case, the policy may specify a whitelist of application segment(s) or traffic type(s) movable from one service to another service. The policy may also specify a blacklist of application segment(s) or traffic type(s) that should not be moved from one service to another. The whitelist and/or blacklist may be updated by the user from time to time. If no policy is configured by the user, a default policy may be implemented to select subset 160 based on an amount of available bandwidth associated with SERVICE2 142 and an amount of bandwidth required by F1 or F2. See 151-152 and 160 in
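For illustration, a minimal sketch of such a policy check is shown below; the Flow and Policy structures and the segment labels are assumptions made for this example rather than a prescribed data model.

```python
# Hypothetical policy check: a flow is movable if its application segment or traffic
# type is not blacklisted and, when a whitelist is configured, is whitelisted.
from dataclasses import dataclass, field

@dataclass
class Flow:
    flow_id: str
    segment: str           # application segment or traffic type label
    required_bps: float    # bandwidth the flow is observed to consume

@dataclass
class Policy:
    whitelist: set = field(default_factory=set)   # segments/types movable to another service
    blacklist: set = field(default_factory=set)   # segments/types that must not be moved

def movable_flows(flows, policy):
    """Return the flows eligible to be steered from one service to another."""
    eligible = []
    for flow in flows:
        if flow.segment in policy.blacklist:
            continue
        if policy.whitelist and flow.segment not in policy.whitelist:
            continue
        eligible.append(flow)
    return eligible
```

When neither list is configured, every flow passes this check and the default bandwidth-based selection described above applies.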
At 240 in
At 250-260 in
Using examples of the present disclosure, traffic may be distributed over multiple (N) connectivity services in a more adaptive manner based on metric information that is monitored in real time. In practice, it should be understood that N>2 services may be configured and one service (denoted as SERVICEi) may be scaled UP using any other service (SERVICEj) where i,j∈[1, . . . , N] and j≠i. Various examples will be discussed using
Referring first to
At 410 in
At 430 in
Similarly, in response to detecting egress packets that are destined for first network=192.168.12.0/24, EDGE2 120 may forward the egress packets using SERVICE1 141 towards EDGE1 110 based on second routing information 420. One example may be egress packets (i.e., egress from the perspective of EDGE2 120) that are associated with a third flow (F3) from source VM6 136 (i.e., IP6=10.10.10.6) to destination VM3 133 (i.e., IP3=192.168.12.3). See 433 in
At 440-450 in
For example, EDGE1 110 may obtain metric information 450 associated with SERVICE1 141 and/or SERVICE2 142 using any suitable application programming interface (API) and/or command line interface (CLI) supported by analytics system 401, etc. Example metric information (METRIC1) 451 associated with SERVICE1 141 (e.g., DX) may include throughput, cumulative bandwidth, connection state (e.g., UP or DOWN), bitrate for egress/ingress data, packet rate for egress/ingress data, error count, connection light level indicating the health of the fiber connection, encryption state, etc.
Example metric information (METRIC2) 452 associated with SERVICE2 142 (e.g., VPN tunnel) may include VPN tunnel state, bytes received on the public cloud environment's 101 side of the connection through the VPN tunnel, bytes sent from the public cloud environment's 101 side of the connection through the VPN tunnel, etc. In practice, in the event that a VPN tunnel terminates on EDGE1 110 itself, EDGE1 110 may monitor metric information associated with the VPN tunnel directly.
Example metric information associated with set of multiple flows 430 may include average or maximum round trip time, total number of bytes sent by the destination of a flow, total number of packets exchanged between the source and the destination of a flow, packet loss, retransmitted packet ratio, total number of bytes sent by the source of a flow, ratio of retransmitted packets to the number of transmitted Transmission Control Protocol (TCP) packets, traffic rate, etc. Additionally, workload traffic patterns during peak or non-peak office hours may be observed and learned.
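The following dataclasses sketch, for illustration only, how the metric records described above might be represented on the edge; the field names simply mirror the text and do not correspond to a specific analytics API.

```python
# Illustrative records for the metric information described above.
from dataclasses import dataclass

@dataclass
class DirectConnectMetrics:            # METRIC1 451 for SERVICE1 (e.g., dedicated link)
    connection_state: str              # "UP" or "DOWN"
    throughput_bps: float              # cumulative bandwidth/throughput
    egress_bitrate_bps: float
    ingress_bitrate_bps: float
    egress_packet_rate_pps: float
    ingress_packet_rate_pps: float
    error_count: int
    connection_light_level: float      # indicator of fiber connection health
    encryption_state: str

@dataclass
class VpnMetrics:                      # METRIC2 452 for SERVICE2 (e.g., route-based VPN)
    tunnel_state: str                  # VPN tunnel state
    bytes_received: int                # bytes received on the public cloud side of the tunnel
    bytes_sent: int                    # bytes sent from the public cloud side of the tunnel

@dataclass
class FlowMetrics:                     # per-flow metrics for the set of flows 430
    flow_id: str
    avg_rtt_ms: float
    max_rtt_ms: float
    bytes_from_source: int
    bytes_from_destination: int
    packets_exchanged: int
    packet_loss: float
    retransmit_ratio: float            # retransmitted / transmitted TCP packets
    traffic_rate_bps: float
```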
At 510 in
Alternatively or additionally, a default policy or algorithm may be applied to select subset 510 based on the amount of available bandwidth associated with SERVICE2 142 and the amount of bandwidth required by each flow. In the example in
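One possible default algorithm of this kind is sketched below, under the assumption that per-flow bandwidth demand and SERVICE2's available headroom are known; the 90% safety margin and the concrete bandwidth figures are illustrative only.

```python
# One possible default selection: greedily fit flows into SERVICE2's available bandwidth.
# Flow descriptors are (flow_id, required_bps) pairs; all values are illustrative.
def select_subset(candidate_flows, available_bps, headroom=0.9):
    """Pick flows whose combined required bandwidth fits within SERVICE2's headroom."""
    budget = available_bps * headroom      # keep a safety margin on the secondary service
    selected, used = [], 0.0
    # Consider the heaviest movable flows first to relieve SERVICE1 quickly.
    for flow_id, required_bps in sorted(candidate_flows, key=lambda f: f[1], reverse=True):
        if used + required_bps <= budget:
            selected.append(flow_id)
            used += required_bps
    return selected

# Example: with 800 Mbps of headroom on SERVICE2, F1 (500 Mbps) and F2 (200 Mbps) fit,
# while F3 (300 Mbps) does not.
print(select_subset([("F1", 5e8), ("F2", 2e8), ("F3", 3e8)], available_bps=8e8))
```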
At 520 in
At 530 in
Using examples of the present disclosure, EDGE1 110 may install adaptive static routes 521-522 to steer flows from one service to another. Further, routes 541-542 may be intelligently programmed using adaptive route advertisements between EDGE1 110 and EDGE2 120. Depending on the desired implementation, firewall state synchronization may be implemented across interfaces for SERVICE1 141 (e.g., DX) and SERVICE2 142 (e.g., VPN) at both cloud environments 101-102. This is to maintain firewall state awareness across interfaces/services for a particular flow so that asymmetric traffic is not dropped.
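A minimal sketch of this steering step is given below; the /32 route format, the interface naming, and the advertise() callback are assumptions made for illustration and do not reflect a specific routing stack or product API.

```python
# Illustrative steering step: install adaptive static routes for the selected flows and
# advertise the corresponding prefixes to the peer edge.
from dataclasses import dataclass

@dataclass
class Route:
    prefix: str        # destination, e.g., "10.10.10.4/32"
    next_hop_if: str   # egress interface, e.g., the VPN tunnel interface for SERVICE2

def steer_flows(routing_table, selected_destinations, service2_interface, advertise):
    """Install adaptive static routes via SERVICE2 and advertise them to the peer edge."""
    installed = []
    for dest_ip in selected_destinations:
        route = Route(prefix=f"{dest_ip}/32", next_hop_if=service2_interface)
        routing_table[route.prefix] = route            # e.g., entries 521-522 at EDGE1
        installed.append(route)
    # Adaptive route advertisement so the peer edge programs the reverse path (e.g., 541-542).
    advertise([r.prefix for r in installed])
    return installed
```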
Based on updated routing information 520 (particularly 521-522), EDGE1 110 may forward egress packets destined for IP4=10.10.10.4 or IP5=10.10.10.5 towards EDGE2 120 using SERVICE2 142. Similarly, based on updated routing information 540 (particularly 541-542), EDGE2 120 may forward egress packets destined for IP1=192.168.12.1 or IP2=192.168.12.2 towards EDGE1 110 using SERVICE2 142. See also 370 in
For a third flow (F3) that is not selected to be part of subset 510, however, EDGE1 110 may continue using SERVICE1 141 to forward egress packets destined for IP6=10.10.10.6 towards EDGE2 120 based on the existing routing entry for destination network=10.10.10.0/24 (see 411 and 433 in
Depending on the desired implementation, an adaptive static route installed by EDGE1 110 may specify a classless inter-domain routing (CIDR) block, instead of a particular destination IP address shown in
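For example, the standard Python ipaddress module can collapse the selected /32 destinations into covering CIDR blocks; the helper below is a sketch of that aggregation, with the concrete addresses taken from the earlier example (IP4 and IP5).

```python
# Illustrative aggregation of selected destination IPs into CIDR blocks for
# adaptive static routes, using Python's standard ipaddress module.
import ipaddress

def destinations_to_cidrs(destination_ips):
    """Collapse individual /32 destinations into the fewest covering CIDR blocks."""
    host_routes = (ipaddress.ip_network(f"{ip}/32") for ip in destination_ips)
    return [str(net) for net in ipaddress.collapse_addresses(host_routes)]

# Example: IP4=10.10.10.4 and IP5=10.10.10.5 collapse into a single 10.10.10.4/31 block.
print(destinations_to_cidrs(["10.10.10.4", "10.10.10.5"]))   # ['10.10.10.4/31']
```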
One example condition for scaling DOWN may be an observation that the cumulative bandwidth or throughput (e.g., a moving average) associated with SERVICE1 141 is lower than a threshold value for a threshold period of time. Another example condition is that the total traffic (i.e., over both SERVICE1 141 and SERVICE2 142) is lower than a threshold amount of traffic that can be supported by SERVICE1 141 for a threshold amount of time.
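A hypothetical detector for these two conditions might track a sliding window of throughput samples taken at a fixed monitoring interval, as in the sketch below; the window size, watermark, and capacity values would be deployment-specific assumptions.

```python
# Hypothetical evaluation of the scale-DOWN conditions described above, based on a
# sliding window of throughput samples taken at a fixed monitoring interval.
from collections import deque

class ScaleDownDetector:
    def __init__(self, window_size, low_watermark_bps, service1_capacity_bps):
        self.samples = deque(maxlen=window_size)        # one entry per monitoring interval
        self.low_watermark_bps = low_watermark_bps
        self.service1_capacity_bps = service1_capacity_bps

    def add_sample(self, service1_bps, service2_bps):
        self.samples.append((service1_bps, service2_bps))

    def should_scale_down(self):
        if len(self.samples) < self.samples.maxlen:
            return False                                 # not enough history yet
        # Condition 1: moving average of SERVICE1 throughput below a threshold for the window.
        avg_service1 = sum(s1 for s1, _ in self.samples) / len(self.samples)
        below_watermark = avg_service1 < self.low_watermark_bps
        # Condition 2: total traffic over both services fits within SERVICE1's capacity.
        fits_in_service1 = all(s1 + s2 < self.service1_capacity_bps for s1, s2 in self.samples)
        return below_watermark or fits_in_service1
```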
At 630 in
At 640 in
Based on updated routing information 630, EDGE1 110 may forward egress packets destined for 10.10.10.0/24 (i.e., including IP4=10.10.10.4 and IP5=10.10.10.5) towards EDGE2 120 using SERVICE1 141. Based on updated routing information 650, EDGE2 120 may forward egress packets destined for 192.168.12.0/24 (i.e., including IP1=192.168.12.1 and IP2=192.168.12.2) towards EDGE1 110 using SERVICE1 141. See block 395 in
Throughout the present disclosure, various examples will be explained using SERVICE1 141 as a primary service, and SERVICE2 142 as a secondary or backup service. In practice, the reverse may also be configured, i.e., SERVICE2 142 (e.g., VPN) as primary and SERVICE1 141 (e.g., DX) as backup. For example, consider a scenario where the bandwidth available for SERVICE2 142 is 5 Gbps, while SERVICE1 141 includes two pipes with a total of 2 Gbps. Until traffic is up to 5 Gbps, SERVICE2 142 may take priority.
In response to a determination that a condition for scaling UP is satisfied, EDGE1 110 may perform blocks 310-370 in
Alternatively or additionally, it should be understood that EDGE2 120 may perform the example in
In another example, asymmetric distribution of traffic over multiple dedicated links (e.g., DX links) may be implemented based on different available link bandwidths or configurations, such as a first link providing 10 Gbps (“SERVICE1”) and a second link providing 1 Gbps (“SERVICE2”). In this case, the 10 Gbps link may be configured as a primary link, and the 1 Gbps link as a secondary link. In response to a determination that a scaling UP condition is satisfied, a subset of flow(s) may be selected and steered from the primary link to the secondary link according to examples of the present disclosure. Any additional and/or alternative connectivity services may be implemented to facilitate cross-cloud traffic forwarding, such as Microsoft Azure® ExpressRoute, Google® Cloud Interconnect, etc.
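For illustration, the sketch below designates the highest-bandwidth link as the primary link and computes an offload budget from the secondary link's spare capacity once the primary crosses a utilization threshold; the 80%/90% ratios and the link names are assumptions for this example.

```python
# Illustrative primary/secondary designation and scale-UP check for links with
# asymmetric bandwidths (e.g., a 10 Gbps primary and a 1 Gbps secondary).
def plan_offload(links, utilization, scale_up_ratio=0.8, secondary_headroom=0.9):
    """links: {name: capacity_bps}; utilization: {name: current throughput in bps}.

    Returns (primary, secondary, offload_budget_bps); a budget of 0 means no scaling UP.
    """
    # The highest-bandwidth link acts as the primary, the next one as the secondary.
    ordered = sorted(links, key=links.get, reverse=True)
    primary, secondary = ordered[0], ordered[1]
    if utilization[primary] <= scale_up_ratio * links[primary]:
        return primary, secondary, 0.0
    spare = links[secondary] - utilization.get(secondary, 0.0)
    return primary, secondary, max(0.0, spare * secondary_headroom)

# Example: a 10 Gbps primary carrying 9 Gbps and an idle 1 Gbps secondary
# yield an offload budget of roughly 0.9 Gbps.
print(plan_offload({"dx1": 10e9, "dx2": 1e9}, {"dx1": 9e9, "dx2": 0.0}))
```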
According to at least one embodiment, a management entity may be deployed to instruct EDGE1 110 and/or EDGE2 120 to perform adaptive traffic forwarding. Some examples will be described using
Depending on the desired implementation, management entity 701 (e.g., central manager) may be implemented using any suitable third computer system that is capable of managing a multi-cloud environment that includes first cloud environment 101 and second cloud environment 102. For example, management entity 701 may have access to configuration information associated with both cloud environments 101-102, as well as metric information associated with multiple connectivity services connecting them. In the following, various implementation details explained using
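The sketch below illustrates, at a high level, how such a management entity might combine metric information from both environments and instruct each edge; the analytics and edge interfaces shown are hypothetical callables rather than an actual management-plane API.

```python
# Illustrative orchestration by a management entity that observes metrics for both
# cloud environments and instructs each edge; all interfaces are hypothetical.
def orchestrate(analytics, edge1, edge2, select_subset, scale_up_ratio=0.8):
    """Decide centrally whether to steer a subset of flows and instruct both edges."""
    m1 = analytics.service_metrics("SERVICE1")
    m2 = analytics.service_metrics("SERVICE2")
    if m1["throughput_bps"] <= scale_up_ratio * m1["capacity_bps"]:
        return []                                           # no scaling UP required
    flows = analytics.cross_cloud_flows()
    subset = select_subset(flows, m2["available_bps"])
    # Instruct both edges so routing (and firewall state awareness) stays consistent
    # for both directions of each steered flow.
    edge1.steer(subset, to_service="SERVICE2")
    edge2.steer(subset, to_service="SERVICE2")
    return subset
```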
At 710-715 in
At 720 in
At 725-730 in
At 740 in
At 750 in
At 760-765 in
At 770 in
At 790 in
In the example in
Hypervisor 814A/814B maintains a mapping between underlying hardware 812A/812B and virtual resources allocated to respective VMs. Virtual resources are allocated to respective VMs 131-133, 837 to support a guest operating system (OS; not shown for simplicity) and application(s); see 841-844, 851-854. For example, the virtual resources may include virtual CPU, guest physical memory, virtual disk, virtual network interface controller (VNIC), etc. Hardware resources may be emulated using virtual machine monitors (VMMs). For example in
Although examples of the present disclosure refer to VMs, it should be understood that a “virtual machine” running on a host is merely one example of a “virtualized computing instance” or “workload.” A virtualized computing instance may represent an addressable data compute node (DCN) or isolated user space instance. In practice, any suitable technology may be used to provide isolated user space instances, not just hardware virtualization. Other virtualized computing instances may include containers (e.g., running within a VM or on top of a host operating system without the need for a hypervisor or separate operating system or implemented as an operating system level virtualization), virtual private servers, client computers, etc. Such container technology is available from, among others, Docker, Inc. The VMs may also be complete computational environments, containing virtual equivalents of the hardware and software components of a physical computing system.
The term “hypervisor” may refer generally to a software layer or component that supports the execution of multiple virtualized computing instances, including system-level software in guest VMs that supports namespace containers such as Docker, etc. Hypervisors 814A-B may each implement any suitable virtualization technology, such as VMware ESX® or ESXi™ (available from VMware, Inc.), Kernel-based Virtual Machine (KVM), etc. The term “packet” may refer generally to a group of bits that can be transported together, and may be in another form, such as “frame,” “message,” “segment,” etc. The term “traffic” or “flow” may refer generally to multiple packets. The term “layer-2” may refer generally to a link layer or media access control (MAC) layer; “layer-3” a network or IP layer; and “layer-4” a transport layer (e.g., using Transmission Control Protocol (TCP), User Datagram Protocol (UDP), etc.), in the Open System Interconnection (OSI) model, although the concepts described herein may be used with other networking models.
SDN controller 870 and SDN manager 880 are example network management entities. One example of an SDN controller is the NSX controller component of VMware NSX® (available from VMware, Inc.) that operates on a central control plane. SDN controller 870 may be a member of a controller cluster (not shown for simplicity) that is configurable using SDN manager 880. Network management entity 870/880 may be implemented using physical machine(s), VM(s), or both. To send or receive control information, a local control plane (LCP) agent (not shown) on host 810A/810B may interact with SDN controller 870 via a control-plane channel.
Through virtualization of networking services in SDN environment 100, logical networks (also referred to as overlay networks or logical overlay networks) may be provisioned, changed, stored, deleted and restored programmatically without having to reconfigure the underlying physical hardware architecture. Hypervisor 814A/814B implements virtual switch 815A/815B and logical distributed router (DR) instance 817A/817B to handle egress packets from, and ingress packets to, VMs 131-133, 837. In SDN environment 100, logical switches and logical DRs may be implemented in a distributed manner and can span multiple hosts.
For example, a logical switch (LS) may be deployed to provide logical layer-2 connectivity (i.e., an overlay network) to VMs 131-133, 837. A logical switch may be implemented collectively by virtual switches 815A-B and represented internally using forwarding tables 816A-B at respective virtual switches 815A-B. Forwarding tables 816A-B may each include entries that collectively implement the respective logical switches. Further, logical DRs that provide logical layer-3 connectivity may be implemented collectively by DR instances 817A-B and represented internally using routing tables (not shown) at respective DR instances 817A-B. Each routing table may include entries that collectively implement the respective logical DRs.
Packets may be received from, or sent to, each VM via an associated logical port. For example, logical switch ports 865-868 (labelled “LSP1” to “LSP4”) are associated with respective VMs 131-133, 837. Here, the term “logical port” or “logical switch port” may refer generally to a port on a logical switch to which a virtualized computing instance is connected. A “logical switch” may refer generally to a software-defined networking (SDN) construct that is collectively implemented by virtual switches 815A-B, whereas a “virtual switch” may refer generally to a software switch or software implementation of a physical switch. In practice, there is usually a one-to-one mapping between a logical port on a logical switch and a virtual port on virtual switch 815A/815B. However, the mapping may change in some scenarios, such as when the logical port is mapped to a different virtual port on a different virtual switch after migration of the corresponding virtualized computing instance (e.g., when the source host and destination host do not have a distributed virtual switch spanning them).
A logical overlay network may be formed using any suitable tunneling protocol, such as Virtual eXtensible Local Area Network (VXLAN), Stateless Transport Tunneling (STT), Generic Network Virtualization Encapsulation (GENEVE), Generic Routing Encapsulation (GRE), etc. For example, VXLAN is a layer-2 overlay scheme on a layer-3 network that uses tunnel encapsulation to extend layer-2 segments across multiple hosts which may reside on different layer-2 physical networks. Hypervisor 814A/814B may implement virtual tunnel endpoint (VTEP) 819A/819B to encapsulate and decapsulate packets with an outer header (also known as a tunnel header) identifying the relevant logical overlay network (e.g., VNI). Hosts 810A-B may maintain data-plane connectivity with each other via physical network 805 to facilitate east-west communication among VMs 131-133, 837. Hosts 810A-B may also maintain data-plane connectivity with EDGE1 110 in
Although discussed using VMs 131-136, it should be understood that adaptive traffic forwarding may be performed for other virtualized computing instances, such as containers, etc. The term “container” (also known as “container instance”) is used generally to describe an application that is encapsulated with all its dependencies (e.g., binaries, libraries, etc.). For example, multiple containers may be executed as isolated processes inside VM1 131, where a different VNIC is configured for each container. Each container is “OS-less”, meaning that it does not include any OS that could weigh 10s of Gigabytes (GB). This makes containers more lightweight, portable, efficient and suitable for delivery into an isolated OS environment. Running containers inside a VM (known as “containers-on-virtual-machine” approach) not only leverages the benefits of container technologies but also that of virtualization technologies.
The above examples can be implemented by hardware (including hardware logic circuitry), software or firmware or a combination thereof. The above examples may be implemented by any suitable computing device, computer system, etc. The computer system may include processor(s), memory unit(s) and physical NIC(s) that may communicate with each other via a communication bus, etc. The computer system may include a non-transitory computer-readable medium having stored thereon instructions or program code that, when executed by the processor, cause the processor to perform processes described herein with reference to
The techniques introduced above can be implemented in special-purpose hardwired circuitry, in software and/or firmware in conjunction with programmable circuitry, or in a combination thereof. Special-purpose hardwired circuitry may be in the form of, for example, one or more application-specific integrated circuits (ASICs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), and others. The term ‘processor’ is to be interpreted broadly to include a processing unit, ASIC, logic unit, or programmable gate array etc.
The foregoing detailed description has set forth various embodiments of the devices and/or processes via the use of block diagrams, flowcharts, and/or examples. Insofar as such block diagrams, flowcharts, and/or examples contain one or more functions and/or operations, it will be understood by those within the art that each function and/or operation within such block diagrams, flowcharts, or examples can be implemented, individually and/or collectively, by a wide range of hardware, software, firmware, or any combination thereof.
Those skilled in the art will recognize that some aspects of the embodiments disclosed herein, in whole or in part, can be equivalently implemented in integrated circuits, as one or more computer programs running on one or more computers (e.g., as one or more programs running on one or more computing systems), as one or more programs running on one or more processors (e.g., as one or more programs running on one or more microprocessors), as firmware, or as virtually any combination thereof, and that designing the circuitry and/or writing the code for the software and or firmware would be well within the skill of one of skill in the art in light of this disclosure.
Software and/or other instructions to implement the techniques introduced here may be stored on a non-transitory computer-readable storage medium and may be executed by one or more general-purpose or special-purpose programmable microprocessors. A “computer-readable storage medium”, as the term is used herein, includes any mechanism that provides (i.e., stores and/or transmits) information in a form accessible by a machine (e.g., a computer, network device, personal digital assistant (PDA), mobile device, manufacturing tool, any device with a set of one or more processors, etc.). A computer-readable storage medium may include recordable/non-recordable media (e.g., read-only memory (ROM), random access memory (RAM), magnetic disk or optical storage media, flash memory devices, etc.).
The drawings are only illustrations of an example, wherein the units or procedure shown in the drawings are not necessarily essential for implementing the present disclosure. Those skilled in the art will understand that the units in the device in the examples can be arranged in the device in the examples as described or can be alternatively located in one or more devices different from that in the examples. The units in the examples described can be combined into one module or further divided into a plurality of sub-units.
Number | Date | Country | Kind
--- | --- | --- | ---
202341037603 | May 2023 | IN | national