Virtualization allows the abstraction and pooling of hardware resources to support virtual machines in a virtualized computing environment, such as a software-defined data center (SDDC). For example, through server virtualization, virtual machines running different operating systems may be supported by the same physical machine (also referred to as a “host”). Each virtual machine is generally provisioned with virtual resources to run an operating system and applications. The virtual resources may include central processing unit (CPU) resources, memory resources, storage resources, network resources, etc.
Address resolution refers to the process of resolving a protocol address (e.g., Internet Protocol (IP) address) to a hardware address (e.g., Media Access Control (MAC) address). For example, address resolution may be required when a source wishes to communicate with a destination. To learn the hardware address of the destination, the source broadcasts a request message that includes a known protocol address of the destination. In response, the destination will send a response message that includes its hardware address. Other recipients are not required to respond to the broadcasted request message. In practice, address resolution may be handled more efficiently, especially in extended logical layer-2 networks.
In the following detailed description, reference is made to the accompanying drawings, which form a part hereof. In the drawings, similar symbols typically identify similar components, unless context dictates otherwise. The illustrative embodiments described in the detailed description, drawings, and claims are not meant to be limiting. Other embodiments may be utilized, and other changes may be made, without departing from the spirit or scope of the subject matter presented here. It will be readily understood that the aspects of the present disclosure, as generally described herein, and illustrated in the drawings, can be arranged, substituted, combined, and designed in a wide variety of different configurations, all of which are explicitly contemplated herein.
Challenges relating to address resolution will now be explained using
Referring first to
Referring to a more detailed view in
Hypervisor 214A/214B further implements virtual switch 215A/215B and DR instance 217A/217B to handle egress packets from, and ingress packets to, corresponding VMs 131-134. Packets may be received from, or sent to, each VM via an associated logical port. For example, logical ports 271-274 are associated with respective VMs 131-134. Here, the term “logical port” may refer generally to a port on a logical switch to which a virtualized computing instance is connected. A “logical switch” may refer generally to a software-defined networking (SDN) construct that is collectively implemented by virtual switches 215A-B in the example in
Although examples of the present disclosure refer to virtual machines, it should be understood that a “virtual machine” running on a host is merely one example of a “virtualized computing instance” or “workload.” A virtualized computing instance may represent an addressable data compute node or isolated user space instance. In practice, any suitable technology may be used to provide isolated user space instances, not just hardware virtualization. Other virtualized computing instances may include containers (e.g., running within a VM or on top of a host operating system without the need for a hypervisor or separate operating system or implemented as an operating system level virtualization), virtual private servers, client computers, etc. Such container technology is available from, among others, Docker, Inc. The VMs may also be complete computational environments, containing virtual equivalents of the hardware and software components of a physical computing system. The term “hypervisor” may refer generally to a software layer or component that supports the execution of multiple virtualized computing instances, including system-level software in guest VMs that supports namespace containers such as Docker, etc.
Further in
Through virtualization of networking services in SDN environment 100, logical networks (also referred to as overlay networks or logical overlay networks) may be provisioned, changed, stored, deleted and restored programmatically without having to reconfigure the underlying physical hardware architecture. A logical network may be formed using any suitable tunneling protocol, such as Generic Routing Encapsulation (GRE), Internet Protocol Security (IPSec), Virtual eXtensible Local Area Network (VXLAN), Stateless Transport Tunneling (STT), Virtual Local Area Network (VLAN), Generic Network Virtualization Encapsulation (GENEVE), Network Virtualization using Generic Routing Encapsulation (NVGRE), Layer 2 Tunneling Protocol (L2TP), any combination thereof, etc. For example, VXLAN is a layer-2 overlay scheme on a layer-3 network that uses tunnel encapsulation to extend layer-2 segments across multiple hosts which may reside on different layer 2 physical networks. In the example in
Logical switches and logical distributed routers may be implemented in a distributed manner and can span multiple hosts and edge 120. For example, logical switches that provide logical layer-2 connectivity may be implemented collectively by virtual switches 215A-B and represented internally using forwarding tables 216A-B at respective virtual switches 215A-B. Forwarding tables 216A-B may each include entries that collectively implement the respective logical switches. Further, logical distributed routers that provide logical layer-3 connectivity may be implemented collectively by DR modules 217A-B and represented internally using routing tables 218A-B at respective DR modules 217A-B. Routing tables 218A-B may each include entries that collectively implement the respective logical distributed routers.
Referring now to
In the example in
Using ARP as an example, VM1 131 may broadcast an ARP request within logical network with VNI=100 to translate IP address=IP-VM2 of VM2 132 to its corresponding MAC address. Each recipient will examine whether its IP address matches with that in the ARP request. Since its IP address=IP-VM2, VM2 132 will respond with an ARP response with MAC address=MAC-VM2. The ARP response is a unicast message that is only sent to VM1 131. VM1 131 caches protocol-to-hardware address mapping information (IP-VM2, MAC-VM2) in an ARP table entry, which expires if VM1 131 does not communicate with VM2 132 within a predefined period of time. After the ARP table entry expires, VM1 131 will have to repeat the above process to relearn the MAC address of VM2 132. The address resolution process may be repeated by other virtual machines in a similar manner.
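The caching and expiry behavior described above may be illustrated with a minimal sketch. The class name `ArpCache`, the 60-second timeout, and the explicit `now` timestamps are assumptions made for illustration only; they are not taken from the present disclosure.

```python
class ArpCache:
    """Hypothetical ARP table: IP -> MAC entries that expire when unused."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s   # assumed "predefined period of time"
        self._entries = {}           # ip -> (mac, time of last use)

    def learn(self, ip, mac, now):
        # Store mapping information from an ARP response.
        self._entries[ip] = (mac, now)

    def lookup(self, ip, now):
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, last_used = entry
        if now - last_used > self.timeout_s:
            # Entry expired: the source must broadcast a new ARP request.
            del self._entries[ip]
            return None
        # Communication with the destination refreshes the entry.
        self._entries[ip] = (mac, now)
        return mac


cache = ArpCache(timeout_s=60.0)
cache.learn("IP-VM2", "MAC-VM2", now=0.0)
hit = cache.lookup("IP-VM2", now=30.0)    # used within the timeout
miss = cache.lookup("IP-VM2", now=200.0)  # unused for too long
```

In this sketch, a `None` result stands for the case where the address resolution process has to be repeated.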
Address Resolution Handling at Logical DR Ports
According to examples of the present disclosure, address resolution handling may be performed in SDN environment 100 where endpoints (e.g., VMs 131-132 and physical server 102) are interconnected through network extension (see 103) supported by edge 120. As used herein, the term “network extension” may refer generally to any suitable network configuration that extends or stretches a logical layer-2 network (and corresponding broadcast domain) across multiple geographical sites. In practice, any suitable network extension may be used, such as layer-2 network bridging (e.g., VNI-VLAN bridging), layer-2 virtual private network (L2VPN), etc. The term “network extension” is sometimes referred to as data center interconnect (DCI), data center extension (DCE), stretched layer-2 network, extended layer-2 network, stretched deploy, etc.
For example in
To facilitate communication between VM1 131 and physical server 102 located on different subnets, a protocol address (e.g., IP-S) of physical server 102 needs to be resolved into a hardware address (e.g., MAC-S). An example will be explained using
As used herein, the term “network device” (e.g., edge 120; also referred to as “computer system” or “appliance”) may refer generally to an entity that is capable of performing functionalities of a switch, router (e.g., logical service router), bridge, gateway, edge appliance, or any combination thereof. It should be understood that edge 120 may be implemented using one or more virtual machines (VMs) and/or physical machines (also known as “bare metal machines”). The term “DR instance” may refer generally to one of multiple routing components of a logical DR. The multiple routing components are usually distributed across respective multiple entities (e.g., hosts 110A-B and edge 120). The term logical “DR port” or “logical DR port” may refer generally to a logical interface of a DR instance. Each DR port usually connects to a particular network segment (e.g., VNI=100 for “p1” and VNI=200 for “p2”).
At 310 in
At 320-330 in
At 340 in
At 350-360 in
According to examples of the present disclosure, first DR port 151 of DR1 141 and second DR port 152 of DR2 142 may each act as a proxy (e.g., ARP proxy) to facilitate address resolution through network extension supported by edge 120. In the example in
For example, if server 102 receives “REQUEST1” 171 from host-A 110A without any modification, it will learn (IP-DR-p2, MAC-A) associated with DR port 152 of DR2 142. However, if server 102 receives a subsequent request from host-B 110B, it will relearn (IP-DR-p2, MAC-B) associated with DR port 153 of DR3 143. This process of MAC learning and relearning is inefficient, and may be exacerbated in SDN environments with a large number of hosts. Using examples of the present disclosure, address resolution handling may be improved and performed in a more efficient manner in SDN environment 100.
Depending on the desired implementation, MAC-C (“first address”) may be a virtual MAC address associated with first DR port 151 of DR1 141. Here, the term “virtual MAC address” (e.g., MAC-C) may refer to a MAC address assigned to a logical element (e.g., first DR port 151 of DR1 141). Further, MAC-A (“second address”) may be a physical MAC address associated with second DR port 152, or more particularly a physical address assigned to hypervisor 114A supporting DR2 142 on host-A 110A. The term “physical MAC address” may refer to a MAC address assigned to a physical entity (e.g., host-A 110A) for communication with another entity (e.g., edge 120).
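The relearning problem, and the effect of presenting a single first address (MAC-C) to the recipient, may be illustrated with the following sketch. The function `count_relearns` and the alternating sequence of six requests are hypothetical and chosen only to make the difference visible.

```python
def count_relearns(requests):
    """Count how often a recipient overwrites the MAC cached for an IP.

    `requests` is a list of (SPA, SHA) pairs as seen by the recipient.
    """
    table = {}
    relearns = 0
    for spa, sha in requests:
        if spa in table and table[spa] != sha:
            relearns += 1
        table[spa] = sha
    return relearns


# Assumed traffic: host-A and host-B alternate, both using SPA=IP-DR-p2.
unmodified = [("IP-DR-p2", "MAC-A"), ("IP-DR-p2", "MAC-B")] * 3

# With the proxy at the edge, every request carries SHA=MAC-C instead.
proxied = [(spa, "MAC-C") for spa, _ in unmodified]

relearns_unmodified = count_relearns(unmodified)
relearns_proxied = count_relearns(proxied)
```

Under these assumptions, the unmodified sequence forces the recipient to overwrite its entry on every request after the first, whereas the proxied sequence never does.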
As will be exemplified using
In the case of multi-site network extension in
Layer-2 Network Bridging
A first example relating to layer-2 bridging will be discussed with reference to
In the example in
(a) Address Resolution Request
In the example in
In response to detecting ping packet 510 via DR port “p1” 161, DR2 142 on host-A 110A may generate and broadcast an ARP request (see “Q2” 520 in
Referring also to
At 415 and 420 in
At 430 in
In practice, ARP requests and responses may include other fields that are not shown in
(b) Address Resolution Response
In response to detecting ARP request 530, physical server 102 on VLAN 10 may determine that TPA=IP-S matches with its IP address. As such, physical server 102 with MAC address=MAC-S responds with ARP response 540 (labelled “Q4”) specifying SHA=MAC-S, SPA=IP-S, THA=MAC-C (i.e., SHA in ARP request 530), TPA=IP-DR-p2 (i.e., SPA in ARP request 530).
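The field mapping between ARP request 530 and ARP response 540 may be sketched as follows. The `ArpMessage` type and `build_response` helper are hypothetical names, and the broadcast THA value in the request is an assumption for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArpMessage:
    op: str   # "request" or "response"
    sha: str  # source hardware address
    spa: str  # source protocol address
    tha: str  # target hardware address
    tpa: str  # target protocol address


def build_response(request, own_ip, own_mac):
    """Answer a request only if its TPA matches the recipient's own IP."""
    if request.op != "request" or request.tpa != own_ip:
        return None  # other recipients are not required to respond
    return ArpMessage(
        op="response",
        sha=own_mac,
        spa=own_ip,
        tha=request.sha,  # SHA in the request becomes THA
        tpa=request.spa,  # SPA in the request becomes TPA
    )


# "Q3": modified request as seen by physical server 102 (THA assumed).
q3 = ArpMessage("request", "MAC-C", "IP-DR-p2", "FF:FF:FF:FF:FF:FF", "IP-S")
q4 = build_response(q3, own_ip="IP-S", own_mac="MAC-S")
```

Here `q4` carries SHA=MAC-S, SPA=IP-S, THA=MAC-C and TPA=IP-DR-p2, matching the response fields listed above.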
At 440 in
At 450 and 455 in
(c) Address Resolution Suppression
By dynamically learning protocol-to-hardware address mapping information, subsequent ARP requests to resolve the same IP address may be suppressed to reduce the amount of broadcast traffic. In the example in
In response to DR port “p2” 151 of DR1 141 detecting ARP request 620, edge 120 determines that SPA=IP-DR-p2 is an IP address associated with DR port “p2” 151 at DR1 141. Edge 120 then performs a lookup to find TPA=IP-S in DR's neighbor cache 550. As such, edge 120 may suppress ARP request 620 (see 630 in
Multi-Site Network Extension
A second example relating to multi-site network extension will be discussed with reference to
Referring first to
In practice, hosts 110A-C may be located at geographically-dispersed sites, such as hosts 110A-B at a first site and host-C 110C at a second site. To facilitate communication between hosts 110A-C, first edge 120 may be deployed at the edge of the first site, and second edge 701 at the edge of the second site. First edge 120 and second edge 701 may communicate via any suitable tunnel, such as L2VPN tunnel 703. In practice, edge 120/701 may be any suitable network device that is implemented using one or more virtual machines (VMs) and/or physical machines (also known as “bare metal machines”) capable of performing functionalities of a switch, router, bridge, gateway, any combination thereof, etc. Through edge 120/701, an extended logical network with VNI=200 may be stretched across multiple sites.
(a) Address Resolution Request
In the example in
At host-A 110A, in response to detecting ping packet 710 via DR port “p1” 162 of DR2 142 on host-A 110A, ARP request 720 (labelled “Q2”) is generated and broadcasted in logical network with VNI=200. ARP request 720 specifies source information (SHA=MAC-A, SPA=IP-DR-p2) associated with DR port “p2” 152 at DR2 142, and (THA=FF:FF:FF:FF:FF:FF, TPA=IP-VM5). Note that SHA=MAC-A may represent a physical MAC address associated with hypervisor 114A that implements DR2 142 on host-A 110A. ARP request 720 is then broadcasted through overlay tunnels to all transport nodes (e.g., hosts) connected to VNI=200.
At first edge 120, in response to DR port “p2” 151 of DR1 141 detecting ARP request 720, it is observed that SPA=IP-DR-p2 is an IP address associated with DR port “p2” 151. As such, first edge 120 acts as an ARP proxy to intercept ARP request 720. Modified ARP request 730 (labelled “Q3”) specifying (SHA=MAC-C, SPA=IP-DR-p2) associated with DR port “p2” 151 at DR1 141 is then generated and broadcasted. This involves first edge 120 sending ARP request 730 to second edge 701 at the second site over L2VPN tunnel 703. Pending ARP request 720 will also be recorded. Note that SHA=MAC-C may be a physical MAC address associated with DR port “p2” 151 of DR1 141. See 405-435 in
At second edge 701, ARP request 730 that is injected into L2VPN tunnel 703 at the first site is processed accordingly. In this example, second edge 701 acting as an ARP proxy may intercept modified ARP request 730 (labelled “Q3”) and observe that SPA=IP-DR-p2 is an IP address associated with DR port “p2” 154. As such, second edge 701 may further modify the SHA field from MAC-C (i.e., physical MAC address of DR port “p2” 151 at first edge 120) to MAC-X (i.e., physical MAC address of DR port “p2” 154 at second edge 701). Modified ARP request 740 (labelled “Q4”) specifying SHA=MAC-X is then broadcasted.
At host-C 110C, modified ARP request 740 (labelled “Q4”) is further updated from SHA=MAC-X (i.e., physical MAC address of DR port “p2” 154 at second edge 701) to SHA=MAC-Y (i.e., virtual MAC address of DR port “p2” 155 at host-C 110C). The resulting modified ARP request 750 (labelled “Q5”) specifying SHA=MAC-Y is then forwarded towards VM5 135. This way, VM5 135 will only see virtual MAC address=MAC-Y associated with DR port “p2” 155 at host-C 110C (instead of MAC-X, MAC-C and MAC-A, which may be confusing).
(b) Address Resolution Response
In response to detecting modified ARP request 750 (labelled “Q5”), VM5 135 may determine that TPA=IP-VM5 matches with its IP address. As such, VM5 135 responds with ARP response 760 (labelled “Q6”) specifying SHA=MAC-VM5, SPA=IP-VM5, THA=MAC-Y (i.e., SHA in “Q5” 750), TPA=IP-DR-p2 (i.e., SPA in “Q5” 750). At DR 145, in response to DR port “p2” 155 detecting ARP response 760, THA=MAC-Y is replaced with THA=MAC-X, which is a physical MAC address of DR port “p2” 154 at second edge 701. See modified ARP response 770 (labelled “Q7”).
The MAC address transformation continues as ARP response 760 is sent towards second edge 701, first edge 120 and host-A 110A. In particular, at DR4 144 of second edge 701, THA=MAC-X is replaced with THA=MAC-C, which is a physical MAC address of DR port “p2” 151 of DR1 141 at first edge 120. See modified ARP response labelled “Q8” 780 sent towards first edge 120 over tunnel 703. Second edge 701 may also learn address mapping information (IP address=IP-VM5, MAC address=MAC-VM5) in cache 702.
At DR1 141 of first edge 120, THA=MAC-C is replaced with THA=MAC-A, which is a physical MAC address of DR port “p2” 152 of DR2 142 on host-A 110A. See modified ARP response labelled “Q9” 790 sent towards DR2 142. First edge 120 may also learn address mapping information (IP address=IP-VM5, MAC address=MAC-VM5) in cache 705.
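The hop-by-hop transformation of the SHA field in the request direction, and of the THA field in the response direction, may be sketched as a chain of proxies. The `DrPortProxy` class is hypothetical, and keying the pending-request record on (SPA, TPA) at every hop is an assumption; the description above only states that the pending request is recorded at first edge 120.

```python
class DrPortProxy:
    """Hypothetical ARP proxy at one logical DR port."""

    def __init__(self, own_mac):
        self.own_mac = own_mac
        self.pending = {}  # (SPA, TPA) of request -> SHA it arrived with

    def on_request(self, req):
        # Record the pending request, then substitute this port's MAC.
        self.pending[(req["spa"], req["tpa"])] = req["sha"]
        return {**req, "sha": self.own_mac}

    def on_response(self, resp):
        # In a response, SPA/TPA are swapped relative to the request.
        original_sha = self.pending.pop((resp["tpa"], resp["spa"]))
        return {**resp, "tha": original_sha}


# First edge 120 (MAC-C), second edge 701 (MAC-X), host-C 110C (MAC-Y).
chain = [DrPortProxy("MAC-C"), DrPortProxy("MAC-X"), DrPortProxy("MAC-Y")]

# "Q2" as broadcast from host-A 110A.
msg = {"sha": "MAC-A", "spa": "IP-DR-p2",
       "tha": "FF:FF:FF:FF:FF:FF", "tpa": "IP-VM5"}
request_shas = []
for proxy in chain:
    msg = proxy.on_request(msg)
    request_shas.append(msg["sha"])

# "Q6" as sent by VM5 135, answering the last SHA it saw.
resp = {"sha": "MAC-VM5", "spa": "IP-VM5",
        "tha": msg["sha"], "tpa": "IP-DR-p2"}
response_thas = []
for proxy in reversed(chain):
    resp = proxy.on_response(resp)
    response_thas.append(resp["tha"])
```

In this sketch, `request_shas` follows the sequence MAC-C, MAC-X, MAC-Y, while `response_thas` unwinds it as MAC-X, MAC-C, MAC-A, so the resolved SHA=MAC-VM5 reaches host-A 110A unchanged.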
At host-A 110A, in response to receiving ARP response 760 via DR port “p2” 152 at DR2 142, (IP-VM5=20.20.20.25, MAC-VM5) may be stored in an ARP table (not shown in
(c) Address Resolution Suppression
Based on cache 705 at first edge 120, subsequent ARP requests to resolve the same IP address may be suppressed to reduce the amount of broadcast traffic. Referring now to
In response to detecting ARP request 820, edge 120 determines that SPA=IP-DR-p2 is also the IP address of DR port “p2” 151 at DR1 141. Edge 120 then finds TPA=IP-VM5 in cache 705. As such, ARP request 820 may be suppressed (see 830). Next, ARP response 840 (labelled “Q3”) is generated and sent to host-B 110B in a unicast manner. ARP response 840 specifies (SHA=MAC-VM5, SPA=IP-VM5, THA=MAC-B, TPA=IP-DR-p2).
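The suppression decision at the edge may be sketched as follows. The function `handle_arp_request` and the action labels "forward", "proxy" and "suppress" are hypothetical; the handling of requests whose SPA does not match the DR port is an assumption made to keep the example complete.

```python
def handle_arp_request(req, dr_port_ip, cache):
    """Return (action, message) for an ARP request seen at the edge.

    `cache` maps learned protocol addresses to hardware addresses.
    """
    if req["spa"] != dr_port_ip:
        return ("forward", req)  # assumed: not a DR port request
    mac = cache.get(req["tpa"])
    if mac is None:
        return ("proxy", req)    # cache miss: request is proxied onward
    # Cache hit: suppress the broadcast and answer in a unicast manner.
    response = {
        "sha": mac,
        "spa": req["tpa"],
        "tha": req["sha"],
        "tpa": req["spa"],
    }
    return ("suppress", response)


cache_705 = {"IP-VM5": "MAC-VM5"}  # learned from the earlier response
request = {"sha": "MAC-B", "spa": "IP-DR-p2",
           "tha": "FF:FF:FF:FF:FF:FF", "tpa": "IP-VM5"}
action, reply = handle_arp_request(request, "IP-DR-p2", cache_705)
```

With the cache populated as shown, the request from host-B 110B is answered locally with (SHA=MAC-VM5, SPA=IP-VM5, THA=MAC-B, TPA=IP-DR-p2) rather than being broadcast again.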
According to examples of the present disclosure, a multi-level ARP proxy may be implemented using logical DR 140. In the examples in
Container Implementation
Although explained using VMs 131-135, it should be understood that SDN environment 100 may include other virtual workloads, such as containers, etc. As used herein, the term “container” (also known as “container instance”) is used generally to describe an application that is encapsulated with all its dependencies (e.g., binaries, libraries, etc.). In the examples in
Network Device
The above examples can be implemented by hardware (including hardware logic circuitry), software or firmware or a combination thereof. The above examples may be implemented by any suitable computing device, computer system, network device, etc. The network device may include processor(s), memory unit(s) and physical NIC(s) that may communicate with each other via a communication bus, etc. The network device may include a non-transitory computer-readable medium having stored thereon instructions or program code that, when executed by the processor, cause the processor to perform processes described herein with reference to
The techniques introduced above can be implemented in special-purpose hardwired circuitry, in software and/or firmware in conjunction with programmable circuitry, or in a combination thereof. Special-purpose hardwired circuitry may be in the form of, for example, one or more application-specific integrated circuits (ASICs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), and others. The term ‘processor’ is to be interpreted broadly to include a processing unit, ASIC, logic unit, or programmable gate array etc.
The foregoing detailed description has set forth various embodiments of the devices and/or processes via the use of block diagrams, flowcharts, and/or examples. Insofar as such block diagrams, flowcharts, and/or examples contain one or more functions and/or operations, it will be understood by those within the art that each function and/or operation within such block diagrams, flowcharts, or examples can be implemented, individually and/or collectively, by a wide range of hardware, software, firmware, or any combination thereof.
Those skilled in the art will recognize that some aspects of the embodiments disclosed herein, in whole or in part, can be equivalently implemented in integrated circuits, as one or more computer programs running on one or more computers (e.g., as one or more programs running on one or more computing systems), as one or more programs running on one or more processors (e.g., as one or more programs running on one or more microprocessors), as firmware, or as virtually any combination thereof, and that designing the circuitry and/or writing the code for the software and/or firmware would be well within the skill of one of skill in the art in light of this disclosure.
Software and/or other instructions to implement the techniques introduced here may be stored on a non-transitory computer-readable storage medium and may be executed by one or more general-purpose or special-purpose programmable microprocessors. A “computer-readable storage medium”, as the term is used herein, includes any mechanism that provides (i.e., stores and/or transmits) information in a form accessible by a machine (e.g., a computer, network device, personal digital assistant (PDA), mobile device, manufacturing tool, any device with a set of one or more processors, etc.). A computer-readable storage medium may include recordable/non-recordable media (e.g., read-only memory (ROM), random access memory (RAM), magnetic disk or optical storage media, flash memory devices, etc.).
The drawings are only illustrations of an example, wherein the units or procedure shown in the drawings are not necessarily essential for implementing the present disclosure. Those skilled in the art will understand that the units in the device in the examples can be arranged in the device in the examples as described, or can be alternatively located in one or more devices different from that in the examples. The units in the examples described can be combined into one module or further divided into a plurality of sub-units.
The present application is a continuation under 35 U.S.C. § 120 of U.S. patent application Ser. No. 16/507,045, filed on Jul. 10, 2019, and entitled “ADDRESS RESOLUTION HANDLING AT LOGICAL DISTRIBUTED ROUTERS,” now issued as U.S. Pat. No. 11,463,398, which is incorporated herein by reference in its entirety.
Number | Date | Country | |
---|---|---|---|
20220385621 A1 | Dec 2022 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 16507045 | Jul 2019 | US |
Child | 17877247 | US |