Edge node cluster network redundancy and fast convergence using an underlay anycast VTEP IP

Information

  • Patent Grant
  • Patent Number
    11,539,574
  • Date Filed
    Tuesday, September 24, 2019
  • Date Issued
    Tuesday, December 27, 2022
Abstract
Some embodiments provide a method for providing redundancy and fast convergence for modules operating in a network. The method configures the modules to use a same anycast inner IP address and a same anycast MAC address, and to associate with a same anycast VTEP IP address. In some embodiments, the modules operate in an active-active mode and all nodes running the modules advertise the anycast VTEP IP addresses with equal local preference. In some embodiments, the modules operate in active-standby mode and the node running the active module advertises the anycast VTEP IP address with higher local preference.
Description
BACKGROUND

Networks provide services at service nodes. Clusters of service nodes are often used to provide redundancy so that service is not interrupted upon the failure of a single service node. During a failover process, the process of switching from a failed node to a redundant node, packets destined for the failed node may be dropped or may not be provided with the service. Therefore, methods for providing redundancy while minimizing failover time are needed.


BRIEF SUMMARY

Some embodiments provide a method for providing redundancy and fast convergence to modules (e.g., service routers) executing in edge nodes. In some embodiments, the method is performed by a management plane that centrally manages the network (e.g., implemented in a network controller). The method, in some embodiments, configures a set of service routers executing in edge nodes to use a same anycast inner internet protocol (IP) address and a same anycast inner media access control (MAC) address. In some embodiments, the method configures the edge nodes on which the set of service routers execute to use a same set of anycast virtual extensible local area network tunnel endpoint (VTEP) IP addresses. The method in some embodiments configures the edge nodes to advertise the anycast inner IP and anycast inner MAC addresses as reachable through at least one anycast VTEP IP address.


In some embodiments, the method configures the service routers to act in active-standby mode in which one service router acts as an active service router and other service routers act as standby service routers in case the active service router is no longer accessible (e.g., the active service router fails or a connection to the active service router fails). In some embodiments, the method accomplishes this by configuring an edge node on which an active service router executes to advertise the anycast VTEP IP address with a higher local preference. In these embodiments, when the edge node fails, a switch connected to the edge node advertises that the anycast VTEP IP address is no longer reachable at the edge node. In other embodiments, the method configures the modules in active-active mode by configuring all edge nodes to advertise the anycast VTEP IP address with the same local preference.


In some embodiments, the method takes advantage of convergence of an underlay network to decrease failover times for redundant modules. Convergence time in the underlay network in some embodiments is based on link-failure detection protocols (e.g., bidirectional forwarding detection (BFD)) between the physical switches and the machines on which the modules execute (e.g., an edge node or the host on which an edge node executes). Such underlay network failure detection in most cases is much faster than software-based methods operating between modules (50 ms vs. 1 second). Faster detection and convergence times in the hardware allow for decreased failover time for the modules.


The preceding Summary is intended to serve as a brief introduction to some embodiments of the invention. It is not meant to be an introduction or overview of all inventive subject matter disclosed in this document. The Detailed Description that follows and the Drawings that are referred to in the Detailed Description will further describe the embodiments described in the Summary as well as other embodiments. Accordingly, to understand all the embodiments described by this document, a full review of the Summary, Detailed Description, and the Drawings is needed. Moreover, the claimed subject matters are not to be limited by the illustrative details in the Summary, Detailed Description, and the Drawings, but rather are to be defined by the appended claims, because the claimed subject matters can be embodied in other specific forms without departing from the spirit of the subject matters.





BRIEF DESCRIPTION OF THE DRAWINGS

The novel features of the invention are set forth in the appended claims. However, for purposes of explanation, several embodiments of the invention are set forth in the following figures.



FIG. 1 illustrates a network that includes service modules, edge nodes, leaf switches, and spine switches in which the invention operates.



FIG. 2 illustrates a set of service modules using a same set of anycast addresses in a system configured as in FIG. 1.



FIG. 3 illustrates a set of edge nodes implementing multiple pairs of service routers in active-standby mode using two anycast VTEP IP addresses.



FIG. 4 conceptually illustrates a process of some embodiments for configuring service modules to implement the invention.



FIG. 5 conceptually illustrates a process of some embodiments for implementing redundancy in case of service router failure.



FIG. 6 conceptually illustrates a process of some embodiments for implementing redundancy when a service router comes back online.



FIG. 7 conceptually illustrates an electronic system with which some embodiments of the invention are implemented.





DETAILED DESCRIPTION

In the following description, numerous details are set forth for the purpose of explanation. However, one of ordinary skill in the art will realize that the invention may be practiced without the use of these specific details. In other instances, well-known structures and devices are shown in block diagram form in order not to obscure the description of the invention with unnecessary detail.


Some embodiments provide a method for providing redundancy and fast convergence to modules (e.g., service routers) executing in edge nodes. In some embodiments, the method is performed by a management plane that centrally manages the network (e.g., implemented in a network controller). The method, in some embodiments, configures a set of service routers executing in edge nodes to use a same anycast inner internet protocol (IP) address and a same anycast inner media access control (MAC) address. In some embodiments, the method configures edge nodes on which the set of service routers are executing to use a same set of anycast virtual extensible local area network tunnel endpoint (VTEP) IP addresses. The method configures edge nodes to advertise the anycast inner IP and anycast inner MAC address as reachable through at least one anycast VTEP IP address in some embodiments.



FIG. 1 illustrates a network system 100 in which some embodiments of the invention are implemented. FIG. 1 includes a number of host machines 101A-E, virtual extensible local area network tunnel endpoints (VTEPs) 102A-C, edge nodes 105A-D, a service router 106, leaf switches 110A-D, hypervisor 115A, spine switches 120A-N, a data center fabric 130, and an external network 140. For simplicity, FIG. 1 only shows host machines 101A and 101B and edge node 105D with internal components, but one of ordinary skill in the art would understand that other host machines and edge nodes may contain similar, additional, or alternative elements.


Edge nodes 105A-D are connected to external network 140 and, by hosting service routers (e.g., service router 106), provide virtual machines or other data compute nodes connected to data center fabric 130 with access to external network 140. Service routers (SRs) may be implemented in a namespace, a virtual machine, or as a virtual routing and forwarding (VRF) module in different embodiments. Service routers provide routing services and, in some embodiments, a number of stateful services (e.g., firewall, NAT, etc.) or stateless services (e.g., access control lists (ACLs)). In different embodiments, edge nodes 105A-D may be implemented as virtual machines (sometimes referred to as Edge VMs), in other types of data compute nodes (e.g., namespaces, physical hosts, etc.), or by using the Data Plane Development Kit (DPDK) packet processing software (e.g., as a VRF in the DPDK-based datapath).


Edge nodes (e.g., edge node 105D) in some embodiments terminate tunnels (e.g., tunnels defined by a network manager). In some embodiments, some edge nodes (e.g., edge node 105C) make use of a VTEP of a host machine on which they execute while others implement their own VTEP when the edge node executes in a dedicated server. In some embodiments, edge nodes may be run on bare metal (e.g., directly on a server or host) or as a virtual machine form factor running on top of a hypervisor. One of ordinary skill in the art will understand that a network may include a number of edge nodes operating in any combination of the above modes.


Leaf physical switches 110C and 110D, in some embodiments, are part of data center fabric 130 and provide the VMs executing on host machines 101B-E access to spine switches 120A-N and—through leaf physical switches 110A and 110B and edge nodes 105A-D—to external network 140. Leaf switches in some embodiments may be implemented as physical top-of-rack switches. In some embodiments, leaf switches and spine switches run interior gateway protocols (IGPs) (e.g., open shortest path first (OSPF), routing information protocol (RIP), intermediate system to intermediate system (IS-IS), etc.) to direct packets along a shortest path to a packet destination.


Host machines 101B-E in some embodiments host multiple VMs that use the edge nodes and service routers to access external network 140. VMs or other DCNs may be run on top of a hypervisor executing a managed switching element (not shown) that implements a VTEP and a virtual distributed router (VDR) to allow for overlay network and logical packet processing.


One of ordinary skill in the art would understand that the underlying network structure may be implemented in any number of ways that are consistent with the spirit of the invention. The particular network structure should not be construed as limiting the invention but is used solely for illustrative purposes.



FIG. 2 illustrates an instance of a set of service routers 206A-B executing on edge nodes 205A-B. FIG. 2 also illustrates a distributed router (DR) that spans edge nodes 205A-B and hypervisor 215. Hypervisor 215 also runs virtual machine 216 and terminates a tunnel at VTEP 202C. FIG. 2 also shows central controller 250 configuring a default route for DR 201 on hypervisor 215 and configuring service routers 206A-B to use the anycast inner IP, MAC, and VTEP (outer) IP addresses. The DR, in some embodiments, spans managed forwarding elements (MFEs) that couple directly to VMs or other data compute nodes that are logically connected, directly or indirectly, to the logical router. The DR of some embodiments also spans the gateways to which the logical router is bound (e.g., edge nodes 205A-B). The DR is responsible for first-hop distributed routing between logical switches and/or other logical routers that are logically connected to the logical router. The SRs of some embodiments are responsible for delivering services that are not implemented in a distributed fashion (e.g., some stateful services).


Service routers 206A-B are shown connecting to logical switch 1. Service routers 206A-B in some embodiments are a subset of a set of SRs that provide access to external networks. Distributed router 201 connects to a plurality of logical switches (e.g., logical switches 1-N). Logical switches 2-N may be connected to VMs executing on any number of host machines. The VMs in some embodiments route traffic to an outside network through service routers 206A-B or a different set of SRs. Further details of possible configurations may be found in U.S. Non-Provisional patent application Ser. No. 14/814,473, published as United States Patent Publication 2016/0226754, which is hereby incorporated by reference.


As shown, SRs 206A-B are configured to use a same anycast inner IP address, anycast VTEP (outer) IP address, and anycast media access control (MAC) address. Anycast addressing allows a same address to be used for multiple destinations (in some embodiments of this invention, the multiple destinations are redundant destinations). A packet sent to an anycast address is forwarded to a nearest node (also referred to as a closest node or a node along a shortest path) according to an IGP (e.g., OSPF, RIP, IS-IS, etc.). The nearest node along a route, in some embodiments, is calculated based on administrative distance values, which determine route priority, with larger values indicating lower-priority route types.
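The nearest-node selection just described can be sketched as follows. The route records, field names, and node names below are hypothetical illustrations, not structures from the patent:

```python
# Sketch of anycast nearest-node selection: among routes advertising the same
# anycast address, prefer the lowest administrative distance, then the lowest
# IGP path cost. Route records and field names are illustrative only.

def select_next_hop(routes):
    """Pick the best route for an anycast destination.

    Each route is a dict with 'node', 'admin_distance' (lower = preferred),
    and 'path_cost' (IGP metric; lower = nearer).
    """
    if not routes:
        return None
    best = min(routes, key=lambda r: (r["admin_distance"], r["path_cost"]))
    return best["node"]

routes = [
    {"node": "edge-1", "admin_distance": 10, "path_cost": 20},
    {"node": "edge-2", "admin_distance": 10, "path_cost": 30},
]
# With equal administrative distance, the lower-cost path to edge-1 wins.
```

The tuple key mirrors the text: administrative distance determines priority between route types, and only among equal-priority routes does the IGP shortest path decide.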


Service routers 206A-B may be implemented in active-active mode or active-standby mode. In active-active mode, SRs are treated as equals for routing purposes (i.e., advertised with a same preference or administrative distance). Packet flows in active-active mode are directed to a particular SR based on some set of criteria (e.g., load-balancing criteria, equal-cost multi-pathing (ECMP), an anycast routing protocol, etc.). Failure of an SR in active-active mode in some embodiments is detected by a bidirectional forwarding detection (BFD) session running between an edge node on which the SR executes and a switch to which the edge node is connected. After detection of the failure, the switch no longer advertises the availability of the service router at the edge node for which the connection failed, and the underlay network converges on the remaining SRs as the shortest or lowest-cost path to the anycast IP address. The system thus achieves redundancy and fast convergence by using an IGP and a same anycast address for all SRs. The specific steps are further discussed in relation to FIGS. 4-6.


In some embodiments of the invention, SRs 206A-B are implemented in active-standby mode. In active-standby mode, one service router in a set of service routers is configured to act as the active service router to which traffic destined for the set of service routers is directed. Such an active-standby mode may be useful when service routers provide stateful services that require a single service router to provide a set of services for each packet flow. In some situations, the active service router in such an active-standby mode maintains the state of all the flows. In some embodiments, state information is periodically pushed to (or pulled by) the standby service routers so that they can provide the set of services if the active service router fails. In active-standby mode, the edge node hosting the active SR advertises its anycast VTEP IP address with a higher preference (e.g., lower administrative distance) than the edge node hosting the standby SR, such that the active SR is always the “nearest” SR when both SRs are available.
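The active-standby preference mechanism can be sketched as a small routing-table model: the active node advertises with higher local preference, and withdrawing its advertisement converges traffic onto the standby. The class, node names, and preference values are invented for illustration:

```python
# Illustrative active-standby selection for a single anycast VTEP IP: the edge
# node hosting the active SR advertises with higher local preference; when its
# advertisement is withdrawn, the standby becomes best. Names are hypothetical.

class AnycastRib:
    def __init__(self):
        self.advertisements = {}  # edge node -> local preference

    def advertise(self, node, local_pref):
        self.advertisements[node] = local_pref

    def withdraw(self, node):
        self.advertisements.pop(node, None)

    def best(self):
        """Return the edge node with the highest local preference, if any."""
        if not self.advertisements:
            return None
        return max(self.advertisements, key=self.advertisements.get)

rib = AnycastRib()
rib.advertise("edge-active", 200)   # hosts the active SR: higher preference
rib.advertise("edge-standby", 100)  # hosts the standby SR
```

While both advertisements are present, `best()` always returns the active node; failover requires only the withdrawal of one advertisement, with no coordination between the SRs themselves.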


In order to provide redundancy with fast convergence, some embodiments use bidirectional forwarding detection (BFD) or similar protocols to monitor whether connections between leaf switches and edge nodes (and ultimately to the service routers) are functioning. Such protocols can have very short detection times (e.g., 50 ms). If a connection to an edge node hosting an active SR fails, routing information for the anycast addresses used by the failed SR converges on the standby SR (as the available SR with the anycast addresses) at the same rate as the underlying network convergence, which in some embodiments is considerably faster than methods that rely on communication between the SRs. In some embodiments, the update is based on a notification from the BFD session.


This fast convergence can be contrasted with a process that relies on BFD sessions that run between SRs. BFD sessions that run between SRs to detect SR failure, in both active-active and active-standby mode, send packets less aggressively in order to avoid a false positive (i.e., detecting a failure that has not happened) based on a link failure in the underlay network that is subsequently corrected within an acceptable time. Even after a BFD session detects an actual failure, the SR must perform software processes to determine the appropriate action to take and then must send out a gratuitous address resolution protocol (GARP) packet that alerts all the servers and hypervisors of the new association of the MAC address of the failed SR with the IP address of the alternative SR. This process can take ten seconds or more because of the large number (hundreds or even thousands) of servers attached to a particular leaf physical switch. The method using underlay network convergence, by contrast, relies only on advertising the updated anycast address availability to the physical switches in the DC fabric.


One of ordinary skill in the art would understand that this method could be used to provide redundancy with fast convergence time to other types of module clusters that provide stateful or stateless services and is not limited to service routers in edge nodes.


Distributed router 201 is illustrated as spanning edge nodes 205A-B and as an element of hypervisor 215; however, as described above, the DR (or virtual DR (VDR)) is a logical router that is implemented by managed forwarding elements executing on hosts that are not shown in FIG. 2. In some embodiments, DR 201 is configured to use as its default gateway the anycast inner IP address of the service routers 206A-B. In these embodiments, the anycast inner IP address is associated with the anycast MAC address, and the anycast MAC address is associated with the anycast VTEP (outer) IP address. Configuring DR 201 to send packets to the anycast VTEP allows the underlying network to calculate the nearest node once the packet reaches leaf switch 210C, as discussed above.
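The DR's default-route association chain—anycast inner IP to anycast MAC to anycast VTEP (outer) IP—can be sketched as two lookups. The table names and all addresses below are illustrative placeholders, not values from the patent:

```python
# Sketch of the DR's association chain: anycast inner IP -> anycast MAC ->
# anycast VTEP (outer) IP. Tables and addresses are illustrative only.

ARP_TABLE = {"169.254.0.1": "01:23:45:67:89:ab"}       # inner IP -> anycast MAC
MAC_VTEP_TABLE = {"01:23:45:67:89:ab": "192.0.2.100"}  # anycast MAC -> anycast VTEP IP

def resolve_outer_destination(default_gateway_ip):
    """Resolve the outer (VTEP) IP the DR uses to tunnel a default-routed packet."""
    mac = ARP_TABLE[default_gateway_ip]     # inner IP resolves to the anycast MAC
    return MAC_VTEP_TABLE[mac]              # anycast MAC resolves to the anycast VTEP IP
```

Because every SR in the set shares all three addresses, the DR's tables never change on failover; only the underlay's notion of the nearest holder of the anycast VTEP IP changes.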


VTEPs 202A-B are depicted as being part of pNICs 203A-B, respectively, because VTEPs advertise their availability on the IP address of the pNIC for the host machine. However, it is to be understood that in reality a VTEP is a function provided by a hypervisor or a managed forwarding element on a hypervisor. Additionally, the VTEPs are depicted as being connected to logical switch 1 to demonstrate that packets destined for the service routers on logical switch 1 are reachable through the VTEP executing on the same machine; the VTEPs are also logically connected to any logical switch with DCNs running on the same machine or hypervisor.



FIG. 3 illustrates a configuration of edge nodes 305A-B on which multiple pairs of SRs execute. The use of two anycast VTEP IP addresses in some embodiments allows a single edge node to execute some SRs as active SRs while other SRs are standby SRs. As in the example above, SR pairs in active-standby mode share anycast inner IP and MAC addresses (shown for SR 1 and SR 3) as well as an anycast VTEP IP address. As shown, SR pairs that have the active SR on edge node 305A share VTEP1 as their anycast VTEP IP, while SR pairs that have the active SR on edge node 305B share VTEP2 as their anycast VTEP IP. Such a configuration allows edge node 305A to advertise VTEP1 with higher preference and edge node 305B to advertise VTEP2 with higher preference, such that the active SRs receive the traffic on both edge nodes.
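The two-VTEP arrangement of FIG. 3 can be sketched as a mapping from SR pairs to their active node and shared VTEP, from which each edge node's per-VTEP advertisement preferences follow. The pair names, node names, and preference values are hypothetical:

```python
# Illustrative model of FIG. 3: SR pairs whose active member runs on edge-A
# share VTEP1; pairs whose active member runs on edge-B share VTEP2. Each edge
# node advertises "its" VTEP with higher preference. All names are invented.

SR_PAIRS = {
    "SR1/SR2": {"active_node": "edge-A", "vtep": "VTEP1"},
    "SR3/SR4": {"active_node": "edge-B", "vtep": "VTEP2"},
}

def preferences_for(node):
    """Local preference a given edge node uses for each anycast VTEP IP."""
    prefs = {}
    for pair in SR_PAIRS.values():
        # Higher preference where this node hosts the pair's active SR.
        prefs[pair["vtep"]] = 200 if pair["active_node"] == node else 100
    return prefs
```

With these preferences, traffic for VTEP1 converges on edge-A and traffic for VTEP2 on edge-B while both nodes are healthy, so both nodes carry active traffic simultaneously.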



FIG. 4 conceptually illustrates a process 400 that implements the novel method of some embodiments of the invention. In some embodiments, the process 400 is performed by a central controller or central controller cluster that manages forwarding elements on different hosts to implement logical networks and distributed routers. The controller performs this process in some embodiments upon an initial configuration of service routers and in other embodiments in response to a change in the network settings.


As shown, process 400 begins when a controller configures (at 410) a set of service routers to use a same anycast inner IP address. The anycast inner IP address is found in the inner packet header that is encapsulated according to a tunneling protocol (e.g., GRE, VXLAN, etc.). The process 400 then configures (at 420) the set of service routers to use a same anycast MAC address.
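The inner/outer header relationship can be sketched as a simple encapsulation model. The dict layout and every address below are illustrative and do not represent an actual VXLAN wire format:

```python
# Illustrative encapsulation model: the anycast inner IP and MAC live in the
# inner headers; the anycast VTEP IP is the outer destination, which the
# underlay uses to pick the nearest edge node. This dict is a sketch, not the
# real VXLAN frame layout; all addresses are examples.

def encapsulate(inner_packet, src_vtep_ip, dst_vtep_ip, vni):
    """Wrap an inner packet in outer (underlay) headers for tunneling."""
    return {
        "outer_src_ip": src_vtep_ip,
        "outer_dst_ip": dst_vtep_ip,  # anycast VTEP IP: underlay routes to nearest holder
        "vni": vni,                   # overlay segment identifier
        "payload": inner_packet,      # inner headers carry the anycast inner IP/MAC
    }

inner = {"dst_mac": "01:23:45:67:89:ab", "dst_ip": "169.254.0.1", "data": b"..."}
frame = encapsulate(inner, "192.0.2.7", "192.0.2.100", vni=5001)
```

The point of the split is that the anycast inner addresses never appear to the underlay; only the outer VTEP IP is routed by the physical switches.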


Process 400 continues by configuring (at 430) a set of edge nodes to use a same anycast VTEP (outer) IP address. One of ordinary skill in the art will appreciate that a VTEP IP address is just one example of an outer IP address that may be used in a tunneling protocol and that other outer IP addresses would function in similar manners. It is to be understood that the steps 410-430 may be performed in any order and that the separate steps are not dependent on one another.


The process determines (at 440) whether the service routers are to be configured in active-standby mode or in active-active mode. If the service routers are to be configured in active-standby mode, the process configures (at 450) the edge node on which the active service router executes to advertise the VTEP IP address with higher preference (e.g., lower administrative distance). If the process determines (at 440) that the service routers should be configured in active-active mode, the process configures (at 455) the edge nodes to advertise the VTEP IP address with a same preference (e.g., a same administrative distance).


In both active-active and active-standby modes the process configures (at 460) distributed routers to use the anycast VTEP IP address to send outbound packets. As noted above in the discussion of FIG. 2, in some embodiments the process configures the default route of the DR to direct packets to the anycast inner IP address by associating the anycast inner IP address with the anycast MAC address and anycast VTEP IP address used by the service router.
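The steps of process 400 can be condensed into a single configuration sketch. The data structures, field names, and example addresses are invented for illustration and do not correspond to any actual controller API:

```python
# Condensed sketch of process 400 (steps 410-460). Structures, field names,
# and addresses are illustrative only.

def configure(anycast, srs, edge_nodes, drs, mode="active-standby"):
    for sr in srs:  # 410, 420: same anycast inner IP and MAC on every SR
        sr["inner_ip"] = anycast["inner_ip"]
        sr["mac"] = anycast["mac"]
    for node in edge_nodes:  # 430: same anycast VTEP (outer) IP on every edge node
        node["vtep_ip"] = anycast["vtep_ip"]
        if mode == "active-standby":  # 440/450: active node advertises higher preference
            node["local_pref"] = 200 if node.get("hosts_active_sr") else 100
        else:  # 440/455: active-active advertises with equal preference
            node["local_pref"] = 100
    for dr in drs:  # 460: DR default route points at the anycast inner IP
        dr["default_gateway"] = anycast["inner_ip"]

anycast = {"inner_ip": "169.254.0.1", "mac": "01:23:45:67:89:ab",
           "vtep_ip": "192.0.2.100"}
srs = [{}, {}]
edges = [{"hosts_active_sr": True}, {"hosts_active_sr": False}]
drs = [{}]
configure(anycast, srs, edges, drs, mode="active-standby")
```

As the text notes, steps 410-430 are order-independent; only the preference assignment branches on the determination at 440.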



FIG. 5 conceptually illustrates a process 500 that implements the novel method of some embodiments of the invention. The process 500 in some embodiments is implemented by a leaf switch that is connected to an edge node on which an active service router is executing. The process allows fast detection of service-router unavailability (e.g., link between leaf switch and edge node fails, failure of edge node, etc.). Process 500 assumes that a bidirectional forwarding detection (BFD) or similar protocol for monitoring the status of the connection between the leaf switch and an edge node hosting a service router or routers has been established.


Process 500 begins by detecting (at 510) that a connection between a switch and an edge node has failed. In some embodiments, the connection is between a leaf switch and the edge node executing the active service router. One of ordinary skill in the art will recognize that the detection could be placed along any link that would leave the service router unavailable to other machines on the network.


Process 500 continues by having the switch remove (at 520) the association between the edge node and the anycast VTEP IP address from the forwarding table of the switch (e.g., based on a routing protocol). The switch no longer forwards packets with the anycast VTEP IP address to the edge node and the process proceeds to step 530.


Process 500 then determines (at 530) whether the anycast VTEP IP is still accessible on other machines (edge nodes) connected to the switch. If no other edge nodes connected to the switch advertise the availability of the anycast VTEP IP, then the process advertises (at 540) that the anycast VTEP IP is no longer available through the switch. Once this information is propagated through the DC fabric using a dynamic routing protocol, the underlying network begins sending packets to the next available service router (e.g., the standby router) and the process ends. If there are still edge nodes connected to the switch advertising the availability of the anycast VTEP IP, the switch does not need to advertise that the anycast VTEP IP address is not available through the switch and the process ends.
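The switch-side logic of process 500 can be sketched as a single function over a forwarding table. The table shape, node names, and address are illustrative placeholders:

```python
# Sketch of process 500 from the leaf switch's perspective: react to a failed
# link to an edge node. Returns True when the switch must also withdraw the
# anycast VTEP route upstream (step 540), i.e., no other locally connected
# edge node still advertises it. Structures and names are illustrative.

def handle_link_failure(forwarding_table, vtep_ip, failed_node):
    nodes = forwarding_table.get(vtep_ip, set())
    nodes.discard(failed_node)            # 520: remove the failed association
    if nodes:                             # 530: VTEP still reachable locally
        return False
    forwarding_table.pop(vtep_ip, None)
    return True                           # 540: advertise VTEP as unreachable
```

The conditional withdrawal at 530/540 is what keeps the update local when redundancy exists behind the same switch: the DC fabric only learns of a change when the last local holder of the anycast VTEP IP disappears.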



FIG. 6 conceptually illustrates a process 600 that allows a previously active service router to come back online as the active service router when availability is restored. The process 600 in some embodiments is implemented by a leaf switch that is connected to an edge node on which a formerly-active service router executes.


Process 600 begins by detecting (at 610) that a connection between a switch and an edge node has been restored. In some embodiments, the connection is between a leaf switch and the edge node executing the formerly-active service router. One of ordinary skill in the art will recognize that the detection could be placed along any link that would leave the service router unavailable to other machines on the network.


Process 600 continues by having the switch add (at 620) the association between the edge node and the anycast VTEP IP address to the forwarding table of the switch (e.g., based on a routing protocol). The addition in some embodiments is based on receiving an advertisement from the edge node that the anycast VTEP IP is available over the restored link. In some embodiments, the switch now forwards packets with the anycast VTEP IP address to the restored edge node and the process proceeds to step 630.


Process 600 then determines (at 630) whether the anycast VTEP IP was accessible on other machines (edge nodes) connected to the switch before the restoration of the connection. If no other edge nodes connected to the switch advertise the availability of the anycast VTEP IP, then the process advertises (at 640) that the anycast VTEP IP is now available through the switch. Once this information is propagated through the DC fabric using a dynamic routing protocol, the underlying network begins sending packets to the restored service router (e.g., the failed and restored active service router) and the process ends. If there were still edge nodes connected to the switch advertising the availability of the anycast VTEP IP, the switch does not need to advertise that the anycast VTEP IP address is now available through the switch and the process ends. In some embodiments, the edge node with the restored connection advertises the anycast VTEP IP address with higher preference as it had been doing before the failure and restoration.
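Process 600 mirrors the failure case and can be sketched the same way. Again, the table shape and names are illustrative placeholders:

```python
# Sketch of process 600: a leaf switch restoring a previously failed edge node.
# Returns True when the switch must re-advertise the anycast VTEP IP upstream
# (step 640), i.e., no other local edge node was advertising it before the
# restoration. Structures and names are illustrative.

def handle_link_restoration(forwarding_table, vtep_ip, restored_node):
    nodes = forwarding_table.setdefault(vtep_ip, set())
    newly_reachable = not nodes           # 630: was anyone else advertising it?
    nodes.add(restored_node)              # 620: restore the association
    return newly_reachable                # 640: advertise only if newly reachable
```

Since the restored node resumes advertising with its pre-failure (higher) preference, the underlay converges back onto it as the active SR without any SR-to-SR signaling.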


Electronic System


Many of the above-described features and applications are implemented as software processes that are specified as a set of instructions recorded on a computer readable storage medium (also referred to as computer readable medium). When these instructions are executed by one or more processing unit(s) (e.g., one or more processors, cores of processors, or other processing units), they cause the processing unit(s) to perform the actions indicated in the instructions. Examples of computer readable media include, but are not limited to, CD-ROMs, flash drives, RAM chips, hard drives, EPROMs, etc. The computer readable media does not include carrier waves and electronic signals passing wirelessly or over wired connections.


In this specification, the term “software” is meant to include firmware residing in read-only memory or applications stored in magnetic storage, which can be read into memory for processing by a processor. Also, in some embodiments, multiple software inventions can be implemented as sub-parts of a larger program while remaining distinct software inventions. In some embodiments, multiple software inventions can also be implemented as separate programs. Finally, any combination of separate programs that together implement a software invention described here is within the scope of the invention. In some embodiments, the software programs, when installed to operate on one or more electronic systems, define one or more specific machine implementations that execute and perform the operations of the software programs.



FIG. 7 conceptually illustrates an electronic system 700 with which some embodiments of the invention are implemented. The electronic system 700 can be used to execute any of the control, virtualization, or operating system applications described above. The electronic system 700 may be a computer (e.g., a desktop computer, personal computer, tablet computer, server computer, mainframe, a blade computer, etc.), phone, PDA, or any other sort of electronic device. Such an electronic system includes various types of computer readable media and interfaces for various other types of computer readable media. Electronic system 700 includes a bus 705, processing unit(s) 710, a system memory 725, a read-only memory 730, a permanent storage device 735, input devices 740, and output devices 745.


The bus 705 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the electronic system 700. For instance, the bus 705 communicatively connects the processing unit(s) 710 with the read-only memory 730, the system memory 725, and the permanent storage device 735.


From these various memory units, the processing unit(s) 710 retrieve instructions to execute and data to process in order to execute the processes of the invention. The processing unit(s) may be a single processor or a multi-core processor in different embodiments.


The read-only-memory (ROM) 730 stores static data and instructions that are needed by the processing unit(s) 710 and other modules of the electronic system. The permanent storage device 735, on the other hand, is a read-and-write memory device. This device is a non-volatile memory unit that stores instructions and data even when the electronic system 700 is off. Some embodiments of the invention use a mass-storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 735.


Other embodiments use a removable storage device (such as a floppy disk, flash drive, etc.) as the permanent storage device. Like the permanent storage device 735, the system memory 725 is a read-and-write memory device. However, unlike storage device 735, the system memory is a volatile read-and-write memory, such as random access memory. The system memory stores some of the instructions and data that the processor needs at runtime. In some embodiments, the invention's processes are stored in the system memory 725, the permanent storage device 735, and/or the read-only memory 730. From these various memory units, the processing unit(s) 710 retrieve instructions to execute and data to process in order to execute the processes of some embodiments.


The bus 705 also connects to the input and output devices 740 and 745. The input devices enable the user to communicate information and select commands to the electronic system. The input devices 740 include alphanumeric keyboards and pointing devices (also called “cursor control devices”). The output devices 745 display images generated by the electronic system. The output devices include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD). Some embodiments include devices such as a touchscreen that function as both input and output devices.


Finally, as shown in FIG. 7, bus 705 also couples electronic system 700 to a network 765 through a network adapter (not shown). In this manner, the computer can be a part of a network of computers (such as a local area network (“LAN”), a wide area network (“WAN”), or an Intranet), or a network of networks, such as the Internet. Any or all components of electronic system 700 may be used in conjunction with the invention.


Some embodiments include electronic components, such as microprocessors, storage and memory that store computer program instructions in a machine-readable or computer-readable medium (alternatively referred to as computer-readable storage media, machine-readable media, or machine-readable storage media). Some examples of such computer-readable media include RAM, ROM, read-only compact discs (CD-ROM), recordable compact discs (CD-R), rewritable compact discs (CD-RW), read-only digital versatile discs (e.g., DVD-ROM, dual-layer DVD-ROM), a variety of recordable/rewritable DVDs (e.g., DVD-RAM, DVD-RW, DVD+RW, etc.), flash memory (e.g., SD cards, mini-SD cards, micro-SD cards, etc.), magnetic and/or solid state hard drives, read-only and recordable Blu-Ray® discs, ultra density optical discs, any other optical or magnetic media, and floppy disks. The computer-readable media may store a computer program that is executable by at least one processing unit and includes sets of instructions for performing various operations. Examples of computer programs or computer code include machine code, such as is produced by a compiler, and files including higher-level code that are executed by a computer, an electronic component, or a microprocessor using an interpreter.


While the above discussion primarily refers to microprocessor or multi-core processors that execute software, some embodiments are performed by one or more integrated circuits, such as application specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs). In some embodiments, such integrated circuits execute instructions that are stored on the circuit itself.


As used in this specification, the terms “computer”, “server”, “processor”, and “memory” all refer to electronic or other technological devices. These terms exclude people or groups of people. For the purposes of this specification, the terms “display” or “displaying” mean displaying on an electronic device. As used in this specification, the terms “computer readable medium,” “computer readable media,” and “machine readable medium” are entirely restricted to tangible, physical objects that store information in a form that is readable by a computer. These terms exclude any wireless signals, wired download signals, and any other ephemeral signals.


This specification refers throughout to computational and network environments that include virtual machines (VMs). However, virtual machines are merely one example of data compute nodes (DCNs) or data compute end nodes, also referred to as addressable nodes. DCNs may include non-virtualized physical hosts, virtual machines, containers that run on top of a host operating system without the need for a hypervisor or separate operating system, and hypervisor kernel network interface modules.


VMs, in some embodiments, operate with their own guest operating systems on a host using resources of the host virtualized by virtualization software (e.g., a hypervisor, virtual machine monitor, etc.). The tenant (i.e., the owner of the VM) can choose which applications to operate on top of the guest operating system. Some containers, on the other hand, are constructs that run on top of a host operating system without the need for a hypervisor or separate guest operating system. In some embodiments, the host operating system uses name spaces to isolate the containers from each other and therefore provides operating-system level segregation of the different groups of applications that operate within different containers. This segregation is akin to the VM segregation that is offered in hypervisor-virtualized environments that virtualize system hardware, and thus can be viewed as a form of virtualization that isolates different groups of applications that operate in different containers. Such containers are more lightweight than VMs.


A hypervisor kernel network interface module, in some embodiments, is a non-VM DCN that includes a network stack with a hypervisor kernel network interface and receive/transmit threads. One example of a hypervisor kernel network interface module is the vmknic module that is part of the ESXi™ hypervisor of VMware, Inc.


It should be understood that while the specification refers to VMs, the examples given could be any type of DCNs, including physical hosts, VMs, non-VM containers, and hypervisor kernel network interface modules. In fact, the example networks could include combinations of different types of DCNs in some embodiments.


While the invention has been described with reference to numerous specific details, one of ordinary skill in the art will recognize that the invention can be embodied in other specific forms without departing from the spirit of the invention. In addition, a number of the figures (including FIGS. 4-6) conceptually illustrate processes. The specific operations of these processes may not be performed in the exact order shown and described. The specific operations may not be performed in one continuous series of operations, and different specific operations may be performed in different embodiments. Furthermore, the process could be implemented using several sub-processes, or as part of a larger macro process. Thus, one of ordinary skill in the art would understand that the invention is not to be limited by the foregoing illustrative details, but rather is to be defined by the appended claims.

Claims
  • 1. A method for configuring a plurality of host computers to implement a logical network in a datacenter, the logical network comprising a centralized logical router and a distributed logical router, the method comprising: configuring each edge node of a set of edge nodes that connect the datacenter to external networks to implement a centralized logical router, wherein each of the centralized logical routers uses a same first logical anycast network address; configuring (i) each of the edge nodes to use a same second physical anycast tunnel endpoint network address for a tunnel endpoint network address of the edge node and (ii) a particular edge node on which an active centralized logical router executes to advertise the second physical anycast tunnel endpoint network address to datacenter forwarding elements using a higher preference than other edge nodes in the set of edge nodes; and configuring each of a set of managed forwarding elements that execute on host computers to implement the distributed logical router (1) to use the first logical anycast network address of the centralized logical routers as a default gateway for the distributed logical router and (2) to send data messages that need to reach the centralized logical routers implemented by the edge node to the datacenter forwarding elements using the second physical anycast tunnel endpoint network address, the datacenter forwarding elements routing the data messages to the particular edge node.
  • 2. The method of claim 1, wherein the active centralized logical router is a first centralized logical router and the particular edge node is a first edge node, wherein upon failure of a connection to the first centralized logical router the datacenter forwarding elements route the data messages for the centralized logical routers to a second edge node on which a second centralized logical router executes.
  • 3. The method of claim 2, wherein the failure of the connection to the first centralized logical router is detected by a fault detection protocol session between the first edge node and one of the datacenter forwarding elements.
  • 4. The method of claim 2, wherein upon failure of the connection to the first centralized logical router, a datacenter forwarding element that detects the connection advertises the second physical anycast tunnel endpoint network address as unavailable at the first edge node.
  • 5. The method of claim 1 further comprising configuring the managed forwarding elements implementing the distributed logical router to associate data messages routed to the first logical anycast network address with the second physical anycast tunnel endpoint network address.
  • 6. The method of claim 1, wherein: the centralized logical routers that use the same first logical anycast network address are a first set of centralized logical routers; the particular edge node implements the active centralized logical router for the first set of centralized logical routers and an active centralized logical router for a second set of centralized logical routers; and the first logical anycast network address used by the first set of centralized logical routers and a third anycast network address associated with the second set of centralized logical routers are both associated with the second physical anycast tunnel endpoint network address.
  • 7. The method of claim 6, wherein: the particular edge node implements a standby centralized logical router for a third set of centralized logical routers that use a fourth anycast network address; the particular edge node uses both the second physical anycast tunnel endpoint network address and a fifth anycast tunnel endpoint network address; and the fourth anycast network address is associated with the fifth anycast tunnel endpoint network address.
  • 8. A non-transitory machine readable medium storing a program which when executed by at least one processing unit configures a plurality of host computers to implement a logical network in a datacenter, the program comprising sets of instructions for: configuring each edge node of a set of edge nodes that connect the datacenter to external networks to implement a centralized logical router, wherein each of the centralized logical routers uses a same first logical anycast network address; configuring (i) each of the edge nodes to use a same second physical anycast tunnel endpoint network address for a tunnel endpoint network address of the edge node and (ii) a particular edge node on which an active centralized logical router executes to advertise the second physical anycast tunnel endpoint network address to datacenter forwarding elements using a higher preference than other edge nodes in the set of edge nodes; and configuring each of a set of managed forwarding elements that execute on host computers to implement the distributed logical router (1) to use the first logical anycast network address of the centralized logical routers as a default gateway for the distributed logical router and (2) to send data messages that need to reach the centralized logical routers implemented by the edge node to the datacenter forwarding elements using the second physical anycast tunnel endpoint network address, the datacenter forwarding elements routing the data messages to the particular edge node.
  • 9. The non-transitory machine readable medium of claim 8, wherein the active centralized logical router is a first centralized logical router and the particular edge node is a first edge node, wherein upon failure of a connection to the first centralized logical router the datacenter forwarding elements route the data messages for the centralized logical routers to a second edge node on which a second centralized logical router executes.
  • 10. The non-transitory machine readable medium of claim 9, wherein the failure of the connection to the first centralized logical router is detected by a fault detection protocol session between the first edge node and one of the datacenter forwarding elements.
  • 11. The non-transitory machine readable medium of claim 9, wherein upon failure of the connection to the first centralized logical router, a datacenter forwarding element that detects the connection advertises the second physical anycast tunnel endpoint network address as unavailable at the first edge node.
  • 12. The non-transitory machine readable medium of claim 8, wherein the program further comprises a set of instructions for configuring the managed forwarding elements implementing the distributed logical router to associate data messages routed to the first logical anycast network address with the second physical anycast tunnel endpoint network address.
  • 13. The non-transitory machine readable medium of claim 8, wherein: the centralized logical routers that use the same first logical anycast network address are a first set of centralized logical routers; the particular edge node implements the active centralized logical router for the first set of centralized logical routers and an active centralized logical router for a second set of centralized logical routers; and the first logical anycast network address used by the first set of centralized logical routers and a third anycast network address associated with the second set of centralized logical routers are both associated with the second physical anycast tunnel endpoint network address.
  • 14. The non-transitory machine readable medium of claim 13, wherein: the particular edge node implements a standby centralized logical router for a third set of centralized logical routers that use a fourth anycast network address; the particular edge node uses both the second physical anycast tunnel endpoint network address and a fifth anycast tunnel endpoint network address; and the fourth anycast network address is associated with the fifth anycast tunnel endpoint network address.
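The active-standby behavior recited in claims 1-4 can be sketched as a toy model: every edge node shares the same anycast VTEP address, the active node advertises it to the fabric with a higher local preference, and on failure (e.g., detected by a fault detection protocol session such as BFD) the fabric converges to the standby node without any change visible to the hosts. All class and variable names below (EdgeNode, Fabric, the addresses) are illustrative assumptions, not taken from the patent.

```python
# Toy model of active-standby anycast VTEP advertisement and failover.
# Illustrative only; not the actual edge-node or fabric implementation.

class EdgeNode:
    """An edge node running a centralized logical router (service router)."""
    def __init__(self, name, anycast_vtep_ip, local_pref):
        self.name = name
        self.anycast_vtep_ip = anycast_vtep_ip
        self.local_pref = local_pref  # active node advertises a higher preference
        self.up = True                # toggled by the fault detection session

class Fabric:
    """Datacenter forwarding elements: deliver traffic for the anycast VTEP
    IP to the live edge node advertising it with the highest preference."""
    def __init__(self, edge_nodes):
        self.edge_nodes = edge_nodes

    def next_hop(self, vtep_ip):
        candidates = [n for n in self.edge_nodes
                      if n.up and n.anycast_vtep_ip == vtep_ip]
        # BGP-style best-path selection on local preference
        return max(candidates, key=lambda n: n.local_pref, default=None)

# Both edge nodes use the same anycast VTEP IP; only the preference differs.
active = EdgeNode("edge-1", "10.0.0.100", local_pref=200)
standby = EdgeNode("edge-2", "10.0.0.100", local_pref=100)
fabric = Fabric([active, standby])

# Host forwarding elements tunnel to the single anycast VTEP IP;
# the fabric routes those tunnels to the active node.
assert fabric.next_hop("10.0.0.100") is active

# When the connection to edge-1 fails, the fabric stops considering its
# advertisement and converges to edge-2. Hosts keep using the same VTEP
# IP, anycast inner IP, and anycast MAC, so no host reconfiguration occurs.
active.up = False
assert fabric.next_hop("10.0.0.100") is standby
```

Because the tunnel endpoint, inner IP, and MAC addresses are identical on every node, failover is purely a matter of underlay route convergence, which is what makes it fast.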
CLAIM OF BENEFIT TO PRIOR APPLICATIONS

The present Application is a continuation application of U.S. patent application Ser. No. 15/443,974, filed Feb. 27, 2017, now published as U.S. Patent Publication 2018/0062914. U.S. patent application Ser. No. 15/443,974 claims the benefit of U.S. Provisional Patent Application 62/382,229, filed Aug. 31, 2016. U.S. patent application Ser. No. 15/443,974, now published as U.S. Patent Publication 2018/0062914, is incorporated herein by reference.

US Referenced Citations (407)
Number Name Date Kind
5504921 Dev et al. Apr 1996 A
5550816 Hardwick et al. Aug 1996 A
5751967 Raab et al. May 1998 A
6006275 Picazo et al. Dec 1999 A
6104699 Holender et al. Aug 2000 A
6219699 McCloghrie et al. Apr 2001 B1
6359909 Ito et al. Mar 2002 B1
6456624 Eccles et al. Sep 2002 B1
6512745 Abe et al. Jan 2003 B1
6539432 Taguchi et al. Mar 2003 B1
6680934 Cain Jan 2004 B1
6785843 McRae et al. Aug 2004 B1
6914907 Bhardwaj et al. Jul 2005 B1
6941487 Balakrishnan et al. Sep 2005 B1
6950428 Horst et al. Sep 2005 B1
6963585 Pennec et al. Nov 2005 B1
6977924 Skoog Dec 2005 B1
6999454 Crump Feb 2006 B1
7046630 Abe et al. May 2006 B2
7107356 Baxter et al. Sep 2006 B2
7197572 Matters et al. Mar 2007 B2
7200144 Terrell et al. Apr 2007 B2
7209439 Rawlins et al. Apr 2007 B2
7260648 Tingley et al. Aug 2007 B2
7283473 Arndt et al. Oct 2007 B2
7342916 Das et al. Mar 2008 B2
7391771 Orava et al. Jun 2008 B2
7447197 Terrell et al. Nov 2008 B2
7450598 Chen et al. Nov 2008 B2
7463579 Lapuh et al. Dec 2008 B2
7478173 Delco Jan 2009 B1
7483411 Weinstein et al. Jan 2009 B2
7555002 Arndt et al. Jun 2009 B2
7606260 Oguchi et al. Oct 2009 B2
7630358 Lakhani et al. Dec 2009 B1
7643488 Khanna et al. Jan 2010 B2
7649851 Takashige et al. Jan 2010 B2
7653747 Lucco et al. Jan 2010 B2
7710874 Balakrishnan et al. May 2010 B2
7742459 Kwan et al. Jun 2010 B2
7764599 Doi et al. Jul 2010 B2
7778268 Khan et al. Aug 2010 B2
7792097 Wood et al. Sep 2010 B1
7792987 Vohra et al. Sep 2010 B1
7802000 Huang et al. Sep 2010 B1
7818452 Matthews et al. Oct 2010 B2
7826482 Minei et al. Nov 2010 B1
7839847 Nadeau et al. Nov 2010 B2
7881208 Nosella Feb 2011 B1
7885276 Lin Feb 2011 B1
7936770 Frattura et al. May 2011 B1
7937438 Miller et al. May 2011 B1
7948986 Ghosh et al. May 2011 B1
7953865 Miller et al. May 2011 B1
7987506 Khalid et al. Jul 2011 B1
7991859 Miller et al. Aug 2011 B1
7995483 Bayar et al. Aug 2011 B1
8027260 Venugopal et al. Sep 2011 B2
8027354 Portolani et al. Sep 2011 B1
8031633 Bueno et al. Oct 2011 B2
8046456 Miller et al. Oct 2011 B1
8054832 Shukla et al. Nov 2011 B1
8055789 Richardson et al. Nov 2011 B2
8060875 Lambeth Nov 2011 B1
8131852 Miller et al. Mar 2012 B1
8149737 Metke et al. Apr 2012 B2
8155028 Abu-Hamdeh et al. Apr 2012 B2
8166201 Richardson et al. Apr 2012 B2
8194674 Pagel et al. Jun 2012 B1
8199750 Schultz et al. Jun 2012 B1
8223668 Allan et al. Jul 2012 B2
8224931 Brandwine et al. Jul 2012 B1
8224971 Miller et al. Jul 2012 B1
8239572 Brandwine et al. Aug 2012 B1
8259571 Raphel et al. Sep 2012 B1
8265075 Pandey Sep 2012 B2
8281067 Stolowitz Oct 2012 B2
8312129 Miller et al. Nov 2012 B1
8339959 Moisand et al. Dec 2012 B1
8339994 Gnanasekaran et al. Dec 2012 B2
8345650 Foxworthy et al. Jan 2013 B2
8351418 Zhao et al. Jan 2013 B2
8370834 Edwards et al. Feb 2013 B2
8416709 Marshall et al. Apr 2013 B1
8456984 Ranganathan et al. Jun 2013 B2
8504718 Wang et al. Aug 2013 B2
8559324 Brandwine et al. Oct 2013 B1
8565108 Marshall et al. Oct 2013 B1
8600908 Lin et al. Dec 2013 B2
8611351 Gooch et al. Dec 2013 B2
8612627 Brandwine Dec 2013 B1
8625594 Safrai et al. Jan 2014 B2
8625603 Ramakrishnan et al. Jan 2014 B1
8625616 Vobbilisetty et al. Jan 2014 B2
8627313 Edwards et al. Jan 2014 B2
8644188 Brandwine et al. Feb 2014 B1
8660129 Brendel et al. Feb 2014 B1
8705513 Merwe et al. Apr 2014 B2
8724456 Hong et al. May 2014 B1
8745177 Kazerani et al. Jun 2014 B1
8958298 Zhang et al. Feb 2015 B2
9021066 Singh et al. Apr 2015 B1
9032095 Traina May 2015 B1
9036504 Miller May 2015 B1
9036639 Zhang May 2015 B2
9059999 Koponen et al. Jun 2015 B2
9137052 Koponen et al. Sep 2015 B2
9313129 Ganichev et al. Apr 2016 B2
9363172 Luxenberg et al. Jun 2016 B2
9385925 Scholl Jul 2016 B1
9419855 Ganichev et al. Aug 2016 B2
9454392 Luxenberg et al. Sep 2016 B2
9485149 Traina et al. Nov 2016 B1
9503321 Neginhal et al. Nov 2016 B2
9559980 Li et al. Jan 2017 B2
9647883 Neginhal et al. May 2017 B2
9749214 Han Aug 2017 B2
9787605 Zhang et al. Oct 2017 B2
9948472 Drake Apr 2018 B2
10057157 Goliya et al. Aug 2018 B2
10075363 Goliya et al. Sep 2018 B2
10079779 Zhang et al. Sep 2018 B2
10095535 Dubey et al. Oct 2018 B2
10110431 Ganichev et al. Oct 2018 B2
10129142 Goliya et al. Nov 2018 B2
10129180 Zhang et al. Nov 2018 B2
10153973 Dubey Dec 2018 B2
10230629 Masurekar et al. Mar 2019 B2
10270687 Mithyantha Apr 2019 B2
10341236 Boutros et al. Jul 2019 B2
10382321 Boyapati et al. Aug 2019 B1
10411955 Neginhal et al. Sep 2019 B2
10454758 Boutros et al. Oct 2019 B2
10601700 Goliya et al. Mar 2020 B2
10623322 Nallamothu Apr 2020 B1
10700996 Zhang et al. Jun 2020 B2
10749801 Dubey Aug 2020 B2
10795716 Dubey et al. Oct 2020 B2
10797998 Basavaraj et al. Oct 2020 B2
10805212 Masurekar et al. Oct 2020 B2
10911360 Boutros et al. Feb 2021 B2
10931560 Goliya et al. Feb 2021 B2
10938788 Wang et al. Mar 2021 B2
11252024 Neginhal et al. Feb 2022 B2
11283731 Zhang et al. Mar 2022 B2
20010043614 Viswanadham et al. Nov 2001 A1
20020067725 Oguchi et al. Jun 2002 A1
20020093952 Gonda Jul 2002 A1
20020194369 Rawlins et al. Dec 2002 A1
20030041170 Suzuki Feb 2003 A1
20030058850 Rangarajan et al. Mar 2003 A1
20030067924 Choe et al. Apr 2003 A1
20030069972 Yoshimura et al. Apr 2003 A1
20040013120 Shen Jan 2004 A1
20040073659 Rajsic et al. Apr 2004 A1
20040098505 Clemmensen May 2004 A1
20040240455 Shen Dec 2004 A1
20040267866 Carollo et al. Dec 2004 A1
20050018669 Arndt et al. Jan 2005 A1
20050027881 Figueira et al. Feb 2005 A1
20050053079 Havala Mar 2005 A1
20050083953 May Apr 2005 A1
20050120160 Plouffe et al. Jun 2005 A1
20050132044 Guingo et al. Jun 2005 A1
20060002370 Rabie et al. Jan 2006 A1
20060018253 Windisch et al. Jan 2006 A1
20060026225 Canali et al. Feb 2006 A1
20060029056 Perera et al. Feb 2006 A1
20060050719 Barr et al. Mar 2006 A1
20060056412 Page Mar 2006 A1
20060059253 Goodman et al. Mar 2006 A1
20060092940 Ansari et al. May 2006 A1
20060092976 Lakshman et al. May 2006 A1
20060174087 Hashimoto et al. Aug 2006 A1
20060187908 Shimozono et al. Aug 2006 A1
20060193266 Siddha et al. Aug 2006 A1
20060203774 Carrion-Rodrigo Sep 2006 A1
20060291387 Kimura et al. Dec 2006 A1
20060291388 Amdahl et al. Dec 2006 A1
20070043860 Pabari Feb 2007 A1
20070064673 Bhandaru et al. Mar 2007 A1
20070140128 Klinker et al. Jun 2007 A1
20070156919 Potti et al. Jul 2007 A1
20070165515 Vasseur Jul 2007 A1
20070201357 Smethurst et al. Aug 2007 A1
20070206591 Doviak et al. Sep 2007 A1
20070297428 Bose et al. Dec 2007 A1
20080002579 Lindholm et al. Jan 2008 A1
20080002683 Droux et al. Jan 2008 A1
20080013474 Nagarajan et al. Jan 2008 A1
20080049621 McGuire et al. Feb 2008 A1
20080049646 Lu Feb 2008 A1
20080059556 Greenspan et al. Mar 2008 A1
20080071900 Hecker et al. Mar 2008 A1
20080086726 Griffith et al. Apr 2008 A1
20080151893 Nordmark et al. Jun 2008 A1
20080159301 Heer Jul 2008 A1
20080186968 Farinacci Aug 2008 A1
20080189769 Casado et al. Aug 2008 A1
20080225853 Melman et al. Sep 2008 A1
20080240122 Richardson et al. Oct 2008 A1
20080253366 Zuk et al. Oct 2008 A1
20080253396 Olderdissen Oct 2008 A1
20080291910 Tadimeti et al. Nov 2008 A1
20090031041 Clemmensen Jan 2009 A1
20090043823 Iftode et al. Feb 2009 A1
20090064305 Stiekes et al. Mar 2009 A1
20090067427 Rezaki Mar 2009 A1
20090083445 Ganga Mar 2009 A1
20090092043 Lapuh Apr 2009 A1
20090092137 Haigh et al. Apr 2009 A1
20090122710 Bar-Tor et al. May 2009 A1
20090150527 Tripathi et al. Jun 2009 A1
20090161547 Riddle et al. Jun 2009 A1
20090249470 Litvin et al. Oct 2009 A1
20090249473 Cohn Oct 2009 A1
20090252173 Sampath et al. Oct 2009 A1
20090257440 Yan et al. Oct 2009 A1
20090262741 Jungck et al. Oct 2009 A1
20090279536 Unbehagen et al. Nov 2009 A1
20090292858 Lambeth et al. Nov 2009 A1
20090300210 Ferris Dec 2009 A1
20090303880 Maltz et al. Dec 2009 A1
20100002722 Porat et al. Jan 2010 A1
20100046531 Louati et al. Feb 2010 A1
20100107162 Edwards et al. Apr 2010 A1
20100115101 Lain et al. May 2010 A1
20100131636 Suri et al. May 2010 A1
20100153554 Anschutz et al. Jun 2010 A1
20100153701 Shenoy et al. Jun 2010 A1
20100162036 Linden et al. Jun 2010 A1
20100165877 Shukla et al. Jul 2010 A1
20100169467 Shukla et al. Jul 2010 A1
20100192225 Ma et al. Jul 2010 A1
20100205479 Akutsu et al. Aug 2010 A1
20100214949 Smith et al. Aug 2010 A1
20100257263 Casado et al. Oct 2010 A1
20100275199 Smith et al. Oct 2010 A1
20100290485 Martini et al. Nov 2010 A1
20100317376 Anisimov et al. Dec 2010 A1
20100318609 Lahiri et al. Dec 2010 A1
20100322255 Hao et al. Dec 2010 A1
20110016215 Wang Jan 2011 A1
20110022695 Dalal et al. Jan 2011 A1
20110026537 Kolhi et al. Feb 2011 A1
20110032830 Merwe et al. Feb 2011 A1
20110032843 Papp et al. Feb 2011 A1
20110075664 Lambeth et al. Mar 2011 A1
20110075674 Li et al. Mar 2011 A1
20110085557 Gnanasekaran et al. Apr 2011 A1
20110085559 Chung et al. Apr 2011 A1
20110103259 Aybay et al. May 2011 A1
20110119748 Edwards et al. May 2011 A1
20110134931 Merwe et al. Jun 2011 A1
20110141884 Olsson Jun 2011 A1
20110142053 Merwe et al. Jun 2011 A1
20110149964 Judge et al. Jun 2011 A1
20110149965 Judge et al. Jun 2011 A1
20110194567 Shen Aug 2011 A1
20110205931 Zhou et al. Aug 2011 A1
20110261825 Ichino Oct 2011 A1
20110283017 Alkhatib et al. Nov 2011 A1
20110299534 Koganti et al. Dec 2011 A1
20110310899 Alkhatib et al. Dec 2011 A1
20110317703 Dunbar et al. Dec 2011 A1
20120014386 Xiong et al. Jan 2012 A1
20120014387 Dunbar et al. Jan 2012 A1
20120131643 Cheriton May 2012 A1
20120151443 Rohde et al. Jun 2012 A1
20120155467 Appenzeller Jun 2012 A1
20120182992 Cowart et al. Jul 2012 A1
20120236734 Sampath et al. Sep 2012 A1
20130007740 Kikuchi et al. Jan 2013 A1
20130044636 Koponen et al. Feb 2013 A1
20130044641 Koponen et al. Feb 2013 A1
20130051399 Zhang et al. Feb 2013 A1
20130058225 Casado et al. Mar 2013 A1
20130058229 Casado et al. Mar 2013 A1
20130058335 Koponen et al. Mar 2013 A1
20130058346 Sridharan Mar 2013 A1
20130058350 Fulton Mar 2013 A1
20130058353 Koponen et al. Mar 2013 A1
20130060940 Koponen et al. Mar 2013 A1
20130070762 Adams et al. Mar 2013 A1
20130071116 Ong Mar 2013 A1
20130091254 Haddad et al. Apr 2013 A1
20130094350 Mandal et al. Apr 2013 A1
20130103817 Koponen et al. Apr 2013 A1
20130103818 Koponen et al. Apr 2013 A1
20130132536 Zhang et al. May 2013 A1
20130142048 Gross, IV et al. Jun 2013 A1
20130148541 Zhang et al. Jun 2013 A1
20130148542 Zhang et al. Jun 2013 A1
20130148543 Koponen et al. Jun 2013 A1
20130148656 Zhang et al. Jun 2013 A1
20130151661 Koponen et al. Jun 2013 A1
20130151676 Thakkar et al. Jun 2013 A1
20130208621 Manghirmalani et al. Aug 2013 A1
20130212148 Koponen et al. Aug 2013 A1
20130223444 Liljenstolpe et al. Aug 2013 A1
20130230047 Subrahmaniam et al. Sep 2013 A1
20130266007 Kumbhare et al. Oct 2013 A1
20130266015 Qu et al. Oct 2013 A1
20130266019 Qu Oct 2013 A1
20130268799 Mestery et al. Oct 2013 A1
20130329548 Nakil et al. Dec 2013 A1
20130332602 Nakil et al. Dec 2013 A1
20130332619 Xie et al. Dec 2013 A1
20130339544 Mithyantha Dec 2013 A1
20140003434 Assarpour et al. Jan 2014 A1
20140016501 Kamath et al. Jan 2014 A1
20140050091 Biswas et al. Feb 2014 A1
20140059226 Messerli et al. Feb 2014 A1
20140063364 Hirakata Mar 2014 A1
20140114998 Kadam et al. Apr 2014 A1
20140126418 Brendel et al. May 2014 A1
20140146817 Zhang May 2014 A1
20140149490 Luxenberg et al. May 2014 A1
20140173093 Rabeela et al. Jun 2014 A1
20140195666 Dumitriu et al. Jul 2014 A1
20140229945 Barkai et al. Aug 2014 A1
20140241247 Kempf et al. Aug 2014 A1
20140269299 Koomstra Sep 2014 A1
20140269702 Moreno Sep 2014 A1
20140328350 Hao et al. Nov 2014 A1
20140348166 Yang Nov 2014 A1
20140372582 Ghanwani et al. Dec 2014 A1
20140376550 Khan Dec 2014 A1
20150009831 Graf Jan 2015 A1
20150016300 Devireddy et al. Jan 2015 A1
20150055650 Bhat Feb 2015 A1
20150063360 Thakkar et al. Mar 2015 A1
20150063364 Thakkar et al. Mar 2015 A1
20150063366 Melander Mar 2015 A1
20150089082 Patwardhan et al. Mar 2015 A1
20150092594 Zhang et al. Apr 2015 A1
20150098475 Jayanarayana Apr 2015 A1
20150103838 Zhang et al. Apr 2015 A1
20150124586 Pani May 2015 A1
20150124810 Hao et al. May 2015 A1
20150172156 Lohiya Jun 2015 A1
20150188770 Naiksatam et al. Jul 2015 A1
20150222550 Anand Aug 2015 A1
20150263897 Ganichev et al. Sep 2015 A1
20150263946 Tubaltsev et al. Sep 2015 A1
20150263952 Ganichev et al. Sep 2015 A1
20150271011 Neginhal et al. Sep 2015 A1
20150271303 Neginhal et al. Sep 2015 A1
20150281067 Wu Oct 2015 A1
20150299880 Jorge et al. Oct 2015 A1
20150372869 Rao Dec 2015 A1
20160105471 Nunes et al. Apr 2016 A1
20160119229 Zhou Apr 2016 A1
20160134513 Yang May 2016 A1
20160149808 Cai May 2016 A1
20160174193 Zhang Jun 2016 A1
20160182287 Chiba et al. Jun 2016 A1
20160191374 Singh et al. Jun 2016 A1
20160226700 Zhang et al. Aug 2016 A1
20160226754 Zhang et al. Aug 2016 A1
20160226762 Zhang et al. Aug 2016 A1
20160261493 Li Sep 2016 A1
20160294612 Ravinoothala et al. Oct 2016 A1
20160330120 Thyamagundalu Nov 2016 A1
20160344586 Ganichev et al. Nov 2016 A1
20160352633 Kapadia Dec 2016 A1
20170005923 Babakian Jan 2017 A1
20170034051 Chanda Feb 2017 A1
20170034052 Chanda Feb 2017 A1
20170048129 Masurekar et al. Feb 2017 A1
20170048130 Goliya et al. Feb 2017 A1
20170063632 Goliya et al. Mar 2017 A1
20170063633 Goliya et al. Mar 2017 A1
20170064717 Filsfils et al. Mar 2017 A1
20170070425 Mithyantha Mar 2017 A1
20170085502 Biruduraju Mar 2017 A1
20170126497 Dubey et al. May 2017 A1
20170180154 Duong et al. Jun 2017 A1
20170207992 Huang Jul 2017 A1
20170230241 Neginhal et al. Aug 2017 A1
20170288981 Hong et al. Oct 2017 A1
20170317919 Fernando et al. Nov 2017 A1
20180006943 Dubey Jan 2018 A1
20180062914 Boutros et al. Mar 2018 A1
20180097734 Boutros et al. Apr 2018 A1
20180159821 Chanda et al. Jun 2018 A1
20180367442 Goliya et al. Dec 2018 A1
20190018701 Dubey et al. Jan 2019 A1
20190020580 Boutros et al. Jan 2019 A1
20190020600 Zhang et al. Jan 2019 A1
20190109780 Nagarkar Apr 2019 A1
20190124004 Dubey Apr 2019 A1
20190190885 Krug et al. Jun 2019 A1
20190199625 Masurekar et al. Jun 2019 A1
20190245783 Mithyantha Aug 2019 A1
20190281133 Tomkins Sep 2019 A1
20190312812 Boutros et al. Oct 2019 A1
20190334767 Neginhal et al. Oct 2019 A1
20190372895 Parthasarathy et al. Dec 2019 A1
20200169496 Goliya et al. May 2020 A1
20200186468 Basavaraj et al. Jun 2020 A1
20200195607 Wang et al. Jun 2020 A1
20200220802 Goliya et al. Jul 2020 A1
20200267095 Zhang et al. Aug 2020 A1
20200366606 Dubey Nov 2020 A1
20210019174 Dubey et al. Jan 2021 A1
20210029028 Masurekar et al. Jan 2021 A1
Foreign Referenced Citations (48)
Number Date Country
1301096 Jun 2001 CN
1442987 Sep 2003 CN
1714548 Dec 2005 CN
101005452 Jul 2007 CN
102726007 Oct 2012 CN
102780605 Nov 2012 CN
102986172 Mar 2013 CN
103546381 Jan 2014 CN
103595648 Feb 2014 CN
103890751 Jun 2014 CN
103917967 Jul 2014 CN
103947164 Jul 2014 CN
104009929 Aug 2014 CN
104335553 Feb 2015 CN
102461098 Sep 2015 CN
105556907 May 2016 CN
105791412 Jul 2016 CN
105791463 Jul 2016 CN
1653688 May 2006 EP
2672668 Dec 2013 EP
2838244 Feb 2015 EP
3013006 Apr 2016 EP
3507950 Jul 2019 EP
2000244567 Sep 2000 JP
2003069609 Mar 2003 JP
2003124976 Apr 2003 JP
2003318949 Nov 2003 JP
2004134967 Apr 2004 JP
2004193878 Jul 2004 JP
2011139299 Jul 2011 JP
2011228864 Nov 2011 JP
2013157855 Aug 2013 JP
2014531831 Nov 2014 JP
2014534789 Dec 2014 JP
1020110099579 Sep 2011 KR
2005112390 Nov 2005 WO
2008095010 Aug 2008 WO
2013020126 Feb 2013 WO
2013026049 Feb 2013 WO
2013055697 Apr 2013 WO
2013081962 Jun 2013 WO
2013143611 Oct 2013 WO
2013184846 Dec 2013 WO
2015015787 Feb 2015 WO
2015142404 Sep 2015 WO
2016123550 Aug 2016 WO
2017027073 Feb 2017 WO
2018044746 Mar 2018 WO
Non-Patent Literature Citations (31)
Entry
Agarwal, Sugam, et al., “Traffic Engineering in Software Defined Networks,” 2013 Proceedings IEEE INFOCOM, Apr. 14, 2013, 10 pages, Bell Labs, Alcatel-Lucent, Holmdel, NJ, USA.
Aggarwal, R., et al., “Data Center Mobility based on E-VPN, BGP/MPLS IP VPN, IP Routing and NHRP,” draft-raggarwa-data-center-mobility-05.txt, Jun. 10, 2013, 24 pages, Internet Engineering Task Force, IETF, Geneva, Switzerland.
Author Unknown, “Cisco Border Gateway Protocol Control Plane for Virtual Extensible LAN,” White Paper, Jan. 23, 2015, 6 pages, Cisco Systems, Inc.
Author Unknown, “Cisco Data Center Spine-and-Leaf Architecture: Design Overview,” White Paper, Apr. 15, 2016, 27 pages, Cisco Systems, Inc.
Author Unknown, “VMware® NSX Network Virtualization Design Guide,” Month Unknown 2013, 32 pages, Item No. VMW-NSX-NTWK-VIRT-DESN-GUIDE-V2-101, VMware, Inc., Palo Alto, CA, USA.
Ballani, Hitesh, et al., “Making Routers Last Longer with ViAggre,” NSDI '09: 6th USENIX Symposium on Networked Systems Design and Implementation, Apr. 2009, 14 pages, USENIX Association.
Caesar, Matthew, et al., “Design and Implementation of a Routing Control Platform,” NSDI '05: 2nd Symposium on Networked Systems Design & Implementation , Apr. 2005, 14 pages, Usenix Association.
Dobrescu, Mihai, et al., “RouteBricks: Exploiting Parallelism to Scale Software Routers,” SOSP'09, Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, Oct. 2009, 17 pages, ACM, New York, NY.
Dumitriu, Dan Mihai, et al., (U.S. Appl. No. 61/514,990), filed Aug. 4, 2011, 31 pages.
Fernando, Rex, et al., “Service Chaining using Virtual Networks with BGP,” Internet Engineering Task Force, IETF, Jul. 7, 2015, 32 pages, Internet Society (ISOC), Geneva, Switzerland, available at https://tools.ietf.org/html/draft-fm-bess-service-chaining-01.
Handley, Mark, et al., “Designing Extensible IP Router Software,” Proc. of NSDI, May 2005, 14 pages.
Kim, Changhoon, et al., “Revisiting Route Caching: The World Should be Flat,” in Proc. of International Conference on Passive and Active Network Measurement, Apr. 2009, 10 pages, Springer, Berlin, Heidelberg.
Koponen, Teemu, et al., “Network Virtualization in Multi-tenant Datacenters,” Technical Report TR-2013-001E, Aug. 2013, 22 pages, VMware, Inc., Palo Alto, CA, USA.
Lakshminarayanan, Karthik, et al., “Routing as a Service,” Report No. UCB/CSD-04-1327, Month Unknown 2004, 16 pages, Computer Science Division (EECS), University of California—Berkeley, Berkeley, California.
Lowe, Scott, “Learning NSX, Part 14: Using Logical Routing,” Scott's Weblog: The weblog of an IT pro specializing in cloud computing, virtualization, and networking, all with an open source view, Jun. 20, 2014, 8 pages, available at https://blog.scottlowe.org/2014/06/20/learning-nsx-part-14-using-logical-routing/.
Maltz, David A., et al., “Routing Design in Operational Networks: A Look from the Inside,” SIGCOMM '04, Aug. 30-Sep. 3, 2004, 14 pages, ACM, Portland, Oregon, USA.
Moreno, Victor, “VXLAN Deployment Models—A Practical Perspective,” Cisco Live 2015 Melbourne, Mar. 6, 2015, 72 pages, BRKDCT-2404, Cisco Systems, Inc.
Non-Published Commonly Owned U.S. Appl. No. 16/506,782, filed Jul. 9, 2019, 91 pages, Nicira, Inc.
Pelissier, Joe, “Network Interface Virtualization Review,” Jan. 2009, 38 pages.
Rosen, E., “Applicability Statement for BGP/MPLS IP Virtual Private Networks (VPNs),” RFC 4365, Feb. 2006, 32 pages, The Internet Society.
Shenker, Scott, et al., “The Future of Networking, and the Past of Protocols,” Dec. 2, 2011, 30 pages, USA.
Wang, Anjing, et al., “Network Virtualization: Technologies, Perspectives, and Frontiers,” Journal of Lightwave Technology, Feb. 15, 2013, 15 pages, IEEE.
Wang, Yi, et al., “Virtual Routers on the Move: Live Router Migration as a Network-Management Primitive,” SIGCOMM '08, Aug. 17-22, 2008, 12 pages, ACM, Seattle, Washington, USA.
Non-Published Commonly Owned U.S. Appl. No. 16/823,050, filed Mar. 18, 2020, 79 pages, Nicira, Inc.
Non-Published Commonly Owned U.S. Appl. No. 16/945,910, filed Aug. 2, 2020, 46 pages, Nicira, Inc.
Non-Published Commonly Owned U.S. Appl. No. 17/062,531, filed Oct. 2, 2020, 75 pages, Nicira, Inc.
Non-Published Commonly Owned U.S. Appl. No. 17/068,588, filed Oct. 12, 2020, 75 pages, Nicira, Inc.
PCT International Search Report and Written Opinion dated Nov. 6, 2017 for commonly owned International Patent Application PCT/US2017/048787, 11 pages, International Searching Authority (EPO).
Non-Published Commonly Owned Related U.S. Appl. No. 17/579,513, filed Jan. 19, 2022, 91 pages, Nicira, Inc.
Non-Published Commonly Owned Related U.S. Appl. No. 17/580,596, filed Jan. 20, 2022, 103 pages, Nicira, Inc.
Xu, Ming-Wei, et al., “Survey on Distributed Control in a Router,” Acta Electronica Sinica, Aug. 2010, 8 pages, vol. 38, No. 8, retrieved from https://www.ejournal.org.cn/EN/abstract/abstract216.shtml.
Related Publications (1)
Number: 20200021483 A1; Date: Jan. 2020; Country: US

Provisional Applications (1)
Number: 62382229; Date: Aug. 2016; Country: US

Continuations (1)
Parent: 15443974; Date: Feb. 2017; Country: US
Child: 16581118; Country: US