The invention is based on a priority application EP 11 005 588.6 which is hereby incorporated by reference.
The present invention relates to a method of transmitting Ethernet packets between two or more Ethernet LANs through an interconnecting IP network, a centralised server and a customer edge device (LAN=Local Area Network; IP=Internet Protocol).
Cloud computing services are typically hosted in data centers that are internally realized by large Ethernet networks. There is a certain trend to decentralize these data centers, i.e. to host services in a larger number of smaller, geographically distributed data centers.
Each data center site LAN1, LAN2, LAN3 is connected to the interconnecting network N by a customer edge device CE. Each data center LAN1, LAN2, LAN3 comprises server farms 30 which are connected via switches SW to the customer edge device CE of the respective data center site LAN1, LAN2, LAN3. The interconnecting network N, which may be a transport network based on IP/MPLS, comprises three interconnected provider edges PE, one for each customer edge device CE. The connection of a customer edge device CE with its associated provider edge PE may be via a user network interface UNI. A connection of a first provider edge PE and a second provider edge PE may be via a network-to-network interface NNI. For simplicity,
There are many technologies that can interconnect the Ethernet networks LAN1, LAN2, LAN3 over layer 1, layer 2, or layer 3 links. Their common objective is to transparently interconnect all Ethernet networks LAN1, LAN2, LAN3. The customer edge devices transport, i.e. tunnel, the Ethernet traffic over the WAN in a multi-point to multi-point way. By tunneling Ethernet or IP transparently over the WAN, the WAN is invisible to the nodes in each data center. From the perspective of the data center, the customer edge device behaves like a standard Ethernet switch/bridge, apart from the larger delay in the WAN.
It is the object of the present invention to provide an improved solution for an interconnection of distributed Ethernet LANs over an IP network.
An object of the present invention is achieved by a method of transmitting Ethernet packets between two or more Ethernet LANs through an interconnecting IP network, each of the Ethernet LANs being connected to the interconnecting IP network by means of one or more respective customer edge devices, wherein an exchange between the customer edge devices of control information associated with the Ethernet packet transmission is processed and controlled by a centralised server connected to each of the customer edge devices via a control connection. A further object of the present invention is achieved by a centralised server of an overlay network with two or more Ethernet LANs and an interconnecting IP network, the centralised server comprising two or more interfaces for connecting the centralised server via control connections to respective customer edge devices, each of the customer edge devices connecting one or more associated Ethernet LANs to the interconnecting IP network, whereby the centralised server is adapted to process and control a control information exchange between the customer edge devices, the exchanged control information being associated with a transmission of Ethernet packets between two or more of the two or more Ethernet LANs through the interconnecting IP network. 
And a further object of the present invention is achieved by a customer edge device associated with one or more Ethernet LANs, the customer edge device comprising at least one Ethernet interface to the Ethernet LAN, at least one data traffic interface to an interconnecting IP network interconnecting the Ethernet LAN with at least one further Ethernet LAN for a transmission of Ethernet packets between the Ethernet LAN and the at least one further Ethernet LAN via the interconnecting IP network, and a control information interface to a centralised server for exchange of control information associated with the Ethernet packet transmission via a control connection wherein the control information exchanged between the customer edge device and respective customer edge devices of the at least one further Ethernet LAN is sent to and received from the centralised server through the control information interface.
The two or more Ethernet LANs and the interconnecting IP network form an overlay network. The invention realises an overlay system that transparently interconnects Ethernet networks over an IP network, i.e. an Ethernet-over-IP solution that is optimized for data centers. In this description the terms “data center”, “data center site” and “site” are used synonymously with the term “Ethernet LAN”.
The invention provides a simple and scalable solution that neither requires static IP tunnels nor explicit path management, e.g. MPLS label switched paths.
The invention provides a centralised server, i.e. a single point to which the Ethernet-over-IP system can peer. Therefore, unlike in known approaches which use a distributed control plane, embodiments of the invention make it possible to apply global policies and to link the data center interconnect solution with control and management systems, either a network management, or a cloud computing management, e.g. a cloud orchestration layer.
The use of a centralized server is supported by research results that show that commercial off-the-shelf personal computer technology is able to process on the order of 100,000 signalling messages per second between a centralized controller and several network devices over TCP connections. There is a certain similarity to the OpenFlow technology, which also uses one centralized server, called a controller. The expected order of magnitude of control traffic in the proposed system is much smaller, i.e., a centralized server is sufficiently scalable. The centralised server is logically a centralized entity, but may of course be realized in a distributed way, e.g., to improve the resilience. Distributed realisations of the centralised server may also use load balancing.
The invention provides an advantageous alternative or complement to the standardized, multi-vendor solution known as Virtual Private Local Area Network Service (=VPLS), if only IP connectivity is available. VPLS is based on MPLS. While VPLS is an appropriate solution whenever an MPLS link to each data center site is available, this requirement will not necessarily be fulfilled if a larger number of small data centers are used for cloud computing offers, or, e.g., distributed Content Delivery Network (=CDN) caches. In that case, at least a subset of sites may only be connected via IP links, or the public Internet. This implies that a pure MPLS-based solution may not be sufficient. This gap is covered by the present invention.
Furthermore, the setup of a full mesh of MPLS paths is complex and limits the dynamics of the data center interconnection solution. Tunneling of MPLS over IP would result in additional overhead. The invention provides an improved solution which avoids the aforementioned disadvantages.
The invention proposes a new technology to interconnect Ethernet networks over an IP network, using a centralized server in combination with overlay network mechanisms.
One of the main benefits of the invention is its simplicity. The invention neither requires a complex setup of tunnels nor specific support by an interconnecting network. The invention makes it possible to interconnect data center Ethernet networks over any IP network, even without involvement of the network provider. Also, the use of a centralized server with a potentially global view on the Ethernet network simplifies the enforcement of policies and intelligent traffic distribution mechanisms.
The service provided by the invention differs from other VPN solutions (VPN=Virtual Private Network). Unlike IPsec VPNs, this invention does not focus on encryption and thereby avoids the complexity of setting up the corresponding security associations (IPsec=Internet Protocol Security). Still, the invention can be natively implemented on top of IPsec. The invention also differs from tunneling solutions such as L2TP/L2TPv3 and PPTP, as it is a purely soft-state solution with no explicit tunnel setup (L2TP=Layer 2 Tunneling Protocol; PPTP=Point-to-Point Tunneling Protocol). This results in less configuration overhead and the ability to scale to a large number of data center sites.
The invention does not use IP multicast or extended routing protocols, but a centralized server instead, which is simpler and enables centralized control and management. Most notably, the invention does not use extensions of the IS-IS routing protocol, operates on a per-destination-address basis, not on a per-flow basis, provides additional overlay topology management functions, and scales to large networks.
The invention relies on a centralized server instead of proprietary routing protocol extensions. A centralized server is simpler to implement, deploy, and operate than an overlay that requires several IP multicast groups. It can also very easily be coupled with other control and management systems, e.g., for the dynamic configuration of policies.
Compared to the existing data center interconnect solutions that use static tunnels or label switched paths, e.g. VPLS, the invention is much simpler to configure and implement, as the edge devices only require a minimum initial configuration and only maintain soft state for the traffic in the overlay. As in the framework of the invention it is easy to add and remove sites from the overlay, Ethernet interconnectivity can be offered even for a large number of highly distributed data center sites that are turned on and off frequently.
Further advantages are achieved by embodiments of the invention indicated by the dependent claims.
According to an embodiment of the invention, the control information is related to one or more of: mapping of Ethernet addresses of network devices of Ethernet LANs to IP addresses of customer edge devices, information concerning a scope of Ethernet LANs and/or VLAN tags, Address Resolution Protocol (ARP) information, membership information of multicast groups inside the Ethernet LANs, filtering policies, firewall rules, overlay topology, information about path characteristics between customer edge devices, bootstrapping and configuration information for devices joining an overlay network comprising the two or more Ethernet LANs.
Instead of transporting control information inside a routing protocol between the customer edge devices, the inventive method uses a centralized server. Each customer edge device is connected to the centralised server by a control connection, preferably a TCP connection, and exchanges control information (TCP=Transmission Control Protocol). Specifically, this control connection transports
The customer edge devices report information to the centralised server, which then distributes the information to the other customer edge devices, and preferably also maintains a global view of the whole data center network and the attachment of Ethernet devices in the different Ethernet segments. The control connections can also be encrypted, e.g. using Transport Layer Security (TLS), in order to protect the data integrity and preferably to enable an authentication and authorization of customer edge devices joining the overlay.
According to another embodiment of the invention, the method further comprises the steps of reporting, by one or more of the customer edge devices, control information to the centralised server; managing, by the centralised server, the received control information and distributing processed control information to one or more of the customer edge devices including a first customer edge device associated with a first Ethernet LAN of the two or more Ethernet LANs; and using, by the first customer edge device, the received control information for controlling a transmission of Ethernet data traffic from a first network device of the first Ethernet LAN through the interconnecting IP network to a second network device of a second Ethernet LAN of the two or more Ethernet LANs.
According to another embodiment of the invention, the method further comprises the steps of sending, by a first network device of a first Ethernet LAN of the two or more Ethernet LANs, an Ethernet packet destined for an Ethernet address of a second network device of a second Ethernet LAN of the two or more Ethernet LANs; receiving, by a first customer edge device associated with the first Ethernet LAN, the Ethernet packet and checking if a forwarding table managed by the first customer edge device contains a mapping of the Ethernet address of the second network device to an IP address of a customer edge device associated with the second Ethernet LAN; if the forwarding table does not contain the said mapping, sending by the first customer edge device an address resolution request to the centralised server and receiving from the centralised server in response to the address resolution request a reply message specifying the said mapping; encapsulating, by the first customer edge device, the Ethernet packet with an encapsulation header inside an IP packet comprising a destination address of the second customer edge device according to the mapping; sending the encapsulated Ethernet packet via the interconnecting IP network to the second customer edge device; and decapsulating, by the second customer edge device, the received Ethernet packet for delivery within the second Ethernet LAN to the second network device. The customer edge devices should drop packets with destination Ethernet addresses that cannot be resolved.
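By way of illustration only, the forwarding decision of the first customer edge device described above can be sketched as follows. The class name `CustomerEdge`, the callable `resolve_at_server`, and the example addresses are hypothetical and are not taken from the embodiments; the dictionary `server_db` merely stands in for the centralised server reached over the control connection.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical resolver type: asks the centralised server for the IP address
# of the customer edge device behind which a given Ethernet address lives.
Resolver = Callable[[str], Optional[str]]


class CustomerEdge:
    def __init__(self, resolve_at_server: Resolver) -> None:
        self.forwarding_table: Dict[str, str] = {}  # dst MAC -> remote CE IP
        self.resolve_at_server = resolve_at_server

    def forward(self, dst_mac: str, frame: bytes) -> Optional[Tuple[str, bytes]]:
        """Return (remote CE IP, frame) to tunnel, or None if the frame is dropped."""
        ce_ip = self.forwarding_table.get(dst_mac)
        if ce_ip is None:
            # Table miss: send an address resolution request to the server.
            ce_ip = self.resolve_at_server(dst_mac)
            if ce_ip is None:
                return None  # unresolvable destination: drop the packet
            self.forwarding_table[dst_mac] = ce_ip  # soft state, learned on demand
        return ce_ip, frame


# Stand-in for the centralised server's mapping database (illustrative values).
server_db = {"02:00:00:00:00:0b": "192.0.2.2"}
ce1 = CustomerEdge(server_db.get)
```

In this sketch a second packet to the same destination is forwarded from the local table without contacting the server again, which corresponds to the soft-state behaviour described for the customer edge devices.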
The encapsulation header comprises at least an IP header. In addition, further shim layers may be used for encapsulation, most notably the User Datagram Protocol (UDP) or the Generic Routing Encapsulation (GRE), or both.
The customer edge devices tunnel Ethernet packets over the IP network by encapsulating them into IP packets, e.g. UDP packets, without requiring the explicit setup of tunnels (UDP=User Datagram Protocol). The IP addresses of the destination customer edge device are learned from the centralised server if they are not already locally known. Ethernet packets are then transported over the IP network to the destination customer edge devices, decapsulated there, and finally delivered to the destination Ethernet device inside the destination data center LAN.
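As a non-limiting sketch, the encapsulation and decapsulation of an Ethernet packet in a UDP payload may look as follows. The 4-byte shim header (a magic value and an overlay identifier), the constant `MAGIC`, and the function names are assumptions made for illustration; the description itself only requires an IP header and permits UDP and/or GRE as shim layers.

```python
import struct
from typing import Tuple

# Hypothetical 4-byte shim header: 16-bit magic value plus 16-bit overlay ID.
# This layout is an assumption for illustration, not part of the description.
SHIM = struct.Struct("!HH")
MAGIC = 0xEC0E


def encapsulate(frame: bytes, overlay_id: int) -> bytes:
    """Build the UDP payload carrying one Ethernet frame."""
    return SHIM.pack(MAGIC, overlay_id) + frame


def decapsulate(payload: bytes) -> Tuple[int, bytes]:
    """Return (overlay_id, frame); raise ValueError on a malformed payload."""
    if len(payload) < SHIM.size:
        raise ValueError("payload shorter than shim header")
    magic, overlay_id = SHIM.unpack_from(payload)
    if magic != MAGIC:
        raise ValueError("not an overlay packet")
    return overlay_id, payload[SHIM.size:]
```

Because UDP is connectionless, the resulting payload could be handed to an ordinary datagram socket addressed to the IP address of the destination customer edge device, without any prior tunnel setup.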
A UDP encapsulation of data plane packets and a TCP-based control connection to the centralised server work in environments where other protocols, such as IP multicast or routing protocols, are blocked. Other benefits of the invented architecture include:
In an embodiment, the method further comprises the steps of intercepting, by a first customer edge device associated with a first Ethernet LAN of the two or more Ethernet LANs, an Address Resolution Protocol (ARP) request sent by a first network device of the first Ethernet LAN, if the first network device intends to resolve an IP address of a second network device located in a second Ethernet LAN to the corresponding Ethernet address, blocking the request if the address mapping of the IP address of the second network device to the Ethernet address of the second network device is not known, and sending a corresponding lookup request from the first customer edge device to the centralised server; after receipt of the lookup request, forwarding, by the centralised server, the lookup request to all other customer edge devices except the first customer edge device; after receipt of the lookup request, distributing, by the other customer edge devices, the lookup request among the network devices of the respective Ethernet LANs; receiving, by the other customer edge devices, lookup replies from the network devices of the respective Ethernet LANs and forwarding the lookup replies to the centralised server; managing and processing the received lookup replies by the centralised server and sending a lookup reply to the first customer edge device which had initiated the lookup request; and sending, by the first customer edge device, the lookup reply to the first network device which had initiated the address resolution request.
According to another embodiment of the invention, the method further comprises the step of announcing, by the centralised server, the lookup reply which is sent by the centralised server to the first customer edge device also to the other customer edge devices, so that they learn the addresses from the centralised server and can store them in an ARP table or in the forwarding table in the customer edge device, similar to an ARP proxy.
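The lookup handling at the centralised server, including the announcement of the reply to the other customer edge devices, may be sketched as follows. This is an illustrative model only: the classes `EdgeStub` and `CentralServer`, their methods, and the addresses are hypothetical, and direct method calls stand in for messages on the control connections.

```python
from typing import Dict, Optional


class EdgeStub:
    """Stand-in for a customer edge device reachable over a control connection."""

    def __init__(self, local_hosts: Dict[str, str]) -> None:
        self.local_hosts = local_hosts       # IP -> MAC of devices in the local LAN
        self.arp_table: Dict[str, str] = {}  # mappings learned from the server

    def local_lookup(self, ip: str) -> Optional[str]:
        # Models distributing the lookup inside the LAN and collecting a reply.
        return self.local_hosts.get(ip)

    def learn(self, ip: str, mac: str) -> None:
        self.arp_table[ip] = mac


class CentralServer:
    def __init__(self, edges: Dict[str, EdgeStub]) -> None:
        self.edges = edges
        self.arp_db: Dict[str, str] = {}  # global IP -> MAC view

    def lookup(self, requester: str, ip: str) -> Optional[str]:
        mac = self.arp_db.get(ip)
        if mac is None:
            # Forward the request to every edge except the requester.
            for name, edge in self.edges.items():
                if name == requester:
                    continue
                mac = edge.local_lookup(ip)
                if mac is not None:
                    break
        if mac is None:
            return None
        self.arp_db[ip] = mac
        # Announce the reply to all edges so they can answer future queries locally.
        for edge in self.edges.values():
            edge.learn(ip, mac)
        return mac


edges = {
    "CE1": EdgeStub({}),
    "CE2": EdgeStub({"10.0.0.2": "02:00:00:00:00:0b"}),
    "CE3": EdgeStub({}),
}
server = CentralServer(edges)
reply = server.lookup("CE1", "10.0.0.2")
```

After the single lookup in this example, the third customer edge device holds the mapping as well, although it never issued a request itself.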
According to another embodiment of the invention, the method further comprises the steps of measuring, by at least one of the customer edge devices, path characteristics and sending the measured path characteristics to the centralised server; establishing, by the centralised server, topology characteristics regarding the communication between the two or more Ethernet LANs on the basis of the received path characteristics; announcing, by the centralised server, the established topology characteristics to the customer edge devices; and making use of this information in routing decisions by at least one of the customer edge devices.
According to another embodiment of the invention, in a case where the interconnecting IP network connects at least three Ethernet LANs, the method further comprises the steps of routing, on account of announced topology characteristics, an ongoing communication between a first and a second Ethernet LAN of the at least three Ethernet LANs via a third customer edge device of a third Ethernet LAN of the at least three Ethernet LANs.
Using the topology information established by the centralised server, customer edge devices can also use more sophisticated forwarding and traffic engineering mechanisms. Specifically, embodiments of the invention allow a multi-hop forwarding in the overlay to move traffic away from congested links between two data center sites. In practice, two hops will be sufficient in most cases. The invention does not use IP multicast. Instead any multicast or broadcast traffic is duplicated in the customer edge devices and forwarded point-to-point in UDP datagrams to each customer edge device. This design, which is similar to the handling of such packets in VPLS, avoids problems in networks not supporting IP multicast.
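The duplication of multicast or broadcast traffic in a customer edge device may be sketched as follows, purely as an example. The function names and addresses are hypothetical; the check of the group bit in the first octet of the Ethernet address is the standard way of recognising broadcast and multicast destinations.

```python
from typing import Iterable, List, Tuple

BROADCAST = "ff:ff:ff:ff:ff:ff"


def is_group_address(mac: str) -> bool:
    """True for broadcast and multicast MACs (I/G bit of the first octet set)."""
    return int(mac.split(":")[0], 16) & 0x01 == 1


def replicate(dst_mac: str, frame: bytes,
              remote_ces: Iterable[str]) -> List[Tuple[str, bytes]]:
    """Return one (remote CE IP, frame) unicast copy per remote customer edge device."""
    if not is_group_address(dst_mac):
        return []
    return [(ce_ip, frame) for ce_ip in remote_ces]


copies = replicate(BROADCAST, b"frame", ["192.0.2.2", "192.0.2.3"])
```

Each returned copy would then be encapsulated and sent point-to-point, so that the interconnecting IP network never needs to carry IP multicast.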
Most notably, the use of multi-hop forwarding allows bypassing a potentially congested link between two data center sites, if there is an alternative path. The global view of the network at the centralised server, as well as the distribution of path characteristic measurements to the customer edge devices, enable better load balancing and intelligent routing, even if sites are multi-homed. If there is an alternative uncongested path in the overlay, as shown in
According to another embodiment of the invention, the centralised server further comprises a database containing at least one mapping of an Ethernet address of a network device of one of the Ethernet LANs to an IP address of a customer edge device of the respective Ethernet LAN with which the network device is associated.
According to another embodiment of the invention, the database of the centralised server further contains at least one address mapping of an Ethernet address of a network device of one of the Ethernet LANs to its corresponding IP address, so that the centralized server can answer Ethernet address lookup queries without Address Resolution Protocol broadcasts.
According to another embodiment of the invention, the centralised server further comprises an interface to a network or cloud computing management system that provides for instance policies or monitors the overlay.
According to another embodiment of the invention, the customer edge device further comprises a forwarding table containing at least one mapping of an Ethernet address of a network device of one of the at least one further Ethernet LAN to an IP address of the respective customer edge device of the at least one further Ethernet LAN with which the network device is associated.
According to another embodiment of the invention, the customer edge device further comprises a path metering unit adapted to measure path characteristics, and the customer edge device is adapted to send the measured path characteristics to the centralised server.
According to another embodiment of the invention, the customer edge device further comprises an address resolution proxy adapted to analyze an Address Resolution Protocol (ARP) request sent by a network device of the Ethernet LAN in order to obtain information related to the address mapping of IP and Ethernet addresses of a destination network device addressed in the ARP request. If the address mapping is not yet known by the customer edge device, the request is blocked and a corresponding lookup request is sent to the centralised server over the control connection. If the address mapping is already known from the ARP table in the customer edge device, a corresponding ARP reply is sent back to the network device. In both cases, the transport of the ARP messages over the overlay can be avoided.
According to a preferred embodiment, the address resolution proxy learns address mappings of the IP and Ethernet addresses of the destination network device from the centralised server and directly replies to the intercepted Address Resolution Protocol request from the network device if the address mapping is already known. The address resolution proxy may also learn address mappings by other means, for instance by monitoring of ongoing traffic or additional ARP lookups.
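A minimal sketch of such an address resolution proxy is given below for illustration. The class `ArpProxy`, the callable `lookup_at_server`, and the counter `server_queries` are hypothetical names; the dictionary `directory` stands in for the centralised server.

```python
from typing import Callable, Dict, Optional


class ArpProxy:
    def __init__(self, lookup_at_server: Callable[[str], Optional[str]]) -> None:
        self.arp_table: Dict[str, str] = {}  # IP -> MAC
        self.lookup_at_server = lookup_at_server
        self.server_queries = 0  # counts lookups sent over the control connection

    def handle_request(self, target_ip: str) -> Optional[str]:
        """Return the MAC for a local ARP reply, or None if unresolved.

        The intercepted ARP request itself is never forwarded over the overlay.
        """
        mac = self.arp_table.get(target_ip)
        if mac is None:
            # Unknown mapping: block the request and ask the server instead.
            self.server_queries += 1
            mac = self.lookup_at_server(target_ip)
            if mac is not None:
                self.arp_table[target_ip] = mac
        return mac


directory = {"10.0.0.2": "02:00:00:00:00:0b"}  # stand-in for the server
proxy = ArpProxy(directory.get)
first = proxy.handle_request("10.0.0.2")   # resolved via the server
second = proxy.handle_request("10.0.0.2")  # answered from the local ARP table
```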
These as well as further features and advantages of the invention will be better appreciated by reading the following detailed description of exemplary embodiments taken in conjunction with accompanying drawings of which:
A key component of the overlay network is a centralized server 10 that handles the exchange of control plane messages associated with a transmission of Ethernet packets between Ethernet LANs through the interconnecting network in an Ethernet-over-IP transmission mode. Therefore, unlike in the prior art, no modifications of routing protocols etc. are required. The invention only requires some additional functionality in the customer edge devices CE1, CE2, CE3, as detailed below. The centralised server 10 can either be a stand-alone device, e.g. a high-performance personal computer, or it can be integrated in one of the customer edge devices, as indicated by the dotted outline of a box in
For all Ethernet addresses that are known to be located in other sites, the Ethernet packets are encapsulated into an IP encapsulation packet, e.g. a UDP packet, using an additional header, and then sent via IP to the IP address of the customer edge device at the destination Ethernet LAN. This data plane operation is similar to other tunnel solutions.
Further, the centralised server 10 announces 48 the lookup reply 47 which is sent by the centralised server 10 to the first customer edge device CE1 also to the third customer edge device CE3 for its learning of addresses from the centralised server 10. By storing this information in an ARP table, the other customer edge devices can in future answer address lookup queries and encapsulate and forward packets to those destinations without interacting with the server.
A customer edge device CE1, CE2, CE3 only forwards an Ethernet packet to the overlay if the destination address is known. The customer edge devices CE1, CE2, CE3 learn addresses from the centralized server 10. The learning from the centralized server 10 is one of the key differentiators compared to prior art systems. The invention does not need established multicast trees or routing protocol extensions. The address learning is handled as follows:
A first data connection 50AB is established from a first network device A of a first Ethernet LAN LAN1 of the two or more Ethernet LANs LAN1, LAN2, LAN3 to a second network device B of a second Ethernet LAN LAN2 of the two or more Ethernet LANs LAN1, LAN2, LAN3. A second data connection 50AC is established from the first network device A to a third network device C of the second Ethernet LAN LAN2. A third data connection 50AD is established from the first network device A to a fourth network device D of a third Ethernet LAN LAN3 of the two or more Ethernet LANs LAN1, LAN2, LAN3.
Path metering units 26 of the customer edge devices CE1, CE2, CE3 measure 51 path characteristics of the data transmission paths 50AB, 50AC, 50AD from all known other customer edge devices CE1, CE2, CE3, e.g. by measuring packet loss, optionally also packet delay, and send 52 the measured path characteristics to the centralised server 10, e.g. in the form of a path characteristics report. The centralised server 10 establishes topology characteristics regarding the data transmission, i.e. communication, between the three Ethernet LANs LAN1, LAN2, LAN3 on the basis of the received path characteristics. The centralised server 10 announces 53 the established topology characteristics to the customer edge devices CE1, CE2, CE3. At least one of the customer edge devices CE1, CE2, CE3 makes use of this information in subsequent routing decisions.
The method uses the centralised server 10 to distribute delay and load information for all paths 50AB, 50AC, 50AD, in order to enable optimized overlay routing as described below. This measurement uses the following techniques:
Of three ongoing data transmission paths 60AB, 60AC, 60AD, two paths 60AB, 60AC suffer from a congestion 61 in the interconnecting network N, namely a first path 60AB between the network device A in a first Ethernet LAN LAN1 and a second network device B of a second Ethernet LAN LAN2, and a second path 60AC between the network device A in the first Ethernet LAN LAN1 and a third network device C of the second Ethernet LAN LAN2. From path measurements, e.g. from ICMP pings, by means of a path metering unit 26 of the customer edge device CE1 connecting the first Ethernet LAN LAN1 to the interconnecting network N, the customer edge device CE1 notices 62 a loss and/or delay of Ethernet packets transmitted on these congested paths 60AB, 60AC. Alternatively, the problem could also be noticed by CE2. Triggered by a corresponding control message reporting the congestion sent via the control connection from the customer edge device CE1 to the centralised server 10, the centralised server 10, based on its established topology characteristics of the overlay network, announces 63 that the third ongoing data transmission path 60AD between the network device A in a path between a third customer edge device CE3 of a third Ethernet LAN LAN3 and a second customer edge device CE2 of the second Ethernet LAN LAN2 is not congested.
Consequently, the first customer edge device CE1 of the first Ethernet LAN LAN1 sends 64 at least a part of the data traffic from the congested data transmission paths 60AB, 60AC, namely the data traffic from the congested data transmission path 60AB, to the third customer edge device CE3. Subsequently, the third customer edge device CE3 forwards 65 the packets towards their final destination, i.e. to the second customer edge device CE2. This can be achieved by decapsulating the received Ethernet packets and encapsulating them again with the new destination address. This way the data traffic between the network devices A and B is re-routed 66 via the third customer edge device CE3.
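For illustration, the choice between the direct path and a path via one intermediate customer edge device may be sketched as follows. The loss figures are invented example values, and combining per-hop losses under an independence assumption is one possible metric chosen for this sketch, not a metric prescribed by the embodiments.

```python
from typing import Dict, List, Tuple

Loss = Dict[Tuple[str, str], float]  # (from CE, to CE) -> measured loss ratio


def path_loss(loss: Loss, path: List[str]) -> float:
    """Combined loss ratio of a path, assuming independent per-hop losses."""
    delivered = 1.0
    for a, b in zip(path, path[1:]):
        delivered *= 1.0 - loss[(a, b)]
    return 1.0 - delivered


def best_path(loss: Loss, src: str, dst: str, ces: List[str]) -> List[str]:
    """Pick the direct path or the best path via exactly one relay."""
    candidates = [[src, dst]] + [[src, r, dst] for r in ces if r not in (src, dst)]
    return min(candidates, key=lambda p: path_loss(loss, p))


# Illustrative topology characteristics as announced by the centralised server.
topology = {
    ("CE1", "CE2"): 0.20,  # congested direct path
    ("CE1", "CE3"): 0.01,
    ("CE3", "CE2"): 0.01,
}
chosen = best_path(topology, "CE1", "CE2", ["CE1", "CE2", "CE3"])
```

With these example values the path via CE3 is selected. Restricting the candidates to at most one relay is a simplification of this sketch.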
Embodiments of the invention achieve an overlay multi-hop routing. Such overlay routing is not considered by prior art data center interconnect solutions. Multi-hop routing in the overlay between the sites can work around congestion or suboptimal IP routing on the direct path, if there are more than two sites attached to the overlay. This preferably triangular re-routing can result in a larger delay, but still may be beneficial to improve the overall throughput. Yet, a fundamental challenge is loop prevention. The overlay routing in ECO is realized as follows:
Number | Date | Country | Kind |
---|---|---|---|
11005588.6 | Jul 2011 | EP | regional |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP2012/062126 | 6/22/2012 | WO | 00 | 12/20/2013 |