1. Field
The present disclosure relates to network management. More specifically, the present disclosure relates to a method and system for remote load balancing in high-availability networks.
2. Related Art
Currently, end stations in layer-2 networks have not been able to take advantage of the routing functionalities available in such networks. End stations can typically only operate as leaf nodes and are often constrained to an interface with only one of the routing nodes. Even when an end station is interfaced with two or more routing nodes, other routing nodes in the network can send data to that end station only via one routing node to which the end station is connected.
Meanwhile, layer-2 networking technologies continue to evolve. More routing functionalities, which have traditionally been the characteristics of layer-3 (e.g., IP) networks, are migrating to layer-2. Notably, the recent development of the Transparent Interconnection of Lots of Links (TRILL) protocol allows Ethernet switches to function more like routing nodes. TRILL overcomes the inherent inefficiency of the conventional spanning tree protocol, which forces layer-2 switches to be coupled in a logical spanning-tree topology to avoid looping. TRILL allows routing bridges (RBridges) to be coupled in an arbitrary topology without the risk of looping by implementing routing functions in switches and including a hop count in the TRILL header.
However, there is currently no support of remote load balancing on data paths leading to a destination device coupled to at least two separate egress switching devices in a TRILL network.
One embodiment of the present invention provides a system for facilitating remote load balancing in a high-availability network. During operation, the system receives a plurality of data frames destined for a destination device, wherein the destination device is coupled to a network via a trunk link, the trunk link coupling the destination device to at least two separate egress switching devices. The system then forwards the data frames via at least two data paths, each of which leads to a respective egress switching device.
In a variation on this embodiment, the system forwards a data frame via a respective data path by placing a respective egress switching device's identifier in the header of the frame.
In a variation on this embodiment, the switching devices are routing bridges capable of routing data frames without requiring the network topology to be a spanning tree topology.
In a variation on this embodiment, the trunk link is associated with a virtual identifier.
In a further variation, the virtual identifier is a virtual routing bridge identifier based on the TRILL protocol.
In a variation on this embodiment, the system selects a respective data path based on a hash value computed on at least one field in the data frame header, thereby achieving load balancing among the different data paths.
In a variation on this embodiment, the system selects a respective data path based on a predetermined load distribution, thereby achieving load balancing among the different data paths.
In a variation on this embodiment, the system selects next-hop switching devices corresponding to different data paths for forwarding the data frames, thereby achieving load balancing among the different data paths.
In a variation on this embodiment, in response to detecting a failure of a link between the destination device and an egress switching device, the system advertises non-reachability to that egress switching device.
The following description is presented to enable any person skilled in the art to make and use the invention, and is provided in the context of a particular application and its requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present invention. Thus, the present invention is not limited to the embodiments shown, but is to be accorded the widest scope consistent with the claims.
In embodiments of the present invention, the problem of remote load balancing on data paths leading to a destination host which is coupled to at least two separate egress RBridges in a TRILL network is solved by replacing the destination's virtual RBridge ID with a respective egress RBridge ID in the header of the data frame. The data frames are thus forwarded to the destination host via at least two data paths, each of which leads to a respective egress RBridge.
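As a non-limiting illustration, the header rewrite described above can be sketched in Python as follows. The `Frame` type, the `rewrite_egress` function, the mapping table, and the `select` callback are hypothetical names introduced only for this sketch, and the identifiers 161, 162, 164, 165, and 180 are used purely as example values. The sketch assumes the ingress device already knows which physical RBridges are associated with a given virtual RBridge ID.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Frame:
    """Simplified frame model (assumption): only the fields needed here."""
    ingress_id: int   # ingress RBridge identifier
    egress_id: int    # egress RBridge identifier, possibly a virtual one
    payload: bytes


def rewrite_egress(
    frame: Frame,
    virtual_to_physical: Dict[int, List[int]],
    select: Callable[[List[int]], int],
) -> Frame:
    """Replace a virtual egress RBridge ID with one physical egress RBridge ID.

    Frames whose egress ID is not a known virtual RBridge ID are returned
    unchanged. The selection policy is supplied by the caller.
    """
    candidates = virtual_to_physical.get(frame.egress_id)
    if not candidates:
        return frame
    return replace(frame, egress_id=select(candidates))


# Example: virtual RBridge 180 is reachable via physical RBridges 162, 164, 165.
mapping = {180: [162, 164, 165]}
frame = Frame(ingress_id=161, egress_id=180, payload=b"data")
rewritten = rewrite_egress(frame, mapping, select=lambda ids: ids[1])
```

Passing the selection policy as a callback keeps the rewrite step independent of whichever frame distribution policy is in use.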
For example, in a layer-2 network running the TRILL protocol, when a host is coupled to one or more routing bridges (RBridges), a virtual TRILL RBridge identifier is assigned to this host. The host is then considered to be a virtual RBridge capable of running the TRILL protocol. The assignment of a virtual RBridge identifier allows a non-TRILL-capable host to participate in the routing domain of a TRILL network, and to be coupled to multiple RBridges in an arbitrary topology. Such a configuration provides tremendous flexibility and facilitates high availability in case of both link and node failures. For instance, an end station with a virtual RBridge identifier can be coupled to two or more physical RBridges using link aggregation. The physical RBridges can advertise connectivity to the virtual RBridge to their neighbor RBridges. Consequently, other RBridges in the TRILL network can reach this host through multiple data paths by specifying any respective physical RBridge IDs coupled to the virtual RBridge as egress points. Moreover, when one of the aggregated links fails, the affected end station can continue operating via the remaining link(s). For the rest of the TRILL network, the host with a virtual RBridge ID remains reachable.
Although this disclosure is presented using examples based on the TRILL protocol, embodiments of the present invention are not limited to TRILL networks, or networks defined in a particular Open System Interconnection Reference Model (OSI reference model) layer. In particular, although the term “layer-2” is mentioned several times in the examples, embodiments of the present invention are not limited to application to layer-2 networks. Other networking environments, either defined in OSI layers or other layering models, or not defined with any layering model, can also use the disclosed embodiments. For instance, these embodiments can apply to Multiprotocol Label Switching (MPLS) networks as well as Storage Area Networks (e.g., Fibre Channel networks).
Furthermore, although intermediate-system-to-intermediate-system (IS-IS) routing protocol is used in the TRILL examples, embodiments of the present invention are not limited to a particular routing protocol. Other routing protocols, such as Open Shortest Path First (OSPF), Routing Information Protocol (RIP), Interior Gateway Routing Protocol (IGRP), Enhanced IGRP (EIGRP), Border Gateway Protocol (BGP), or other open or proprietary protocols can also be used. In addition, embodiments of the present invention are not limited to the TRILL frame encapsulation format. Other open or proprietary encapsulation formats and methods can also be used.
The term “RBridge” refers to routing bridges, which are bridges implementing the TRILL protocol as described in IETF draft “RBridges: Base Protocol Specification,” available at http://tools.ietf.org/html/draft-ietf-trill-rbridge-protocol-14, which is incorporated by reference herein. Embodiments of the present invention are not limited to application among RBridges. Other types of switches, routers, and forwarders can also be used.
The term “physical RBridge” refers to an RBridge running TRILL protocol, as opposed to a “virtual RBridge,” which refers to a non-TRILL end station with a virtual RBridge ID.
The term “virtual RBridge” refers to a non-TRILL end station with a virtual RBridge ID. The physical RBridge(s) to which the non-TRILL end station is coupled can advertise the connectivity to this end station as if it were a regular RBridge.
The term “multi-homed host” refers to a host that has an aggregate link to two or more TRILL RBridges, where the aggregate link includes multiple physical links to the different RBridges. The aggregate link functions as one logical link to the host. “Multi-homed host” may also refer to a host coupled to TRILL RBridges which do not form a logical link aggregation and do not form an association with each other. This could be the case where a host has multiple logical networking entities (an example is a virtualized server where different servers may be coupled to different networks through different network ports in the system). A single host can have multiple virtual RBridge identifier assignments.
The term “frame” refers to a group of bits that can be transported together across a network. “Frame” should not be interpreted as limiting embodiments of the present invention to layer-2 networks. “Frame” can be replaced by other terminologies referring to a group of bits, such as “packet,” “cell,” or “datagram.”
The term “RBridge identifier” refers to a group of bits that can be used to identify an RBridge. Note that the TRILL standard uses “RBridge ID” to denote the 48-bit intermediate-system-to-intermediate-system (IS-IS) System ID assigned to an RBridge, and “RBridge nickname” to denote the 16-bit value that serves as an abbreviation for the “RBridge ID.” The “RBridge identifier” used in this disclosure is not limited to any bit format, and can refer to “RBridge ID,” “RBridge nickname,” or any other format that can identify an RBridge.
Without virtual RBridge identifier assignment, host 170 would be “transparent” to the rest of the TRILL network. The frames sent from host 170 to the TRILL network are native Ethernet frames. An RBridge in the TRILL network would associate the Media Access Control (MAC) addresses for host 170 with an ingress RBridge (i.e., the first RBridge in the TRILL network that receives these Ethernet frames). In addition, without virtual RBridge identifier assignment, the multi-homing-style connectivity would not provide the desired result, because the TRILL protocol depends on MAC address learning to determine the location of end stations (i.e., to which ingress RBridge an end station is coupled) based on a frame's ingress TRILL RBridge ID. As such, a host can only appear to be reachable via a single physical RBridge. For example, assume that host 150 is in communication with host 170. When RBridge 161 receives frames from host 170 and performs MAC address learning, RBridge 161 would assume that the host is coupled to one of RBridges 162, 164, or 165. Consequently, only one of the physical links leading to host 170 is used for subsequent traffic from host 150 to host 170.
Host 170 has its links to RBridges 162, 164, and 165 configured as a link aggregation (LAG). In other words, host 170 can distribute ingress traffic entering the TRILL network among the three links using link aggregation techniques. Such techniques can include any multi-chassis trunking techniques. In addition, RBridges 162, 164, and 165 are configured to process ingress frames from host 170 such that these frames will have the virtual RBridge nickname in their TRILL header as the ingress RBridge. When these frames are forwarded to the rest of the TRILL network with their respective TRILL headers, other RBridges in the network treat them as originating from virtual RBridge 180.
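As a non-limiting illustration, the ingress processing described above can be sketched in Python as follows. The header layout used here (a 16-bit word whose low six bits carry the hop count, followed by a 16-bit egress nickname and a 16-bit ingress nickname) is a simplifying assumption for this sketch, with all other bits of the first word set to zero and the outer Ethernet header omitted. The function names and the use of reference numerals 162, 164, 165, and 180 as nickname values are likewise hypothetical.

```python
import struct

# Illustrative values only (assumption): reference numerals reused as nicknames.
VIRTUAL_NICKNAME = 180
LAG_MEMBERS = {162, 164, 165}   # physical RBridges sharing the aggregate link


def encapsulate_from_lag(native_frame: bytes, physical_rbridge: int,
                         egress_nickname: int, hop_count: int = 0x3F) -> bytes:
    """Prepend a simplified TRILL-style header to a native Ethernet frame.

    Whichever LAG member receives the frame, the ingress nickname written
    into the header is the virtual RBridge nickname, not the member's own.
    """
    if physical_rbridge not in LAG_MEMBERS:
        raise ValueError("RBridge is not a member of the link aggregation")
    if not 0 <= hop_count <= 0x3F:
        raise ValueError("hop count must fit in six bits")
    first_word = hop_count  # remaining bits of the first word left as zero
    header = struct.pack("!HHH", first_word, egress_nickname, VIRTUAL_NICKNAME)
    return header + native_frame


def ingress_nickname_of(trill_frame: bytes) -> int:
    """Read the ingress nickname back out of the simplified header."""
    _, _, ingress = struct.unpack("!HHH", trill_frame[:6])
    return ingress
```

Under these assumptions, a frame entering through any of the three physical RBridges carries the same ingress nickname, which is why remote RBridges learn the host's MAC address against the virtual RBridge rather than against one physical RBridge.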
During operation, each physical RBridge sends TRILL HELLO messages to its neighbor to confirm its health. Each RBridge also sends link state protocol data units (LSPs) to its neighbor, so that link state information can be exchanged and propagated throughout the TRILL network. As illustrated in
More details on multi-homed end stations and virtual RBridges can be found in U.S. application Ser. No. 12/725,249, filed 16 Mar. 2010, entitled “Redundant Host Connection in a Routed Network,” by inventors Somesh Gupta, Anoop Ghanwani, Phanidhar Koganti, and Shunjia Yu (Attorney Docket number BRCD-112-0439.US.NP) and U.S. application Ser. No. 12/730,749, filed 24 Mar. 2010, entitled “Method and System for Extending Routing Domain to Non-routing End Stations,” by inventors Pankaj K. Jha and Mitri Halabi (Attorney Docket number BRCD-3009.US.NP), the disclosures of which are incorporated by reference herein.
Load balancing at layer 2 allows traffic to be spread among multiple layer-2 data paths. In embodiments of the present invention, remote load balancing allows traffic sharing among multiple egress devices to which a destination host is coupled. For example, in the TRILL network shown in
In the above example illustrated in
Load balancing can be achieved by frame distribution policies. A simple example is a round-robin policy where, for each incoming frame destined to a multi-homed end station, a different egress RBridge is selected, so that frames are spread evenly across all links. Frame distribution policies can also rely on a hash method, which computes a hash value over certain fields in the frame header based on a load balancing configuration. Hash-based load balancing ensures that data path selections are consistent even when the list of available egress switching devices is modified in the network.
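As a non-limiting illustration, the two frame distribution policies described above can be sketched in Python as follows. The disclosure does not specify a particular hash method; this sketch assumes rendezvous (highest-random-weight) hashing as one way to keep a flow on the same egress RBridge when other egress RBridges are added or removed. The function names, the choice of SHA-256, and the contents of the flow key are assumptions made only for this example.

```python
import hashlib
from itertools import cycle
from typing import Iterator, Sequence


def round_robin(egress_ids: Sequence[int]) -> Iterator[int]:
    """Yield egress RBridge IDs in rotation, one per incoming frame."""
    return cycle(egress_ids)


def hash_select(flow_key: bytes, egress_ids: Sequence[int]) -> int:
    """Pick an egress RBridge for a flow using rendezvous hashing.

    Each candidate is scored by hashing the flow key together with the
    candidate's ID, and the highest score wins. Removing a candidate that
    was not the winner leaves the selection for this flow unchanged.
    """
    def score(egress_id: int) -> int:
        digest = hashlib.sha256(flow_key + egress_id.to_bytes(2, "big")).digest()
        return int.from_bytes(digest[:8], "big")

    return max(egress_ids, key=score)


# Example flow key (assumption): destination and source MAC addresses.
flow_key = bytes.fromhex("0a0b0c0d0e0f") + bytes.fromhex("010203040506")
chosen = hash_select(flow_key, [162, 164, 165])
```

Round-robin spreads frames evenly but may reorder frames within a flow, whereas a hash over header fields keeps all frames of a flow on one data path.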
One advantage of assigning a virtual RBridge identifier to a non-TRILL switch is to facilitate connectivity across multiple physical RBridges, which in turn provides protection against both link and node failures.
RBridge 665 may still receive some frames destined to host 670 before the TRILL network topology converges. Since RBridges 662 and 664 can both be used to reach host 670, RBridge 665 can forward these frames to RBridge 662 or 664. Thus, minimal service interruption can be achieved during link failure. Similarly, in the case of node failure (e.g., when RBridge 665 fails), host 670 can continue operation with virtual RBridge 680. Furthermore, RBridge 665 disassociates itself from virtual RBridge 680. The routing function distributes an update to the virtual RBridge-to-physical RBridge mapping information, so that virtual RBridge 680 is only associated with physical RBridges 662 and 664.
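As a non-limiting illustration, the update to the virtual RBridge-to-physical RBridge mapping described above can be sketched in Python as follows. The dictionary representation and the function name `disassociate` are assumptions made for this example; how the update is distributed through the routing protocol is outside the scope of the sketch.

```python
from typing import Dict, List


def disassociate(mapping: Dict[int, List[int]], virtual_id: int,
                 physical_id: int) -> Dict[int, List[int]]:
    """Return a new mapping in which physical_id no longer serves virtual_id.

    The input mapping is left unmodified so that the old and new views can
    coexist while the network converges.
    """
    updated = {vid: list(pids) for vid, pids in mapping.items()}
    updated[virtual_id] = [p for p in updated.get(virtual_id, []) if p != physical_id]
    return updated


# Example: RBridge 665 loses its link to the host behind virtual RBridge 680.
before = {680: [662, 664, 665]}
after = disassociate(before, virtual_id=680, physical_id=665)
```

After the update, any egress selection performed against the new mapping considers only RBridges 662 and 664.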
In summary, embodiments of the present invention provide a method and system for facilitating load balancing in a high-availability network. In one embodiment, a virtual RBridge is formed to accommodate an aggregate link from a host to multiple physical RBridges. Data frames are forwarded to the host via at least two data paths, each of which leads to a respective egress RBridge coupled to the host. Such a configuration provides a scalable and flexible solution to remote load balancing in a TRILL network.
The methods and processes described herein can be embodied as code and/or data, which can be stored in a computer-readable non-transitory storage medium. When a computer system reads and executes the code and/or data stored on the computer-readable non-transitory storage medium, the computer system performs the methods and processes embodied as data structures and code and stored within the medium.
The methods and processes described herein can be executed by and/or included in hardware modules or apparatus. These modules or apparatus may include, but are not limited to, an application-specific integrated circuit (ASIC) chip, a field-programmable gate array (FPGA), a dedicated or shared processor that executes a particular software module or a piece of code at a particular time, and/or other programmable-logic devices now known or later developed. When the hardware modules or apparatus are activated, they perform the methods and processes included within them.
The foregoing descriptions of embodiments of the present invention have been presented only for purposes of illustration and description. They are not intended to be exhaustive or to limit this disclosure. Accordingly, many modifications and variations will be apparent to practitioners skilled in the art. The scope of the present invention is defined by the appended claims.
This application claims the benefit of U.S. Provisional Application No. 61/427,437, Attorney Docket Number BRCD-3056.0.1.US.PSP, entitled “Method and System for Remote Load Balancing in High-Availability Networks,” by inventors John Michael Terry, Mandar Joshi, Phanidhar Koganti, Shunjia Yu, and Anoop Ghanwani, filed 27 Dec. 2010, the disclosure of which is incorporated by reference herein. The present disclosure is related to U.S. patent application Ser. No. 12/725,249, (attorney docket number BRCD-112-0439US), entitled “REDUNDANT HOST CONNECTION IN A ROUTED NETWORK,” by inventors Somesh Gupta, Anoop Ghanwani, Phanidhar Koganti, and Shunjia Yu, filed 16 Mar. 2010; and U.S. patent application Ser. No. 13/087,239, (attorney docket number BRCD-3008.1.US.NP), entitled “VIRTUAL CLUSTER SWITCHING,” by inventors Suresh Vobbilisetty and Dilip Chatwani, filed 14 Apr. 2011; the disclosures of which are incorporated by reference herein.
Number | Date | Country
---|---|---
61/427,437 | Dec 2010 | US