The present disclosure relates generally to computer networking, and, more particularly, to association of network segments with respective spanning trees in a computer network.
Cloud computing service models, and in particular infrastructure as a service (IaaS) offerings, are becoming increasingly prevalent. To support such offerings, providers may need to implement a large number of different network segments, for example, significantly more than 4096 network segments. In order to meet the segmentation requirements, various segmentation technologies may be utilized. However, one issue confronted with employing certain segmentation technologies is how to associate network segments with respective spanning trees of a spanning tree protocol (STP). Some potential association techniques may undesirably cause frames of different customers to be linked to the same topology, and prevent load balancing on a segment-by-segment basis. Other potential association techniques may lack scalability, since, when scaled, they may call on devices to establish and maintain a number of loop free topologies beyond their computing power. Accordingly, there is a need for improved techniques.
The embodiments described herein may be better understood by referring to the accompanying drawings in which like reference numerals indicate identically or functionally similar elements, of which:
According to an embodiment of the present disclosure, a first number of multiple spanning tree instances (MSTIs) are defined within a network. A second number of network segments associated with segmentation identifiers (IDs) are also configured, where the first number of MSTIs is less than the second number of segmentation IDs. A memory maintains segmentation ID to MSTI mappings that map each defined segmentation ID of the second number of network segments to one of the first number of MSTIs. A processor computes a segmentation mapping digest of the segmentation ID to MSTI mappings. Multiple spanning tree (MST) bridge protocol data units (BPDUs) are broadcast that include the digest of the segmentation ID to MSTI mappings.
A layer-2 network is a collection of nodes, such as bridges and switches, interconnected by links, that transports data frames using protocols associated with the Data Link Layer of the Open Systems Interconnection (OSI) Reference Model. Layer-2 networks typically provide the ability to establish dedicated point-to-point links between two nodes, as well as to provide shared media links where nodes at least appear to share a common physical media, for example, an Ethernet local area network (LAN). Layer-2 networks generally rely on hardware-based addresses, such as media access control (MAC) addresses from nodes' network interface cards (NICs), to decide where to forward frames.
A variety of control protocols may be implemented in layer-2 networks to manage the forwarding of frames. These protocols may include various spanning tree protocols (STPs), such as rapid spanning tree protocol (RSTP) which was standardized in IEEE Std 802.1D-2004 and multiple spanning tree protocol (MSTP) which was standardized in IEEE Std 802.1Q, as well as a variety of other protocols.
Since layer-2 networks often include redundant links, one function of some control protocols, and in particular STPs, is to calculate one or more active network topologies that are loop-free, thereby breaking any loops that may exist in the network. For example, RSTP calculates and utilizes a single active network topology that is loop-free. In RSTP, ports of switches are assigned port roles that include the Root Port role, the Designated Port role, the Alternate Port role and the Backup Port role. They are also assigned port states that include a discarding state, a learning state, and a forwarding state. RSTP elects a single node within the network to be the Root. All ports on the Root are assigned the Designated Port role. For each non-Root node, the port offering the best (e.g., lowest cost) path to the Root is assigned the Root Port role. For each non-Root node, the port offering an alternative (e.g., higher cost) path to the Root is assigned the Alternate Port role. For each LAN, the one port through which the lowest cost path to the Root is provided to the LAN is assigned the Designated Port role, while other ports through which an alternative (e.g., higher cost) path to the Root is provided may be assigned the Alternate Port role. Those ports that have been assigned the Root Port role and the Designated Port role are placed in the forwarding state. Further, ports assigned to the Alternate Port role and the Backup Port role are placed in the discarding state. In some cases, ports may be rapidly transitioned between certain states.
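By way of illustration only, the relationship between port roles and their steady-state port states described above may be sketched as follows. The names PortRole, PortState and steady_state are hypothetical and chosen for this example; the sketch omits the transient learning state and the role election itself.

```python
from enum import Enum


class PortRole(Enum):
    ROOT = "root"
    DESIGNATED = "designated"
    ALTERNATE = "alternate"
    BACKUP = "backup"


class PortState(Enum):
    DISCARDING = "discarding"
    LEARNING = "learning"
    FORWARDING = "forwarding"


def steady_state(role: PortRole) -> PortState:
    """Return the port state a role settles into once the topology converges.

    Root and Designated ports forward; Alternate and Backup ports discard.
    The intermediate learning state is not modeled here.
    """
    if role in (PortRole.ROOT, PortRole.DESIGNATED):
        return PortState.FORWARDING
    return PortState.DISCARDING
```

For example, steady_state(PortRole.ALTERNATE) yields the discarding state, which is what keeps a redundant path from forming a loop.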
MSTP builds upon the basic operation of RSTP to calculate and utilize multiple active network topologies that are loop-free. In MSTP, a network is organized into regions. Within each region, MSTP establishes an Internal Spanning Tree (IST) which provides connectivity to all switches within the respective region and to the ISTs established within other regions. The IST established within each MSTP region also provides connectivity to the one Common Spanning Tree (CST) established outside of the MSTP regions. The IST of a given MST Region receives and sends BPDUs to the CST. Collectively a Common and Internal Spanning Tree (CIST) is formed.
Within each MST region, the switches establish multiple active topologies, each of which is referred to as a multiple spanning tree instance (MSTI). Each MSTI is identified by a corresponding multiple spanning tree instance identifier (MSTID) and is basically a simple RSTP instance that exists only inside the respective region.
To obtain the information necessary to run STPs, such as RSTP and MSTP, switches typically exchange special messages, referred to as configuration bridge protocol data units (BPDUs). BPDUs carry information that may be used by recipients in making port role decisions and, in general, in converging upon a loop-free active topology. In MSTP, switches do not send separate BPDUs for each MSTI. Instead, every MSTP BPDU carries the information needed to compute the active topology for all of the MSTIs defined within the respective region. MST BPDUs also typically carry a MST Configuration Identifier (ID) that, among other things, is used by switches to determine if they are in the same MST Region.
Layer-2 networks are increasingly being deployed in environments that stress their capabilities. To address these challenges, new architectures are being deployed in connection with layer-2 networks. One such architecture is cloud switching. Cloud switching architectures (or simply “cloud switches”) typically include a large number of individual switches (referred to herein as “leaf switches”) interconnected by a high-speed interconnect and administered collectively as virtual switches. A cloud switch, through its constituent leaf switches, may provide thousands of external ports to support demanding layer-2 networks.
In order to manage traffic among cloud switch domains, the cloud switch 100 may implement internal logical shared media links among the cloud switch domains. These logical shared media links are referred to herein as “bConnects.” Each cloud switch domain is allowed to have only one logical port coupled to a particular bConnect. Cloud switch domains coupled to the same bConnect are permitted to pass data frames between each other through the fabric interconnect 120. Cloud switch domains that are coupled to different bConnects are prohibited from exchanging data frames with each other through the fabric interconnect 120. They may, however, exchange data frames with each other over external connections, i.e., connections external to the cloud switch 100, for example, provided by one or more external devices. Use of bConnects prevents loops within the cloud switch 100. However, a separate loop prevention mechanism is typically still utilized to prevent external loops, i.e., loops through external devices. Accordingly, a STP is typically still executed to break these external loops.
Cloud switches are often called upon to support a cloud computing service model, where a customer is supplied on-demand access to a shared pool of configurable computing resources that can be rapidly provisioned and released. In particular, cloud switches may be suited for providing the cloud offering commonly referred to as infrastructure as a service (IaaS). In IaaS, customers are provided with network, computing and storage resources, on which they are able to deploy and run arbitrary software, including desired operating systems and applications. IaaS offerings generally deploy software within a structure referred to as a virtual private cloud. Applications are deployed within the virtual private cloud using a container referred to as a virtual datacenter. A virtual datacenter is generally a collection of virtual machines (VMs) interconnected using network segments. In this context, a network segment is a partition of infrastructure within which resources may be shared and upon which network communication policies may be enforced. In a configuration with many customers and many virtual datacenters, there may be a need for a large number of different network segments, for example, significantly more than 4096 network segments.
In order to meet the segmentation requirements of such configurations, virtual network (VN) segment technology may be utilized. With VN-segment technology, segmentation is implemented through use of a segmentation ID. In one implementation, the segmentation ID is a 24-bit ID, formed from the concatenation of two 12-bit VLAN IDs provided by two IEEE 802.1Q VLAN tags. Unlike certain other segmentation techniques, such as QinQ, in a VN-segment paradigm the entire segmentation ID may be used for each forwarding decision, rather than a single constituent VLAN identifier. In such manner, certain efficiencies may be achieved.
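By way of illustration only, the formation of a 24-bit segmentation ID from two 12-bit VLAN IDs may be sketched as follows. The function names are hypothetical, and placing the outer tag's VLAN ID in the high-order 12 bits is an assumption of this example; the description above does not fix the order of concatenation.

```python
VLAN_ID_BITS = 12
VLAN_ID_MASK = (1 << VLAN_ID_BITS) - 1  # 0xFFF


def make_segmentation_id(outer_vid: int, inner_vid: int) -> int:
    """Concatenate two 12-bit VLAN IDs into one 24-bit segmentation ID.

    Assumption: the outer tag's VLAN ID forms the high-order bits.
    """
    for vid in (outer_vid, inner_vid):
        if not 0 <= vid <= VLAN_ID_MASK:
            raise ValueError(f"VLAN ID {vid} does not fit in 12 bits")
    return (outer_vid << VLAN_ID_BITS) | inner_vid


def split_segmentation_id(seg_id: int) -> tuple[int, int]:
    """Recover the (outer, inner) VLAN IDs from a 24-bit segmentation ID."""
    return (seg_id >> VLAN_ID_BITS) & VLAN_ID_MASK, seg_id & VLAN_ID_MASK
```

In this sketch a forwarding decision would key on the full value returned by make_segmentation_id, rather than on either constituent VLAN ID alone. Two 12-bit fields give 2^24 (roughly 16.7 million) possible segmentation IDs, compared with 4096 for a single VLAN ID.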
One issue confronted with employing VN-segment technology is how to associate segments with respective spanning trees. While one may attempt to associate spanning trees, for example MST instances, with one of the constituent VLAN IDs (e.g., the VLAN ID of the outer IEEE 802.1Q VLAN tag 430), such an approach has a number of shortcomings. In VN-segment technology, the segmentation ID 470 is a flat structure, with no limitation on how customers are allocated segmentation IDs. Using one of the constituent VLAN IDs (e.g., the VLAN ID of the outer IEEE 802.1Q VLAN tag 430) would cause all network segments whose segmentation ID incorporates the VLAN ID to use the same loop-free topology. Due to the flat nature of the segmentation IDs, this may undesirably lead to frames of different customers being linked to the same topology. In general, load balancing would not be possible on a segment-by-segment basis.
Further, while one may attempt to run RSTP per segment, such that each segment has a unique RSTP loop free topology, such an approach lacks scalability. In a typical cloud computing service model, there may be thousands or even hundreds of thousands of customers, leading to potentially millions of network segments. Many switches may lack the computing power necessary to establish and maintain an individual RSTP loop free topology for each of these network segments.
In one embodiment of the present disclosure, network segments are mapped to a limited number of loop free topologies in a flexible and scalable manner. Each domain of a cloud switch may execute a MST protocol, using MST protocol process 370, to define a first number of MSTIs. Each cloud switch domain executes a VN-segment process 375 to configure a second number of network segments that are each associated with respective segmentation IDs. A segmentation ID to MSTI mapping table 600 is maintained, which maps each defined segmentation ID of the second number of network segments to one of the first number of MSTIs. In typical implementations, the first number of MSTIs is significantly less than the second number of segmentation IDs, such that multiple segments share at least some of the MSTIs.
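By way of illustration only, a segmentation ID to MSTI mapping table of the kind described above may be sketched as a many-to-one lookup. The class SegmentMstiTable and its methods are hypothetical names for this example, and the particular assignments shown in the usage note are arbitrary; the description does not prescribe how individual segments are assigned to MSTIs.

```python
class SegmentMstiTable:
    """Many-to-one mapping from segmentation IDs to a small set of MSTIs."""

    def __init__(self, msti_ids):
        # The "first number" of MSTIs defined by the MST protocol process.
        self.msti_ids = frozenset(msti_ids)
        self._seg_to_msti = {}

    def assign(self, seg_id: int, msti_id: int) -> None:
        """Map one segmentation ID to one already-defined MSTI."""
        if msti_id not in self.msti_ids:
            raise ValueError(f"MSTI {msti_id} is not defined")
        self._seg_to_msti[seg_id] = msti_id

    def msti_for(self, seg_id: int) -> int:
        """Return the MSTI whose topology carries the given segment."""
        return self._seg_to_msti[seg_id]

    def segments_on(self, msti_id: int) -> list[int]:
        """Return, in ascending order, the segments sharing one MSTI."""
        return sorted(s for s, m in self._seg_to_msti.items() if m == msti_id)
```

As a usage example, a table built with SegmentMstiTable([1, 2]) could assign segments 0x001002 and 0x002002 to MSTI 1 and segment 0x001003 to MSTI 2. Segments 0x001002 and 0x001003 then use different loop free topologies even though their high-order 12 bits are identical, which an association keyed on a single constituent VLAN ID could not express.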
Each cloud switch domain may compute a digest of the domain's segmentation ID to MSTI mapping table (herein referred to as a “segmentation mapping digest”), and broadcast modified MST BPDUs that include the segmentation mapping digest. Further, each cloud switch domain receives modified MST BPDUs from other cloud switch domains. Upon receipt of a modified MST BPDU from a particular other cloud switch domain, the contained segmentation mapping digest is compared with the digest of the local segmentation ID to MSTI mapping table. If there is a match, the particular other cloud switch domain is determined to belong to the same MST region as the local cloud switch domain. Traffic exchanged with the particular other cloud switch domain may be load balanced among the MSTIs according to the mappings in the segmentation ID to MSTI mapping table 600. For each port, per-segment blocking logic may be configured based on the spanning tree port states of the MSTI used for the corresponding segment. Conversely, if there is no match, the particular other cloud switch domain is determined to belong to a different MST region than the local domain. Traffic exchanged with the particular other cloud switch domain is caused to utilize the CIST defined by the MST protocol, without load balancing.
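By way of illustration only, the decision made upon receipt of a modified MST BPDU may be sketched as follows. The function select_tree and its return convention are hypothetical, the digests are treated as opaque byte strings, and the mapping table is represented as a plain dictionary; the sketch considers only the segmentation mapping digest and ignores any other fields a device may also compare.

```python
def select_tree(local_digest: bytes, received_digest: bytes,
                seg_id: int, seg_to_msti: dict):
    """Choose the spanning tree used for a segment toward a neighbor domain.

    Matching digests place the neighbor in the same MST region, so the
    segment follows its mapped MSTI. Otherwise the neighbor is in a
    different region and the segment falls back to the CIST.
    """
    if local_digest == received_digest:
        return ("MSTI", seg_to_msti[seg_id])
    return ("CIST", None)
```

With matching digests, two segments mapped to different MSTIs are carried over different topologies toward the same neighbor, which is the load balancing referred to above. With mismatched digests, every segment resolves to the single CIST.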
A cryptographic hash may be applied to the segmentation ID to MSTI mapping table 600 to produce a digest. In one embodiment, due to the potentially large size of the segmentation ID to MSTI mapping table 600, and in an effort to reduce the chance of collisions, a hash-based message authentication code secure hash algorithm 256 (HMAC-SHA256) algorithm may be employed to generate a 256-bit value. Alternatively, a traditional hash-based message authentication code message-digest 5 (HMAC-MD5) algorithm may be employed to generate a 128-bit value. It should be understood that a variety of other digest generating algorithms may also be employed.
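By way of illustration only, computing a 256-bit segmentation mapping digest with HMAC-SHA256 may be sketched as follows. The key value, the byte layout of each table entry (a 3-byte segmentation ID followed by a 2-byte MSTID, both big-endian) and the ascending ordering of entries are all assumptions of this example; the description above does not specify them. All domains would need to agree on these details for their digests to be comparable.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; not taken from the description.
ASSUMED_KEY = b"example-segmentation-digest-key"


def serialize_table(seg_to_msti: dict) -> bytes:
    """Encode the mapping deterministically, in ascending segmentation ID order.

    Assumed layout per entry: 3-byte segmentation ID, then 2-byte MSTID.
    """
    out = bytearray()
    for seg_id in sorted(seg_to_msti):
        out += seg_id.to_bytes(3, "big")
        out += seg_to_msti[seg_id].to_bytes(2, "big")
    return bytes(out)


def segmentation_mapping_digest(seg_to_msti: dict,
                                key: bytes = ASSUMED_KEY) -> bytes:
    """Return a 32-byte (256-bit) HMAC-SHA256 digest of the mapping table."""
    return hmac.new(key, serialize_table(seg_to_msti), hashlib.sha256).digest()
```

Sorting the entries before hashing means two domains configured with the same mappings in a different order still produce the same digest. Substituting hashlib.md5 for hashlib.sha256 in this sketch would yield the 128-bit HMAC-MD5 alternative mentioned above.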
In summary, the present disclosure describes embodiments that flexibly and scalably map network segments to a limited number of loop free topologies, in part by configuring segmentation ID to MSTI mapping tables and distributing digests thereof. It should be understood that various adaptations and modifications may be readily made.
For example, while it is described above that certain operations are implemented in cloud switch domains, it should be understood that the techniques may be implemented in a variety of other types of network structures or nodes, which may not be part of a cloud switch. These network structures or nodes may be part of another type of data center architecture that utilizes a large number of network segments, or some other type of architecture.
Further, while it is described above that the segmentation ID is a 24-bit ID used with VN-segment technology, formed from the concatenation of two 12-bit VLAN IDs provided by two IEEE 802.1Q VLAN tags, it should be understood that the segmentation ID may be used with other technologies, and composed of a different number of bits or characters. For example, the segmentation ID may be a 36-bit ID, formed, for example, from the concatenation of three 12-bit VLAN IDs provided by three IEEE 802.1Q VLAN tags. Alternatively, the segmentation ID may be a 36-bit ID stored in a single dedicated field of frames. Alternatively, the segmentation ID may be a string of characters, stored in some manner within frames.
Still further, it should be understood that at least some of the above-described embodiments may be implemented in software, in hardware, or a combination thereof. A software implementation may include computer-executable instructions stored in a non-transitory computer-readable medium, such as a volatile or persistent memory, a hard-disk, a compact disk (CD), or other tangible medium. A hardware implementation may include configured processors, logic circuits, application specific integrated circuits, and/or other types of hardware components. Further, a combined software/hardware implementation may include both computer-executable instructions stored in a non-transitory computer-readable medium, as well as one or more hardware components, for example, processors, memories, etc. The above descriptions are meant to be taken only by way of example. It is the object of the appended claims to cover all such variations and modifications as come within the true spirit and scope of the embodiments herein.