The disclosure relates generally to communication networks and, more specifically but not exclusively, to providing fault-resilient services within communication networks.
The use of fault resiliency in communication networks continues to grow. Fault resiliency may be used in various types of communication networks. Fault resiliency may be provided for broadcast services, multicast services, and unicast services.
Various deficiencies in the prior art are addressed by embodiments related to providing fault resiliency in communication networks.
In at least some embodiments, an apparatus includes a processor and a memory communicatively connected to the processor. The processor is configured to receive a graph representing a topology of at least a portion of a network, where the graph includes a set of nodes and a set of links and the set of nodes includes a multicast source node. The processor is configured to partition the graph into a pair of partitions including a first partition and a second partition, where the first partition includes each of the nodes of the set of nodes and a first subset of links of the set of links and the second partition includes each of the nodes of the set of nodes and a second subset of links of the set of links. The processor is configured to construct, based on the pair of partitions, a pair of point-to-multipoint (P2MP) trees including a first P2MP tree and a second P2MP tree, where the first P2MP tree is constructed based on the first partition and the second P2MP tree is constructed based on the second partition.
In at least some embodiments, a computer-readable storage medium stores instructions which, when executed by a computer, cause the computer to perform steps of a method. The method includes receiving a graph representing a topology of at least a portion of a network, where the graph includes a set of nodes and a set of links and the set of nodes includes a multicast source node. The method further includes partitioning the graph into a pair of partitions including a first partition and a second partition, where the first partition includes each of the nodes of the set of nodes and a first subset of links of the set of links and the second partition includes each of the nodes of the set of nodes and a second subset of links of the set of links. The method further includes constructing, based on the pair of partitions, a pair of P2MP trees including a first P2MP tree and a second P2MP tree, where the first P2MP tree is constructed based on the first partition and the second P2MP tree is constructed based on the second partition.
In at least some embodiments, a method includes using a processor and a memory to perform a set of steps. The method includes receiving a graph representing a topology of at least a portion of a network, where the graph includes a set of nodes and a set of links and the set of nodes includes a multicast source node. The method further includes partitioning the graph into a pair of partitions including a first partition and a second partition, where the first partition includes each of the nodes of the set of nodes and a first subset of links of the set of links and the second partition includes each of the nodes of the set of nodes and a second subset of links of the set of links. The method further includes constructing, based on the pair of partitions, a pair of P2MP trees including a first P2MP tree and a second P2MP tree, where the first P2MP tree is constructed based on the first partition and the second P2MP tree is constructed based on the second partition.
In at least some embodiments, an apparatus includes a processor and a memory communicatively connected to the processor. The processor is configured to receive a packet via a first redundant tree (RT) rooted at a source node and configured to serve a set of destination nodes, where the first RT has a first tree identifier associated therewith and the packet includes the first tree identifier of the first RT. The processor is configured to, based on detection of a failure on the first RT, modify the packet by replacing the first tree identifier with a second tree identifier of a second RT rooted at the source node and configured to serve the set of destination nodes. The processor is configured to propagate the modified packet via the second RT.
In at least some embodiments, a computer-readable storage medium stores instructions which, when executed by a computer, cause the computer to perform steps of a method. The method includes receiving a packet via a first RT rooted at a source node and configured to serve a set of destination nodes, where the first RT has a first tree identifier associated therewith and the packet includes the first tree identifier of the first RT. The method further includes, based on detection of a failure on the first RT, modifying the packet by replacing the first tree identifier with a second tree identifier of a second RT rooted at the source node and configured to serve the set of destination nodes. The method further includes propagating the modified packet via the second RT.
In at least some embodiments, a method includes using a processor and a memory to perform a set of steps. The method includes receiving a packet via a first RT rooted at a source node and configured to serve a set of destination nodes, where the first RT has a first tree identifier associated therewith and the packet includes the first tree identifier of the first RT. The method further includes, based on detection of a failure on the first RT, modifying the packet by replacing the first tree identifier with a second tree identifier of a second RT rooted at the source node and configured to serve the set of destination nodes. The method further includes propagating the modified packet via the second RT.
The teachings herein can be readily understood by considering the following detailed description in conjunction with the accompanying drawings, in which:
To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements common to the figures.
In general, various capabilities for providing fault-resilient services within communication networks are presented. The fault-resilient services may include broadcast services, multicast services, unicast services, or the like, as well as various combinations thereof. In at least some embodiments, a fault-resiliency capability supports local protection for communication services using redundant trees (RTs), such as local protection for unicast services, local protection for multicast services, or the like, as well as various combinations thereof. In at least some embodiments, a link-coloring capability supports establishment of RTs, such as RTs which may be used to support various types of communication services (e.g., unicast services, multicast services, broadcast services, or the like, as well as various combinations thereof). Various embodiments of such capabilities for providing fault-resilient services may be better understood when considered within the context of an exemplary communication network, such as the exemplary communication network depicted in
The communication network 100 includes a set of nodes 110, including a root node 110R (denoted as r) and seven leaf nodes 110L (denoted as a, b, c, d, e, f, and g). The nodes 110 are configured for transmitting and receiving traffic via communication paths. It is noted that the underlying communication links between the nodes 110 are omitted from
The communication network 100 is configured such that the nodes 110 support a pair of RTs 120. The pair of RTs 120 is configured such that each node 110 is associated with two redundant trees (illustratively, a primary RT 120P which is denoted by solid lines and a secondary RT 120S which is denoted by dashed lines) that induce two node-disjoint paths from root node 110R to each leaf node 110L. The pair of RTs 120 is configured such that, under any single failure scenario, each node 110 has reachability on at least one of the RTs 120. The pair of RTs 120 may be computed based on any suitable process for calculating a pair of RTs within a communication network. For example, the pair of RTs 120 may be computed based on one or more of the paper “Redundant Trees for Preplanned Recovery in Arbitrary Vertex-Redundant or Edge-Redundant Graphs” (by Medard et al., published in IEEE/ACM TRANSACTIONS ON NETWORKING), the paper entitled “Link-Coloring Based Scheme for Multicast and Unicast Protection” (by Bejerano et al., published in Bell-Labs TM ITD-10-49734J), embodiments of the link-coloring capability depicted and described herein with respect to
The communication network 100 may include any suitable type of communication network in which RTs may be used for transport of traffic. For example, communication network 100 may be a datacenter network in which root node r is a physical or virtual server and the leaf nodes are physical or virtual clients. For example, communication network 100 may be a content distribution network in which root node r is a content server and the leaf nodes are end user devices which may request content from the content server. The communication network 100 may be any other suitable type of communication network in which RTs may be used for transport of traffic. However, for purposes of clarity in describing various embodiments of the fault-resiliency capability, it is assumed that communication network 100 is an Ethernet-based network (e.g., a datacenter network using Ethernet) in which the pair of RTs is implemented as a pair of Virtual Local Area Network (VLAN) spanning trees (VSTs) having different VST Identifiers (VIDs).
As depicted in
As depicted in
As depicted in
As depicted in
It is noted that, although primarily depicted and described with respect to cases in which local protection of multicast traffic is possible, there may be cases in which local protection of multicast traffic is not possible. For example, a given node will not be able to support local protection of multicast traffic where a failure adjacent to the node (e.g., a failure of an adjacent link or node) is associated with both the primary VST and the secondary VST in opposite directions. An example of this case is node f and adjacent link (c,f) depicted in
It is noted that, although primarily depicted and described herein with respect to embodiments in which the VIDs of the primary and secondary VSTs differ by only a single bit, in at least some embodiments the VIDs of the primary and secondary VSTs may differ by more than a single bit, in which case modification of a packet to switch between the VID of the primary VST and the VID of the secondary VST may require more than changing a single bit (e.g., changing multiple bit positions, removing the VID of the primary VST and replacing it with the VID of the secondary VST, or the like).
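The single-bit case noted above may be illustrated by a minimal sketch. The bit position used to distinguish the two VSTs, the dictionary-based packet, and the names switch_tree and forward are assumptions made only for illustration; the 12-bit mask reflects the size of the VLAN ID field in an IEEE 802.1Q tag.

```python
VID_MASK = 0x0FFF      # the VLAN ID field of an 802.1Q tag is 12 bits wide
PROTECTION_BIT = 0     # assumption: the two VIDs of a pair differ in this bit only


def switch_tree(vid: int, bit: int = PROTECTION_BIT) -> int:
    """Return the VID of the other tree of the pair by flipping one bit."""
    return (vid ^ (1 << bit)) & VID_MASK


def forward(packet: dict, failure_on_current_tree: bool) -> dict:
    """Re-tag a packet onto the other tree when a local failure is detected."""
    if failure_on_current_tree:
        return {**packet, "vid": switch_tree(packet["vid"])}
    return packet


# Hypothetical VIDs: 100 for the primary VST, 101 for the secondary VST.
rerouted = forward({"vid": 100, "payload": b"data"}, failure_on_current_tree=True)
```

Where the two VIDs differ in more than one bit, the same step could instead use a lookup from each VID to its counterpart.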
As discussed above, in at least some embodiments a link-coloring capability supports establishment of RTs, such as RTs which may be used to support various types of communication services (e.g., unicast services, multicast services, broadcast services, or the like, as well as various combinations thereof). For example, the link-coloring capability (or other link coloring techniques) may be used to determine the VST pairs depicted and described with respect to
In at least some embodiments, a computationally-efficient link-coloring procedure facilitates determination of RTs which may be used to provide end-to-end protection for dynamic multicast and unicast connections. In general, a pair of RTs connects the source of a multicast connection to each destination of the multicast connection in such a way that, in the event of a single link or node failure associated with the multicast connection, each destination of the multicast connection is still connected to the source of the multicast connection in at least one of the two trees in the pair of RTs supporting the multicast connection. In at least some embodiments, the link-coloring procedure is configured such that, for a given source node of a multicast connection in a network, links of the network are colored as red or blue such that, for any destination node of the multicast connection, the red and blue paths are guaranteed to be node disjoint (namely, red and blue trees, constructed using only the red and blue links, respectively, will form a pair of RTs independent of the underlying tree selection method). In at least some embodiments, in network topologies in which such a pair of RTs cannot be constructed due to lack of required path diversity, the link-coloring procedure may be configured to construct two multicast trees such that the red and blue paths from the source node of the multicast connection to any destination node of the multicast connection only share cut links or nodes. It is noted that, although finding an optimal pair of RTs is known to be NP-complete, extensive simulations show that various embodiments of the link-coloring procedure calculate near optimal RTs while substantially outperforming various other solutions for computing RT-pairs.
Reliable delivery of network services is dependent on the existence of a resilient network that can rapidly restore services in the event of a failure. Protection and restoration are among the fundamental building blocks in the realization of network resiliency. Such schemes have been extensively studied both in the context of wireline networks and wireless networks. Protection and restoration of multicast connections also have been studied extensively, although to a relatively lesser extent. With the increasing demand for multicast services (e.g., broadcast TV, multi-party video conferencing, content distribution, distance learning classrooms, and the like) there is an increasing need for efficient mechanisms for protection of multicast connections. One of the commonly proposed preplanned multicast recovery solutions is the redundant trees (RTs) approach. In general, given a source node (also referred to as a root node), and a set of destinations, RT-based multicast restoration is achieved by constructing two trees rooted at the source node such that the source node remains connected to each of the destinations in the event of a single link or node failure. An RT-pair is considered to be optimal if its total cost is minimal. In general, such trees are called redundant spanning trees if the destinations are all the non-source nodes in the network; otherwise, they are called redundant Steiner trees.
RT-based protection schemes are broadly applicable across many networking technologies for both multicast and unicast services. For multicast services, for example, RT-based schemes have been proposed for Synchronous Optical Network (SONET)/Synchronous Digital Hierarchy (SDH) networks, Multiprotocol Label Switching (MPLS) networks, Wavelength Division Multiplexing (WDM) networks, distributed systems, and the like. For unicast traffic, for example, RT-based schemes (e.g., in which the root of an RT-pair typically serves as the destination and each node in the network is connected with two node-disjoint paths to the root of the RT-pair) have been proposed for restoration of IP unicast traffic, wireless sensor networks, or the like.
While various solutions have been proposed for the problem of computing RTs, such solutions have a number of drawbacks. Various integer-linear programming (ILP) based heuristics have been proposed for the problem of finding optimal redundant Steiner trees (which is known to be NP-complete), however, such ILP based heuristics are computationally intensive and may produce a fractional solution that does not specify two RTs. These drawbacks prohibit the use of ILP-based methods for most practical usage. Similarly, various combinatorial algorithms have been proposed for constructing redundant spanning trees without providing any guarantee on the quality of the solution; however, such combinatorial algorithms typically model the network as an undirected graph and may not find RT-pairs for directed graphs at all, even when such solutions exist. Namely, an undirected graph model may not be expressive enough to accommodate practical networks, for instance, where the residual capacities of a link are not the same in both directions, or where a different cost metric is needed for each direction.
In at least some embodiments, a computationally-efficient link-coloring procedure is configured to support computation of near-optimal (low cost) RTs. In at least some embodiments, a computationally-efficient link-coloring procedure is configured to support computation, provisioning, and maintenance of near-optimal (low cost) RTs. In at least some embodiments, a computationally-efficient link-coloring procedure is configured to support computation of near-optimal (low cost) RTs in directed graphs. In at least some embodiments, a computationally-efficient link-coloring procedure is configured to support computation of near-optimal RTs for networks with cut nodes or cut links by partitioning the network in such a way that if some node v does not have two node or edge disjoint paths from the multicast source node, then any paths from the multicast source node to node v on the two RTs share only the necessary cut links/nodes. In at least some embodiments, a computationally-efficient link-coloring procedure is configured to maintain connectivity of RTs in the event of topology changes, e.g., gracefully handling topology changes (including addition of new destinations) without affecting existing RTs. This is in contrast to procedures using naïve construction of RT-pairs, which may prevent other nodes from joining the multicast connection without affecting the existing RTs.
In at least some embodiments, a graph-partitioning based procedure is configured for provisioning and maintaining dynamic RTs, where new destination nodes may join without affecting the existing RTs. In at least some embodiments, a graph-partitioning based procedure is configured for, given a graph representing a network topology and a multicast source node, performing link coloring that logically partitions the network into two redundant directed acyclic subgraphs (RDAGs), which are referred to herein as red and blue RDAGs. The RDAGs are constructed in such a manner that, for any given multicast connection request, red and blue trees can be independently provisioned using any tree selection method at each of the two RDAGs and together induce low cost redundant trees. This allows the flexibility of designing dynamic RTs that meet different objectives. For instance, the primary tree may be a shortest path tree for reducing end-to-end delays, while the standby tree may be a Steiner tree for efficient resource utilization.
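The independence of tree selection in the two RDAGs may be illustrated by the following minimal sketch, which builds a fewest-hop tree separately over each link subset. The three-node topology, its red and blue link sets, and the helper names bfs_tree and path are hypothetical and chosen only for illustration; any other tree selection method could be substituted for the breadth-first search.

```python
from collections import deque


def bfs_tree(root, edges):
    """Return {node: parent} for a fewest-hop tree that uses only `edges`."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in succ.get(u, []):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def path(parent, node):
    """Return the tree path from the root to `node` as a list of nodes."""
    out = []
    while node is not None:
        out.append(node)
        node = parent[node]
    return out[::-1]


# Hypothetical partition of a small network with source r.
red_links = [("r", "a"), ("a", "b")]
blue_links = [("r", "b"), ("b", "a")]

red_tree = bfs_tree("r", red_links)    # provisioned only from red links
blue_tree = bfs_tree("r", blue_links)  # provisioned only from blue links
```

In this toy case, the red and blue paths to each destination share no node other than the source and the destination itself.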
In order to describe embodiments of the link-coloring procedure, consider a communication network that supports multicast and unicast connections with a source node r. The network is modeled as a directed graph G(V,E), where each node v∈V is a network element (e.g., router, switch, or the like) and the set of edges E includes the set of directed links between the nodes. A link from node u to node v is denoted by a directed edge (u,v)∈E, where node u is called an “incoming neighbor” of node v and node v is called an “outgoing neighbor” of node u. In this network model, each directed edge e=(u,v)∈E is associated with a positive cost (denoted by ce), where the cost of (u,v) may be different from the cost of (v,u). Additionally, the following definitions consider the network as seen from the source node r of a multicast connection: (1) a non-source node u∈V−{r} is termed reachable if there is a directed path from source node r to node u; otherwise, node u is termed unreachable; (2) a reachable node u is called 2-reachable if G contains 2 node-disjoint paths from source node r to node u; otherwise, the reachable node u is called 1-reachable; (3) a node v (or link e) is termed a cut node (or cut link) of node u if its removal from G makes u unreachable from source node r; and (4) the graph G is termed strong 2-reachable with respect to source node r if every node v∈V−{r} is 2-reachable.
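The definitions above may be illustrated by a brute-force sketch that labels each non-source node. The sketch relies on the observation, which follows from Menger's theorem for simple directed graphs, that a reachable node is 2-reachable exactly when it has neither a cut node nor a cut link. The function names and the sample topology are assumptions made only for illustration, and no attempt is made at efficiency.

```python
from collections import deque


def reachable_from(r, edges, skip_node=None, skip_edge=None):
    """Return the nodes reachable from r after removing one node or one link."""
    succ = {}
    for u, v in edges:
        if (u, v) == skip_edge or skip_node in (u, v):
            continue
        succ.setdefault(u, []).append(v)
    seen = {r}
    queue = deque([r])
    while queue:
        u = queue.popleft()
        for v in succ.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def classify(nodes, edges, r):
    """Label each non-source node as unreachable, 1-reachable or 2-reachable."""
    base = reachable_from(r, edges)
    labels = {}
    for u in nodes:
        if u == r:
            continue
        if u not in base:
            labels[u] = "unreachable"
            continue
        # A cut node (cut link) of u is one whose removal disconnects u from r.
        cut_nodes = [w for w in nodes if w not in (r, u)
                     and u not in reachable_from(r, edges, skip_node=w)]
        cut_links = [e for e in edges
                     if u not in reachable_from(r, edges, skip_edge=e)]
        labels[u] = "1-reachable" if cut_nodes or cut_links else "2-reachable"
    return labels


# Hypothetical topology: c hangs off b only, and z has no incoming link.
nodes = ["r", "a", "b", "c", "z"]
edges = [("r", "a"), ("r", "b"), ("a", "b"), ("b", "a"), ("b", "c")]
labels = classify(nodes, edges, "r")
```

Here node b is a cut node of node c, so the sample graph is not strong 2-reachable with respect to r.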
Additionally, in order to describe embodiments of the link-coloring procedure, it may be instructive to consider a definition of RTs. The definition of “RTs” (denoted as Definition 1) may be expressed by the following statements: (1) consider a graph G(V,E) that may include cut nodes and cut links, source node r, and a set of destinations D⊂V−{r}; (2) let TB and TR represent two trees rooted at source node r and providing P2MP connectivity from source node r to the destinations in D; (3) let PB(r,u) (PR(r,u)) denote the path from source node r to node u∈D in TB (TR); and (4) TB and TR are referred to as RTs or an RT-pair if, for every u∈D, PB(r,u) and PR(r,u) are node disjoint or they share only cut nodes and links of node u. As previously indicated, the two trees of an RT-pair may be referred to as blue trees and red trees (although it will be appreciated that any other suitable colors or terms may be used). It is noted that, unlike existing definitions of RTs, Definition 1 above also considers the presence of cut nodes and cut links. An example depicting cut nodes and links is depicted in
Additionally, in order to describe embodiments of the link-coloring procedure, it may be instructive to consider limitations of naïve RT construction. An example is depicted in
Additionally, in order to describe embodiments of the link-coloring procedure, it may be instructive to consider use of embodiments of the link-coloring procedure within the context of embodiments of a link-coloring-based scheme for protecting multicast connections. Embodiments of a link-coloring-based scheme for protecting multicast connections logically partition a given directed graph G(V,E) representing a network of nodes and links into two partitions, including a first partition that includes all of the nodes of the graph and a first subset of links of the graph and a second partition that includes all of the nodes of the graph and a second subset of the links of the graph. The two partitions may be obtained by a link-coloring procedure that is configured to color each link as either red or blue (and in which cut links are colored both as red and blue) in a way that the two partitions form two RDAGs of directed graph G. These sub-graphs are referred to as the blue and red RDAGs, and may be represented by the graphs GB(V,EB) and GR(V,ER), respectively. It is noted that the two RDAGs satisfy Property 1, which states that: (1) for every 2-reachable node u∈V−{r} it holds that any path PR(r,u) from source node r to node u in GR is node disjoint from any path PB(r,u) from source node r to node u in GB; and (2) for every 1-reachable node u∈V−{r} it holds that any path PR(r,u) from source node r to node u in GR shares only cut nodes and cut links of node u with a path PB(r,u) from source node r to node u in GB. If there are no cut links or cut nodes in G, then (EB∩ER)=Ø; otherwise, (EB∩ER) may include only cut links. For example,
In at least some embodiments, a link-coloring procedure is configured to calculate red and blue RDAGs that satisfy Property 1 of a given graph G(V,E) with source node r. Here, it is assumed that G is strong 2-reachable; however, this assumption may be removed for some embodiments (e.g., when the graph G includes cut nodes or cut links). In at least some embodiments, the link-coloring capability is configured to calculate red and blue RDAGs that satisfy Property 1 of a given graph G(V,E) with source node r by constructing a node arrangement satisfying a set of properties and then using the node arrangement to compute two logical partitions, which form the red and blue RDAGs. An exemplary embodiment of a method for determining an RT-pair based on a node arrangement is depicted in
The construction of a node arrangement for use in computing two logical partitions may be better understood by considering the following definitions, properties, and observations.
Let L be an ordered list of the nodes v∈V of graph G, where the first and last elements in the list are r and every other element in the list uniquely represents one of the non-source nodes v∈V−{r}. The list L is termed a complete node arrangement if it contains every node v∈V and every non-source node has an incoming neighbor both before and after it in list L. In order to calculate a complete or partial node arrangement, an auxiliary data structure (referred to herein as a skeleton list) is used. Given a graph G(V,E) and r∈V, a node collection for graph G is defined as a collection L̂={U1={r}, U2, . . . , Um={r}} with three or more sets of nodes that together include all the nodes, V. The first and the last sets, U1 and Um, include only the source node r (note that only source node r is represented twice in the list L). Every other set, Uj, includes one or more non-source nodes and, further, is pairwise disjoint with every other set. Each set Uj includes a special node that is designated as the set anchor and is denoted by anchor(Uj). By default, anchor(U1)=anchor(Um)=r. Also, if u∈Uj and anchor(Uj)=v we say that anchor(u)=v.
Consider the following definition (denoted herein as Definition 2) of a skeleton list: a skeleton list is a collection L̂={U1={r}, U2, U3, . . . , Um={r}} of node sets that satisfies the following two conditions for every set Uj∈L̂, 1<j<m: (a) there is a directed path from anchor(Uj) to every other node in Uj and (b) anchor(Uj) has incoming neighbors in sets both before and after Uj in L̂.
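Definition 2 may be illustrated by a small validity check. In this sketch a skeleton list is represented as a Python list of (anchor, members) pairs, which is a representation chosen only for illustration. Condition (a) is read here as allowing the path to use any link of G, which is one possible reading of the definition, and an incoming link from source node r is counted as lying both before and after a set because r occupies both the first and the last set.

```python
def is_skeleton_list(skeleton, edges, r):
    """Check conditions (a) and (b) of Definition 2 for every interior set."""
    if len(skeleton) < 3 or skeleton[0] != (r, {r}) or skeleton[-1] != (r, {r}):
        return False
    succ, pred = {}, {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
        pred.setdefault(v, set()).add(u)
    for j in range(1, len(skeleton) - 1):
        anchor, members = skeleton[j]
        if anchor not in members:
            return False
        # (a) every node of the set is reachable from the anchor.
        seen, stack = {anchor}, [anchor]
        while stack:
            x = stack.pop()
            for y in succ.get(x, ()):
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        if not members <= seen:
            return False
        # (b) the anchor has incoming neighbors in earlier and in later sets.
        before = set().union(*(s for _, s in skeleton[:j]))
        after = set().union(*(s for _, s in skeleton[j + 1:]))
        incoming = pred.get(anchor, set())
        if not (incoming & before and incoming & after):
            return False
    return True


# Hypothetical two-link chain r -> a -> c.
chain = [("r", "a"), ("a", "c")]
grouped = [("r", {"r"}), ("a", {"a", "c"}), ("r", {"r"})]
split = [("r", {"r"}), ("a", {"a"}), ("c", {"c"}), ("r", {"r"})]
```

For the chain, the grouped list is a valid skeleton list, whereas the split list is not, since node c has no incoming neighbor in any later set.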
Consider a link (u,v)∈E with end-points in two different sets in L̂, and let Uu and Uv be the sets that contain nodes u and v, respectively. If Uu is before Uv in L̂ then (u,v) is referred to as a forward link; otherwise, (u,v) is referred to as a backward link. A link (u,v)∈E with end-points in the same set in L̂ is referred to as an internal link. For a node v∈V−{r}, a path from source node r to node v is referred to as a forward path if it includes only forward and internal links. Similarly, a path is referred to as a backward path if it includes only backward and internal links.
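The link terminology above may be illustrated by the following minimal sketch, in which a skeleton list is again represented as a list of (anchor, members) pairs. Because source node r appears in both the first and the last set, links that touch r are reported separately here rather than being classified; that choice, the helper names, and the sample skeleton list are assumptions made only for illustration.

```python
def set_positions(skeleton, r):
    """Map each non-source node to the index of the set that contains it."""
    return {n: j for j, (_, members) in enumerate(skeleton)
            for n in members if n != r}


def classify_link(u, v, position):
    """Return 'forward', 'backward' or 'internal' for a link (u, v)."""
    if u not in position or v not in position:
        return "touches-source"   # r sits at both ends of the skeleton list
    if position[u] < position[v]:
        return "forward"
    if position[u] > position[v]:
        return "backward"
    return "internal"


# Hypothetical skeleton list with sets {a, c} and {b} between the two copies of r.
skeleton = [("r", {"r"}), ("a", {"a", "c"}), ("b", {"b"}), ("r", {"r"})]
position = set_positions(skeleton, "r")
```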
Consider the following property (denoted herein as Property 2) which is satisfied by the skeleton lists: given a graph G(V,E), a source node r, and a skeleton list L̂={U1={r}, U2, . . . , Um={r}}, for every anchor v∈V−{r}, the skeleton list L̂ induces at least one forward path and one backward path from source node r to node v and every forward path from source node r to node v is node-disjoint from any backward path from source node r to node v. If every node v∈V is an anchor in the skeleton list L̂, then the skeleton list L̂ is said to be a complete node arrangement; otherwise, it defines a partial node arrangement. A partial node arrangement where every 2-reachable node is an anchor node is said to be an unrefinable partial node arrangement.
The construction of the node arrangement may be based on the following observation (denoted herein as Observation 1): given a graph G(V,E), source node r∈V, and a spanning tree T rooted at source node r, consider any subtree T̂ (≠T) of T with root node w∈V−{r}; then for any 2-reachable node u∈T̂−{w} there is a path from source node r to node u that does not include node w. Note that, if this is not the case, w is a cut node for node u, which implies that u is not 2-reachable. The construction of the node arrangement may include constructing a spanning tree rooted at source node r, creating an initial skeleton list (denoted as L̂) based on the spanning tree, and iteratively refining the skeleton list L̂ based on Observation 1 until every 2-reachable node is an anchor node. It is noted that if the given graph is strong 2-reachable then the final skeleton list provides a complete node arrangement (denoted herein as Theorem 1); otherwise, the skeleton list L̂ provides an unrefinable partial node arrangement, in which case, it can be shown that for every set U∈L̂ with two or more nodes, anchor(U) is a cut node for all of the other nodes in U.
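The three construction steps may be illustrated by the following sketch. It builds a breadth-first spanning tree (an arbitrary choice of spanning tree), creates one initial set per outgoing tree neighbor of source node r, and then repeatedly splits off a non-anchor node w that has an incoming neighbor u in a different set, placing the new set on the side of the old set that faces the set containing u. The list-of-pairs representation, the function name, the tie-breaking order, and the sample graph are assumptions made only for illustration.

```python
from collections import deque


def node_arrangement(nodes, edges, r):
    """Return a refined skeleton list as [(anchor, members), ...]."""
    succ, pred = {}, {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
        pred.setdefault(v, []).append(u)

    # Step 1: spanning tree rooted at r (breadth-first search).
    children = {n: [] for n in nodes}
    seen, queue = {r}, deque([r])
    while queue:
        x = queue.popleft()
        for y in succ.get(x, []):
            if y not in seen:
                seen.add(y)
                children[x].append(y)
                queue.append(y)

    def subtree(x, within=None):
        """Tree descendants of x (including x), optionally limited to a set."""
        out, stack = set(), [x]
        while stack:
            n = stack.pop()
            out.add(n)
            stack.extend(c for c in children[n] if within is None or c in within)
        return out

    # Step 2: initial skeleton list, one set per tree child of r.
    skeleton = [(r, {r})] + [(v, subtree(v)) for v in children[r]] + [(r, {r})]

    def index_of(n):
        return next(j for j, (_, m) in enumerate(skeleton) if n in m)

    # Step 3: refine until no non-anchor node has an outside incoming neighbor.
    changed = True
    while changed:
        changed = False
        for j in range(1, len(skeleton) - 1):
            anchor, members = skeleton[j]
            for w in sorted(members - {anchor}):
                outside = [u for u in pred.get(w, []) if u not in members]
                if not outside:
                    continue
                moved = subtree(w, within=members)
                members.difference_update(moved)
                i = index_of(outside[0])
                skeleton.insert(j if i < j else j + 1, (w, moved))
                changed = True
                break
            if changed:
                break
    return skeleton


# Hypothetical strong 2-reachable graph with source r.
nodes = ["r", "a", "b", "c"]
edges = [("r", "a"), ("r", "b"), ("a", "b"), ("b", "a"), ("a", "c"), ("b", "c")]
arrangement = node_arrangement(nodes, edges, "r")
```

For this sample graph every set of the result holds a single node, so the result is a complete node arrangement.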
In at least some embodiments, the creation of an initial skeleton list L̂ based on a spanning tree, for use in constructing the node arrangement, may be performed as follows. Here, assume that the spanning tree is a directed spanning tree T rooted at source node r. An initial skeleton list L̂ is constructed based on the spanning tree. The initial skeleton list L̂ includes |S|+2 sets, where S is the set of outgoing neighbors of source node r in the directed spanning tree T. The first and last sets of the skeleton list L̂ include only source node r. The other sets of the skeleton list L̂ may be created as follows: (1) for every node v∈S, a set Uv is created that includes all of the nodes in the sub-tree of directed spanning tree T rooted at node v, and (2) the set Uv is then inserted into the skeleton list L̂ with anchor(Uv)=v and ∀u∈Uv, anchor(u)=v. It is noted that, independent of the order in which these sets are inserted into the skeleton list L̂ between the first and last sets containing source node r, the result is a valid skeleton list L̂ by definition since the first and the last sets contain source node r and source node r is an incoming neighbor of anchor(Uj) of every other set Uj∈L̂. Therefore, skeleton list L̂ satisfies Property 2. An exemplary embodiment of a method for constructing a node arrangement for use in determining an RT-pair based on an iterative skeleton list refinement is depicted in
In at least some embodiments, the iterative refinement of the skeleton list L̂, for use in constructing the node arrangement, may be performed as follows. The following steps may be performed at each iteration. First, a set Uv with anchor(Uv)=v that includes a non-anchor node w with an incoming neighbor u in a different set is identified. Then, a new set Uw with anchor(Uw)=w is created, and node w and the spanning tree descendants of node w in set Uv are moved into new set Uw. The new set Uw is then placed in the skeleton list L̂ between the set Uv and the set including node u (which is denoted by Uu). If set Uu appears before set Uv in skeleton list L̂, then set Uw is inserted immediately before set Uv in skeleton list L̂; otherwise, set Uw is inserted immediately after set Uv in skeleton list L̂. It is noted that this continues to be a valid node arrangement since the newly inserted anchor(Uw) has incoming neighbors on both sides and they are in the sets Uu and Uv. The node u is in Uu and since node w was in Uv before this refinement, there must be some node x∈Uv such that (x,w) is a link in the initial spanning tree T. Accordingly, as discussed above, the refinement process repeatedly identifies a link (u,w)∈E such that anchor(u)≠anchor(w) and anchor(w)≠w and then adds into the skeleton list L̂ a new set with its anchor as w. It is noted that, from Observation 1 it holds that if a set U includes non-anchor 2-reachable nodes, it has such a node w. Thus, the refinement process ends when every 2-reachable node is an anchor node of a set in the skeleton list L̂. An exemplary embodiment of a method for use in performing an iterative skeleton list refinement is depicted in
In at least some embodiments, use of the node arrangement to compute two logical partitions which form the red and blue RDAGs (which also may be referred to as coloring the links of the graph for computing the red and blue RDAGs) may be performed as follows. Consider a link (u,v)∈E, where u≠r. If link (u,v) is a forward link it is colored red; otherwise, if link (u,v) is a backward link it is colored blue. The internal links are considered later. For all links (r,u)∈E where node u is an outgoing neighbor of source node r, special consideration is necessary. Since source node r is both to the left and right of node u in skeleton list L̂, link (r,u) could be either red or blue. The coloring of such a link may be performed in a manner for ensuring that the two RDAGs induce two node-disjoint paths to every outgoing neighbor node u of source node r. This can be achieved only if every such node u has another non-source incoming neighbor with a link of a different color. Accordingly, the coloring of such a link may be performed as follows: (1) if all the incoming non-source neighbors of node u appear after (before) node u in skeleton list L̂, then the link (r,u) is colored red (blue) and added to the red (blue) RDAG, (2) if node u does not have any other incoming neighbor besides source node r, then link (r,u) is a cut link and, thus, is colored both red and blue and added to both of the RDAGs, or (3) in other cases, the link (r,u) may be added to one of the two RDAGs based on suitable criteria (e.g., balancing the number of red and blue outgoing links from the source node r or any other suitable criteria). As a result of Property 2, for any given strong 2-reachable graph G(V,E), the above procedure computes red and blue RDAGs that satisfy Property 1.
The red and blue RDAGs ensure that, for every node v∈V−{r}, the red (blue) RDAG includes all of the possible forward (backward) paths from source node r to node v and any pair of forward and backward paths from source node r to node v are node disjoint. As an example,
At step 1301, method 1300 begins.
At step 1305, a (next) link is selected for coloring. The link is selected from the set of links of the graph not previously selected for coloring within the context of method 1300.
At step 1310, a determination is made as to whether the selected link originates at the multicast source node (i.e., is a link to an outgoing neighbor of the multicast source node). If the selected link does not originate at the multicast source node, method 1300 proceeds to step 1315. If the selected link originates at the multicast source node, method 1300 proceeds to step 1330.
At step 1315, a determination is made as to whether the selected link is a forward link or a backward link. If the selected link is a forward link, method 1300 proceeds to step 1320, at which point the link is colored the first color associated with the first RT. If the selected link is a backward link, method 1300 proceeds to step 1325, at which point the link is colored the second color associated with the second RT. From steps 1320 and 1325, method 1300 proceeds to step 1335.
At step 1330, the link is colored either the first color or the second color in a manner tending to ensure that the first RT and the second RT induce two node-disjoint paths to each outgoing neighbor of the multicast source node. For example, the link may be colored the first color based on a determination that all incoming non-source neighbor nodes of the outgoing neighbor node appear after the outgoing neighbor node in the node arrangement, or may be colored the second color otherwise. For example, the link may be colored both the first color and the second color based on a determination that the only incoming neighbor of the outgoing neighbor node is the multicast source node. For example, the link may be colored the first color or the second color based on one or more criteria. From step 1330, method 1300 proceeds to step 1335.
At step 1335, a determination is made as to whether the final link has been selected for coloring. If the final link has not been selected for coloring, method 1300 returns to step 1305, at which point a next link is selected for coloring. If the final link has been selected for coloring, method 1300 proceeds to step 1399, at which point method 1300 ends.
At step 1399, method 1300 ends.
It will be appreciated that, although primarily depicted and described with respect to embodiments in which links are selected for coloring without regard for whether the links originate at the multicast source node, in at least some embodiments, links may be processed as two groups based on whether the links originate at the multicast source node (e.g., performing steps 1315, 1320, and 1325 for each link that does not originate at the multicast source node; performing step 1330 for each link that originates at the multicast source node).
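By way of a non-limiting illustration, the link coloring of method 1300, processed as two groups of links, may be sketched as follows. The function name color_links, the pos mapping (position of each non-source node in a complete node arrangement, with source node r implicitly at both ends), and the balancing rule used for the "other cases" are assumptions made for this sketch; internal links are not handled here.

```python
def color_links(source, links, pos):
    """Return {link: set of colors} for a complete node arrangement."""
    # Non-source incoming neighbors of every node.
    incoming = {}
    for u, v in links:
        if u != source:
            incoming.setdefault(v, []).append(u)

    colors, source_links = {}, []
    for u, v in links:
        if v == source:
            continue  # links into the source carry no multicast traffic
        if u == source:
            source_links.append((u, v))  # handled in the second group
        else:
            # Forward links are red, backward links are blue.
            colors[(u, v)] = {"red"} if pos[u] < pos[v] else {"blue"}

    for link in source_links:
        v = link[1]
        others = incoming.get(v, [])
        if not others:
            colors[link] = {"red", "blue"}  # cut link: in both RDAGs
        elif all(pos[x] > pos[v] for x in others):
            colors[link] = {"red"}  # all other incoming links are blue
        elif all(pos[x] < pos[v] for x in others):
            colors[link] = {"blue"}  # all other incoming links are red
        else:
            # Assumed criterion: balance red and blue links leaving the source.
            red = sum(1 for l in source_links if colors.get(l) == {"red"})
            blue = sum(1 for l in source_links if colors.get(l) == {"blue"})
            colors[link] = {"red"} if red <= blue else {"blue"}
    return colors
```

For example, with the arrangement r, a, c, b, d, r, calling color_links("r", links, {"a": 1, "c": 2, "b": 3, "d": 4}) colors link (a,c) red and link (b,c) blue, so node c is reached by one link of each color.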
As described above, various embodiments of the link-coloring procedure may be configured to support handling of a graph including cut nodes or cut links. Here, the link-coloring procedure may be configured to determine two logical partitions that satisfy Property 1. Consider such a graph G(V,E) with source node r and let L̂ be a refined skeleton list of G after the link-coloring procedure as discussed above (namely, link coloring without accounting for cut nodes or cut links) has been applied. If the refined skeleton list L̂ is a complete node arrangement, then G is strong 2-reachable (without cut nodes or cut links) and no further processing is needed; otherwise, refined skeleton list L̂ defines an unrefinable partial node ordering and there exist some internal links that are not yet colored. It is noted that every 2-reachable node u∈V−{r} is an anchor node in the refined skeleton list L̂. In addition, from Property 2 it follows that every anchor node u∈V−{r} has both forward and backward paths from source node r, and every pair of forward and backward paths from source node r to node u are node-disjoint. Since all of the forward and backward links are already colored red and blue, respectively, calculation of the red and blue partitions may be finalized by coloring the internal links (which have not yet been colored) in a way that preserves Property 1. In other words, every node u∈V−{r} should have both forward and backward paths composed only of red or blue links (no internal links), respectively. If node u is 2-reachable then every pair of forward and backward paths of node u are node-disjoint; otherwise, the forward and backward paths of node u are only allowed to share cut nodes and cut links of node u.
The coloring of internal links is based on the following observation (denoted as Observation 2): considering a set U∈L̂ with two or more nodes and anchor node anchor(U)=w and recalling that every node v∈U−{w} is 1-reachable where node w is its cut node, then every path from source node r to any node v∈U−{w} passes through node w. Therefore, (a) there is no link (x,v)∈E such that x∉U (i.e., every incoming link of a node v∈U−{w} is an internal link) and (b) as a result of (a), any path from node w to any node in U−{w} includes only nodes in U. Thus, in order to color the internal links, the following condition must be satisfied: there are red and blue paths from node w to every node v∈U−{w}, and every red path (PR(w,v)) and any blue path (PB(w,v)) may share only cut nodes/links of node v in the set U. It is noted that this condition is equivalent to Property 1 for the graph H(U,EU) including all of the nodes in U and their internal links (where anchor(U) is the source). Thus, the link-coloring procedure as discussed above (namely, link coloring without accounting for cut nodes or cut links) can be applied recursively and the recursion is guaranteed to terminate since the depth of recursion can be, at most, |U|−1. It is noted that, in addition to handling of cut nodes, any cut link (u,v), u≠r also may be handled since, in such case, node u is also a cut node.
As described above, various embodiments of the link-coloring procedure may be configured to support handling of topology changes (e.g., a failure of a node or link, an addition of a node or link, or the like). In the case of a topology change, it is desirable to minimize the impact of the topology change on the RT-pairs of active multicast connections. Accordingly, in at least some embodiments, the link-coloring procedure may be configured to modify the red and blue RDAGs for maintaining Property 1 in a manner for minimizing the number of links for which the associated link color is changed. The handling of topology changes may be performed using a re-partitioning process that is configured to modify the red and blue RDAGs for maintaining Property 1 in a manner for minimizing the number of links for which the associated link color is changed. It is noted that a link changes its color only when the nodes which form the two endpoints of the link change their relative order in the node ordering L. Thus, the repartitioning procedure may be configured to minimize the number of neighboring nodes that change their relative order in the node ordering L. For purpose of simplifying the description of handling of topology changes, consider a single topology change event at a time.
In at least some embodiments, the repartitioning procedure may be configured to handle addition of a node or a link. Typically, adding a new element does not affect an existing RT-pair, unless the RT-pair includes cut nodes or cut links. In the case in which the RT-pair includes cut nodes or cut links, the new element may provide alternative paths that bypass some of these elements.
In at least some embodiments, the repartitioning procedure may be configured to handle addition of a node or a link in a strong 2-reachable graph. For a strong 2-reachable graph, the addition of an element does not cause a color change to any existing link; rather, a new link is colored either red or blue based on whether it is a forward or a backward link, respectively, as defined by L. A new node with two or more incoming neighbors is inserted between its incoming neighbors in L and its links are colored according to its location in L. If the new node has a single incoming link, then this link is a cut link and it is inserted into both RDAGs. In that case, the node is inserted into the set U∈L̂ of its incoming neighbor.
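By way of a non-limiting illustration, the handling of an added link or an added node in a strong 2-reachable graph may be sketched as follows. The function names, the representation of node ordering L as a Python list of non-source nodes (with source node r implicitly at both ends), and the choice to place a new node immediately after its leftmost incoming neighbor are assumptions made for this sketch; any position between the incoming neighbors would satisfy the description above.

```python
def color_new_link(order, u, v):
    """Color of a new link (u, v) between existing non-source nodes."""
    # Forward links are red, backward links are blue; no existing link changes.
    return "red" if order.index(u) < order.index(v) else "blue"


def insert_new_node(order, source, node, preds):
    """Insert node into order and return the colors of its incoming links."""
    if len(preds) < 2:
        # A single incoming link is a cut link and belongs to both RDAGs.
        raise ValueError("node with a single incoming link is not handled here")
    # Assumption: the source is treated as sitting at position -1 (left end).
    left = min(-1 if p == source else order.index(p) for p in preds)
    at = left + 1
    order.insert(at, node)  # immediately after the leftmost incoming neighbor

    colors = {}
    for p in preds:
        if p == source:
            # Every other incoming neighbor is to the right (blue), so use red.
            colors[(p, node)] = "red"
        else:
            colors[(p, node)] = "red" if order.index(p) < at else "blue"
    return colors
```

With this placement the new node always receives exactly one red incoming link (from its leftmost incoming neighbor) and blue links from all of its other incoming neighbors.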
In at least some embodiments, the repartitioning process may be configured to handle addition of a node or a link in a 1-reachable graph. It is noted that special treatment may be required when the graph includes cut elements. Consider a 1-reachable node, say node w, with a new incoming edge and let G denote the graph before adding the new edge. Let node u be the closest cut node of node w in any path from source node r to node w in graph G, and let Tu be the sub-tree of T (the original spanning tree of G) rooted at node u and including all of the 1-reachable nodes descending from node u. In order to recalculate the RDAGs, the sub-tree Tu is assigned to the set Uj with anchor node u and the skeleton list refinement procedure is again invoked. After this procedure, some of the nodes in sub-tree Tu may be detected as 2-reachable nodes and node u ceases to be a cut node for those nodes. It is noted that various embodiments of the repartitioning process that may be configured to handle addition of a node or a link may be better understood by way of the example depicted in
In at least some embodiments, the repartitioning process may be configured to handle failure of a node or a link. It is noted that, after recalculating the RDAGs, the color of some links may have changed. Thus, a tree of a given RT-pair may include both red and blue links. This does not create any concern if the failed element is not included in the RT-pair and all the destination nodes are still reachable on both trees. Otherwise, the paths for affected destinations are rerouted for providing maximal protection to these destinations. In this case, paths that include both red and blue links also may be rerouted. As indicated above, removal of a node or a link does not necessarily change the RDAGs, if each non-source node has both red and blue incoming edges. If some nodes have incoming links (two or more) of only a single color (either blue or red), their location in L is modified for maintaining Property 1. Such nodes are referred to herein as affected nodes. Assume that after a topology change some nodes are affected. Let node w be the right (left) most affected node with only red (blue) incoming links. Node w is moved forward (backward) in the node ordering to ensure that it has both red and blue incoming links. In order to minimize the number of link color changes, the closest incoming neighbor of node w in the skeleton list L̂ is found (e.g., node u in the set Uj∈L̂) and node w (and potentially all of the other nodes in the set of node w) is added as a descendant of node u to the set Uj. It is noted that, after this operation, the color of some of the outgoing links of node w may change and, thus, some other node may be affected. This process is repeated until there are no more affected anchor nodes with all their incoming links of the same color. After this set consolidation process, the skeleton list refinement procedure is executed in order to recalculate the new positions of the affected nodes.
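By way of a non-limiting illustration, the identification of affected nodes after a failure may be sketched as follows. The function name affected_nodes and the representation of the coloring as a mapping from each surviving link to its set of colors are assumptions made for this sketch; the subsequent set consolidation and refinement steps are not shown.

```python
def affected_nodes(source, colors):
    """Return {node: color} for nodes whose incoming links all share one color.

    colors maps each surviving link (u, v) to a set of colors, e.g. {"red"}.
    Only nodes with two or more incoming links are reported, as described above.
    """
    incoming = {}
    for (u, v), link_colors in colors.items():
        incoming.setdefault(v, []).append(link_colors)

    result = {}
    for v, color_sets in incoming.items():
        if v == source or len(color_sets) < 2:
            continue
        seen = set().union(*color_sets)
        if len(seen) == 1:
            result[v] = next(iter(seen))  # the single color still reaching v
    return result
```

A failed link may be modeled by deleting its entry from the mapping before calling the function; a failed node may be modeled by deleting every link that starts or ends at that node.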
Since each affected node is inserted into the closest set in L̂ in which it has an incoming neighbor, only one of its incoming links should change its color. This ensures minimal modifications to the RDAGs. It is noted that various embodiments of the repartitioning process that may be configured to handle failure of a node or a link may be better understood by way of the example depicted in
Various embodiments depicted and described herein may be used for constructing or modifying an RT-pair. An exemplary embodiment of a method for constructing an RT-pair is depicted and described in
At step 1701, method 1700 begins.
At step 1710, a graph is received. The graph represents a topology of at least a portion of a network. The graph includes a set of nodes and a set of links, where the set of nodes includes a multicast source node. The graph may be a directed graph.
At step 1720, the graph is partitioned into a pair of partitions including a first partition and a second partition. The first partition includes each of the nodes of the set of nodes and a first subset of links of the set of links. The second partition includes each of the nodes of the set of nodes and a second subset of links of the set of links. The first and second partitions are redundant directed acyclic graphs (RDAGs). The partitioning of the graph into the first and second partitions may be performed as depicted and described herein with respect to
At step 1730, a pair of redundant trees (RTs) is constructed based on the pair of partitions. The pair of RTs includes a first RT and a second RT. The first RT is constructed based on the first partition and the second RT is constructed based on the second partition. The construction of the first RT and the second RT may be performed independent of each other. The construction of the first RT based on the first partition may be performed using any suitable tree construction mechanism and, similarly, the construction of the second RT based on the second partition may be performed using any suitable tree construction mechanism. The construction of the first RT and the second RT may be performed using the same tree construction mechanism for the first and second RTs or using different tree construction mechanisms for the first and second RTs. The first and second RTs may be P2MP trees. The first and second RTs may be VSTs. The first and second RTs may be any other suitable types of redundant trees.
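By way of a non-limiting illustration, step 1730 may be sketched as follows, using a breadth-first search as the tree construction mechanism for both partitions. The function name build_tree and the use of breadth-first search are assumptions made for this sketch; as noted above, any suitable tree construction mechanism may be used.

```python
from collections import deque


def build_tree(source, partition_links, destinations):
    """Return the set of links of a tree from source to the destinations.

    partition_links holds only the links of one partition (one RDAG).
    """
    adjacency = {}
    for u, v in partition_links:
        adjacency.setdefault(u, []).append(v)

    # Breadth-first search restricted to the links of this partition.
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, []):
            if v not in parent:
                parent[v] = u
                queue.append(v)

    # Keep only the branches that lead to a requested destination.
    tree = set()
    for d in destinations:
        if d not in parent:
            raise ValueError(f"destination {d!r} is unreachable in this partition")
        while parent[d] is not None and (parent[d], d) not in tree:
            tree.add((parent[d], d))
            d = parent[d]
    return tree
```

Calling build_tree once with the links of the first partition and once with the links of the second partition yields the first RT and the second RT, respectively; the two calls are independent of each other.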
At step 1799, method 1700 ends. It will be appreciated that, although method 1700 is depicted and described as ending (for purposes of clarity), various other functions may be performed based on the pair of RTs (e.g., provisioning the RTs in a network for transport of traffic, modification of the RTs based on topology changes in a network, or the like, as well as various combinations thereof).
The computer 1800 includes a processor 1802 (e.g., a central processing unit (CPU) and/or other suitable processor(s)) and a memory 1804 (e.g., random access memory (RAM), read only memory (ROM), and the like).
The computer 1800 also may include a cooperating module/process 1805. The cooperating process 1805 can be loaded into memory 1804 and executed by the processor 1802 to implement functions as discussed herein and, thus, cooperating process 1805 (including associated data structures) can be stored on a computer-readable storage medium, e.g., RAM, a magnetic or optical drive or diskette, and the like.
The computer 1800 also may include one or more input/output devices 1806 (e.g., a user input device (such as a keyboard, a keypad, a mouse, and the like), a user output device (such as a display, a speaker, and the like), an input port, an output port, a receiver, a transmitter, one or more storage devices (e.g., a tape drive, a floppy drive, a hard disk drive, a compact disk drive, and the like), or the like, as well as various combinations thereof).
It will be appreciated that computer 1800 depicted in
It will be appreciated that the functions depicted and described herein may be implemented in software (e.g., via implementation of software on one or more processors, for executing on a general purpose computer (e.g., via execution by one or more processors) so as to implement a special purpose computer, and the like) and/or may be implemented in hardware (e.g., using a general purpose computer, one or more application specific integrated circuits (ASIC), and/or any other hardware equivalents).
It will be appreciated that some of the steps discussed herein as software methods may be implemented within hardware, for example, as circuitry that cooperates with the processor to perform various method steps. Portions of the functions/elements described herein may be implemented as a computer program product wherein computer instructions, when processed by a computer, adapt the operation of the computer such that the methods and/or techniques described herein are invoked or otherwise provided. Instructions for invoking the inventive methods may be stored in fixed or removable media, transmitted via a data stream in a broadcast or other signal bearing medium, and/or stored within a memory within a computing device operating according to the instructions.
It will be appreciated that the term “or” as used herein refers to a non-exclusive “or,” unless otherwise indicated (e.g., use of “or else” or “or in the alternative”).
It will be appreciated that, although various embodiments which incorporate the teachings presented herein have been shown and described in detail herein, those skilled in the art can readily devise many other varied embodiments that still incorporate these teachings.
This application claims the benefit of U.S. Provisional Patent Application Ser. No. 61/843,401, entitled “FAULT-RESILIENT BROADCAST, MULTICAST, AND UNICAST SERVICES,” filed Jul. 7, 2013, which is hereby incorporated herein by reference in its entirety.
Number | Date | Country
---|---|---
61843401 | Jul 2013 | US