Various exemplary embodiments disclosed herein relate generally to computer networking, and more particularly to internet routing.
Traditional routing in Internet Protocol (IP) networks is often along shortest paths using link weight as the metric. It has been observed that under some traffic conditions, shortest path routing may lead to congestion on some links in the network while capacity may be available elsewhere in the network. Segment Routing is a new Internet Engineering Task Force (IETF) protocol to address this problem. The key idea in segment routing is to break up the routing path into segments in order to enable better network utilization. Segment routing may also enable finer control of the routing paths. It may also be used to route traffic through middle boxes.
A brief summary of various exemplary embodiments is presented. Some simplifications and omissions may be made in the following summary, which is intended to highlight and introduce some aspects of the various exemplary embodiments, but not to limit the scope of the invention. Detailed descriptions of a preferred exemplary embodiment adequate to allow those of ordinary skill in the art to make and use the inventive concepts will follow in later sections.
Various exemplary embodiments are described including a method of routing a total amount of traffic, tij, from a source node i to a destination node j, the method including setting an amount of traffic in one iteration; finding a length for each link e between source node i and destination node j; finding a best intermediate node k; and sending a flow from source node i to destination node j through intermediate node k.
Various exemplary embodiments are described including a routing device used for routing a total amount of traffic, tij, from a source node i to a destination node j, the device including a memory; and a processor configured to: set an amount of traffic in one iteration; find a length for each link e between source node i and destination node j; find a best intermediate node k; and send a flow from source node i to destination node j through intermediate node k.
In order to better understand various exemplary embodiments, reference is made to the accompanying drawings, wherein:
To facilitate understanding, identical reference numerals have been used to designate elements having substantially the same or similar structure or substantially the same or similar function.
The description and drawings merely illustrate the principles of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within its scope. Furthermore, all examples recited herein are principally intended expressly to be only for pedagogical purposes to aid the reader in understanding the principles of the invention and the concepts contributed by the inventor(s) to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Additionally, the term, “or,” as used herein, refers to a non-exclusive or (i.e., and/or), unless otherwise indicated (e.g., “or else” or “or in the alternative”). Also, the various embodiments described herein are not necessarily mutually exclusive, as some embodiments can be combined with one or more other embodiments to form new embodiments.
Segment routing is a new proposed routing mechanism for simplified and flexible path control in IP/MPLS (Multiprotocol Label Switching) networks. Segment routing builds on existing network routing and connection management protocols and one of its important features is the automatic rerouting of connections upon failure. Re-routing can be done with available restoration mechanisms including Interior Gateway Protocol (IGP)-based rerouting and fast reroute with loop-free alternates. This may be particularly attractive for use in Software Defined Networks (SDN) because the central controller may need only be involved at connection set-up time and failures may be handled automatically in a distributed manner. A significant challenge in restoration optimization in segment routed networks is the centralized determination of connections' primary paths so as to enable the best sharing of restoration bandwidth over non-simultaneous network failures. One may formulate this problem as a linear programming problem and develop an efficient primal-dual algorithm for the solution. One may also develop a simple randomized rounding scheme for cases when there are additional constraints on segment routing. One may demonstrate the significant capacity benefits achievable from this optimized restoration with segment routing.
Segment Routing is envisaged to make possible simplified flexible connection routing in IP/MPLS networks building largely on features of existing network protocols. The main idea in segment routing is to use a sequence of segments to compose the desired end-to-end connection path. The path between each segment's end points is determined by a conventional routing protocol like Open Shortest Path First (OSPF). The segment labels are carried in the packet header and so per flow state is maintained only at the ingress node. A segment label is like an MPLS label and traditional push, pop, swap actions can be applied to it by the routers on the segment path. Segment routing permits finer control of the routing paths and so can be used to distribute traffic for better network utilization. A central controller can exploit the full potential of segment routing by choosing segments based on the traffic pattern to judiciously distribute traffic in the network and avoid local hot-spots. This central control function can be performed by a path computation element or, in the case of a Software Defined Network (SDN), by the SDN controller. There has been some recent work on determining the optimal segment routed paths for improving network bandwidth utilization. While the SDN controller can set up the segments based on measured or predicted traffic, it is not necessarily desirable to involve the controller when there are network failures. One of the key features that segment routing offers is that each segment is routed by the IGP routing protocol and the failure recovery mechanisms of the IGP routing protocol can be used to recover from failures in a distributed manner. Thus an SDN controlled segment routed network can combine the efficiency of centralized control with the fast scalable response to failures that a distributed routing mechanism offers. This distributed restoration assumes that there are sufficient resources in the network to route around network failures.
An alternative to an SDN controller is a centralized Path Computation Element (PCE) that plays the same role as an SDN controller. The problem that embodiments address includes how to configure the initial segments such that there are sufficient network resources available for rerouting traffic when there are network failures. One may first address the most common practical system, in which routing is on a single shortest path and the network has to recover from single link failures. One may show how to generalize the approach to handle the case where routing is along Equal Cost Multipaths (ECMP) and the network has to recover from Shared Risk Link Group (SRLG) failures where multiple links can fail at the same time. The key to restoration planning is to share the restoration bandwidth efficiently among independent failures. Embodiments include:
Each of network equipment 105-140 may be connected to an adjacent piece of network equipment 105-140 as pictured. It will be apparent that any network topology may be used, including ring, mesh, star, fully connected, bus, tree and line, for example. It will be apparent that fewer or additional pieces of network equipment may exist within exemplary network environment 100. In various exemplary embodiments, network equipment 105-140 may be geographically distributed; for example, network equipment 110, 125, and 130 may be located in Washington, D.C.; Seattle, Wash.; and Tokyo, Japan, respectively. Each piece of network equipment 105-140 may include hardware or software resources for networking including routing capabilities.
Outline of Segment Routing
When there are no segment identifiers, then packets are routed along shortest paths as in standard IGP routing protocols. The other extreme is when each hop is specified in the packet header and this resembles explicit path routing. This fine grained control of the routing path enables the easy deployment of network functions like service chaining where the packet has to pass through a set of middle boxes when it goes from the source to destination. Segment routing can also be used for steering traffic to avoid hot spots in the network and hence improve network utilization. There are two basic types of segments: node and adjacency. A node segment identifies a router node. Node segment IDs are globally unique across the domain. An adjacency segment represents a local interface of a node. Adjacency segment IDs are typically only locally significant on each node. The MPLS data plane can be leveraged to implement segment routing essentially without modification since the same label switching mechanism can be used. Segment labels are distributed across the network using simple extensions to current IGP protocols and hence Label Distribution Protocol (LDP) and Resource Reservation Protocol—Traffic Engineering (RSVP-TE) are no longer required for distributing labels. As a result, the control plane can be significantly simplified. Moreover, unlike MPLS, there is no need to maintain path state in segment routing except on the ingress node, because packets are now routed based on the list of segments they carry. The ingress node should be modified since it needs to determine the path and add the segment labels to the packet. For traffic planning problems where the objective is to route traffic so that no link is overloaded, it is generally enough to consider segment routes with just two segments.
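The forwarding behavior described above can be illustrated with a minimal sketch. It assumes a hypothetical next-hop table `NEXT_HOP` standing in for the IGP shortest-path computation, and it models node segments only; the function name `forward` and the four-node example are illustrative and not taken from the embodiments.

```python
def forward(next_hop, src, segments):
    """Return the nodes visited when a packet at src carries the given node segments."""
    node = src
    visited = [node]
    stack = list(segments)          # stack[0] is the active (top) segment
    while stack:
        if node == stack[0]:
            stack.pop(0)            # active segment reached: pop its label
            continue
        node = next_hop[(node, stack[0])]   # ordinary shortest-path forwarding
        visited.append(node)
    return visited

# Hypothetical IGP next-hop table: NEXT_HOP[(router, destination)] -> neighbor.
NEXT_HOP = {
    ("A", "D"): "B", ("B", "D"): "D",
    ("A", "C"): "B", ("B", "C"): "C",
    ("C", "D"): "D",
}

direct = forward(NEXT_HOP, "A", ["D"])        # one segment: the plain shortest path
via_c = forward(NEXT_HOP, "A", ["C", "D"])    # two segments: detour through node C
```

With a single segment the packet follows the IGP shortest path; adding the intermediate node segment C steers it off link B-D without any per-flow state on routers B or C.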
System Model
Segment Routing and Restoration
A major advantage that segment routing offers compared to explicit path routing is that when there are failures in the network, the IGP protocol may recompute the shortest path. Therefore the segments are repaired when there are failures in the network without any intervention. This may be useful, even in SDN networks, since the central controller then does not have to reroute the potentially large number of connections that may have to be rerouted with strict time constraints when there are network failures.
Restoration Requirements
Typical failures considered in optical network restoration, IP/MPLS restoration, and optimization include:
In the case of SRLG failures, multiple links can fail together. SRLG models networks where several logical links share the same physical infrastructure. Since the failure scenarios are independent, in each of these cases, there is potential to share restoration bandwidth among the independent failure scenarios. The single link failure case is of interest in practice, and it makes the description of the algorithm simpler. SRLG failures are more complex and subsume node failures.
Mathematical Model
A network may be represented by a graph G=(N,E), where the nodes are the routers connected by directed links. Link e has an IGP link weight w(e) and capacity c(e). One may use n to represent the number of nodes in the network and m to represent the number of links. The network is not assumed to be symmetric. The aggregate amount of traffic between nodes i and j is denoted by tij. Traffic between nodes i and j can be split across multiple paths between i and j. One may assume that this split is flow based. In other words, one may assume that the source node splits the traffic using a hashing scheme that ensures that all packets belonging to the same flow are routed on the same path (thus maintaining packet ordering). One may assume that individual flows are relatively small compared to the total link capacity. This ensures that traffic can be spread arbitrarily between different paths. Assume that the link weights are fixed and all routing is along shortest paths using this link weight as the metric. Let Sij denote the set of links on the shortest path from i to j. Note that when there are multiple shortest paths between nodes, then the network can split traffic across these equal cost paths. Initially, one may assume that there is a unique shortest path between each pair of nodes. This is done simply to describe the algorithm. One may show how to extend the restoration algorithm for the case where ECMP is used.
Restoration Planning Problem for Single Link Failures
In some embodiments, one may assume that all flows in the network have to be protected against single link failures. All the IGP link weights are assumed to be given. This implies that the shortest path route between any pair of nodes may be fixed. If there are several alternate shortest paths, then one may assume that one of these paths may be used for routing. Additional embodiments consider the case where equal cost multi-path may be used to route packets. One may denote the set of links in the shortest path between nodes i and j as Sij. If some link f∈Sij fails, then the nodes in the network will recompute the shortest path after eliminating link f and packets will be routed along this new shortest path. Let Bij(f) represent the set of links in the shortest path between nodes i and j when link f fails. Note that some of the links in Sij might be contained in Bij(f). One may use Nij(f)=Bij(f)\Sij to denote the set of new links on which there will be (ij) traffic flow when link f fails.
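The sets Sij, Bij(f) and Nij(f) can be computed with two shortest-path runs, one on the intact network and one with link f removed. The following is a minimal sketch under the unique-shortest-path assumption; the names `shortest_path`, `new_links` and `WEIGHTS` are hypothetical, and the small weighted graph is an invented example.

```python
import heapq

def shortest_path(weights, src, dst, failed=frozenset()):
    """Dijkstra over directed links; returns the links on a least-weight path or None."""
    adj = {}
    for (u, v), w in weights.items():
        if (u, v) not in failed:
            adj.setdefault(u, []).append((v, w))
    dist, prev, heap = {src: 0}, {}, [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        if u == dst:
            break
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    if dst not in dist:
        return None
    path, node = [], dst
    while node != src:
        path.append((prev[node], node))
        node = prev[node]
    return path[::-1]

def new_links(weights, i, j, f):
    """N_ij(f) = B_ij(f) \\ S_ij, or None if i cannot reach j once f has failed."""
    s_ij = shortest_path(weights, i, j)
    b_ij = shortest_path(weights, i, j, failed={f})
    if s_ij is None or b_ij is None:
        return None
    return set(b_ij) - set(s_ij)

# Invented example: IGP weights on directed links.
WEIGHTS = {("A", "B"): 1, ("B", "D"): 1, ("B", "C"): 1, ("C", "D"): 2, ("A", "C"): 3}
```

In this example the shortest path from A to D is A-B-D; if link (B, D) fails the traffic moves to A-B-C-D, so link (A, B) is reused and only (B, C) and (C, D) are new.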
Computing the Link Load
The traffic on a link can be split into primary traffic and restoration traffic. Primary traffic is the amount of traffic on the link as a result of routing flows on the link when there are no failures in the network. Restoration traffic is the traffic that flows on a link due to some failure in the network. Let P(e) denote the primary flow on link e and R(e,f) denote the restoration flow on link e when link f fails. Let xijk denote the amount of traffic from i to j that is routed through intermediate node k. If link e∈Sik or e∈Skj, then there will be a traffic of xijk that will flow on link e. Therefore, the total amount of primary traffic P(e) on link e will be
The amount of bandwidth reservation for restoration traffic on link e should equal the maximum amount of flow that can result on link e due to the failure of link f in the network. This may be due to the fact that link failures are independent and one may only need to have enough bandwidth to carry traffic in the worst case. In
When e∈Nik(f)∪Nkj(f) and link f fails, then there will be a flow of xijk on link e. Note that if e∈Bik(f)∩Sik or e∈Bkj(f)∩Skj then it is already carrying a flow of xijk that may be routed from i to j through node k (before any failures). Therefore there may not be any additional traffic on link e if link f fails. The amount of restoration traffic R(e,f) on link e if f fails may be given by:
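As a hedged illustration of the two load definitions, the sketch below accumulates P(e) and R(e,f) from a given set of two-segment flows and then evaluates the worst-case utilization P(e)+max over f of R(e,f), divided by c(e). It assumes the path sets are precomputed and passed in as dictionaries: `S[(i, j)]` for Sij, `N[((i, j), f)]` for Nij(f), and `x[(i, j, k)]` for xijk. All names and the numeric example are hypothetical.

```python
from collections import defaultdict

def link_loads(S, N, x):
    """Return (P, R): primary load per link and restoration load per (link, failure)."""
    P = defaultdict(float)
    R = defaultdict(float)
    for (i, j, k), flow in x.items():
        # Primary traffic: every link on the i->k or k->j shortest path carries x_ijk.
        for e in S.get((i, k), set()) | S.get((k, j), set()):
            P[e] += flow
        # Restoration traffic: links that are new for either segment when f fails.
        per_failure = defaultdict(set)
        for (pair, f), links in N.items():
            if pair in ((i, k), (k, j)):
                per_failure[f] |= links
        for f, links in per_failure.items():
            for e in links:
                R[(e, f)] += flow
    return P, R

def max_utilization(P, R, capacity):
    """Largest (P(e) + max_f R(e, f)) / c(e) over all links."""
    worst = defaultdict(float)
    for (e, f), load in R.items():
        worst[e] = max(worst[e], load)
    return max((P.get(e, 0.0) + worst.get(e, 0.0)) / c for e, c in capacity.items())

# Invented example: 10 units from A to D, 6 sent directly (k = D) and 4 via C.
AB, BD, BC, CD, AC, CB = ("A","B"), ("B","D"), ("B","C"), ("C","D"), ("A","C"), ("C","B")
S = {("A", "D"): {AB, BD}, ("A", "C"): {AB, BC}, ("C", "D"): {CD}}
N = {
    (("A", "D"), AB): {AC, CD}, (("A", "D"), BD): {BC, CD},
    (("A", "C"), AB): {AC},     (("A", "C"), BC): {AC},
    (("C", "D"), CD): {CB, BD},
}
x = {("A", "D", "D"): 6.0, ("A", "D", "C"): 4.0}
P, R = link_loads(S, N, x)
utilization = max_utilization(P, R, {e: 20.0 for e in (AB, BD, BC, CD, AC, CB)})
```

Because only the maximum over failures is reserved on each link, link (C, D) needs 6 units of restoration bandwidth here rather than the 12 it would need if the failures of (A, B) and (B, D) were provisioned separately.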
Since the link failures are independent, one may need to ensure that P(e)+maxf R(e,f)≤c(e) for all links e. The routing objective is to find a set of segments such that the maximum link load under any failure scenario is as small as possible. One may formulate the problem as determining xijk such that the maximum link utilization is minimized. In other words, the objective is to minimize φ such that P(e)+maxf R(e,f)≤φ c(e). Instead of formulating this problem directly, one may introduce a new variable
Instead of scaling the link capacity, one may scale up the traffic by a factor of λ. The inverse of the maximum link utilization is called the throughput. One objective is to maximize the throughput λ such that, when the traffic matrix is scaled up by λ, the resultant primary and restoration traffic still fits in the network. The resultant maximum link utilization is 1/λ. This formulation, which resembles the maximum concurrent flow problem, permits a simple fully polynomial time approximation scheme. Towards this end, the restoration planning problem for single link failures may be written as the following linear program:
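The program itself is not reproduced in this text. One formulation consistent with the definitions of P(e), R(e,f) and λ given above is sketched here; the exact form, including the use of an inequality in the demand constraint, is an assumption.

```latex
\begin{align*}
\max\ & \lambda \\
\text{s.t.}\
& \sum_{i,j}\ \sum_{k:\, e \in S_{ik} \cup S_{kj}} x_{ijk}
  + \sum_{i,j}\ \sum_{k:\, e \in N_{ik}(f) \cup N_{kj}(f)} x_{ijk} \le c(e)
  && \forall e, f \quad (1) \\
& \sum_{k} x_{ijk} \ge \lambda\, t_{ij} && \forall i, j \quad (2) \\
& x_{ijk} \ge 0 && \forall i, j, k
\end{align*}
```

The first sum in constraint (1) is the primary load P(e) and the second is the restoration load R(e,f), so the constraint states that each link can carry its primary traffic plus the restoration traffic of any one failure.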
There are O(n^3) variables and O(n^2+m^2) constraints. One may directly solve this linear programming problem. Alternatively, one may develop a simple primal-dual algorithm to solve the problem. One may associate dual variables π(e,f) with the constraint (1) and θij with constraints (2). The dual is the following linear programming problem:
Given a pair of nodes i and j and a link e∈Sij, one may define
$$l_{ij}(e) = \sum_{f} \pi(e,f) + \sum_{f \in N_{ij}(e)} \pi(f,e)$$
to be the length of link e. The running time for this step for each pair (ij) is O(m). One may now write constraint (3) as
Therefore, for a given source-destination pair (ij), one may set
The best intermediate node for source-destination pair (ij) is the intermediate node k that achieves the minimum value of θij. Finding the best intermediate node for each pair of nodes involves finding the cost of every link on the shortest path. In the worst case the shortest path can have O(n) links. Therefore, evaluating the cost of picking a particular intermediate node takes O(nm) time, and finding the best intermediate node takes O(n^2 m) time. One may use π to represent the m×m array of values π(e,f). One may define D(π)=Σe c(e)Σf π(e,f) and ρ(π)=Σij tijθij. (Note that the θij values are a function of π.) The dual problem can now be reformulated as the following:
The primal-dual algorithm for solving this problem is a Fully Polynomial Time Approximation Scheme (FPTAS). In an FPTAS, one is given an ε>0 and the algorithm finds a solution within a factor (1−ε) of the optimal solution, in running time that is a polynomial function of the problem parameters and 1/ε.
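The selection of the best intermediate node can be sketched as follows. The sketch assumes the length of link e on the path for pair (ij) is the sum of π(e,f) over all failures f plus the sum of π(g,e) over the new links g in Nij(e), and it takes the cost of intermediate node k to be the total length of the i-to-k and k-to-j shortest paths. The function names, the restriction of f to links other than e, and the dual values are all hypothetical.

```python
def link_length(pi, links, N, pair, e):
    """Assumed length of link e on the shortest path of `pair` under dual values pi."""
    own = sum(pi.get((e, f), 0.0) for f in links)                   # e's own constraints
    backup = sum(pi.get((g, e), 0.0) for g in N.get((pair, e), set()))  # links used if e fails
    return own + backup

def best_intermediate(pi, links, S, N, i, j, candidates):
    """Return (k, theta_ij): the candidate k with the smallest two-segment path length."""
    def cost(k):
        total = 0.0
        for pair in ((i, k), (k, j)):
            for e in S.get(pair, []):
                total += link_length(pi, links, N, pair, e)
        return total
    k = min(candidates, key=cost)
    return k, cost(k)

# Invented example: link (B, D) is "expensive" in the dual, so the detour via C wins.
AB, BD, BC, CD, AC, CB = ("A","B"), ("B","D"), ("B","C"), ("C","D"), ("A","C"), ("C","B")
LINKS = [AB, BD, BC, CD, AC, CB]
S = {("A", "D"): [AB, BD], ("A", "C"): [AB, BC], ("C", "D"): [CD]}
N = {
    (("A", "D"), AB): {AC, CD}, (("A", "D"), BD): {BC, CD},
    (("A", "C"), AB): {AC},     (("A", "C"), BC): {AC},
    (("C", "D"), CD): {CB, BD},
}
pi = {(e, f): (3.0 if e == BD else 1.0) for e in LINKS for f in LINKS if e != f}
k_best, theta = best_intermediate(pi, LINKS, S, N, "A", "D", ["D", "C"])
```

Choosing k = D corresponds to sending the flow on the direct shortest path; here its length is 24 against 21 for the detour through C, so C is selected.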
The algorithm starts off by initializing
for all e, f, where δ is a number computed from ε and the network parameters. All flows are initialized to zero. The algorithm works in multiple phases, where each phase consists of one iteration through each source-destination pair (ij) such that tij>0. Each source-destination pair with non-zero demand is called a demand pair. In the iteration corresponding to source-destination pair (ij), traffic is routed from i to j in multiple steps until a total traffic of tij has been routed. In each step the following computations are done:
This process of finding the best segments and routing flow is repeated until the termination condition is met. The running time for each demand pair is O(n^2 m) and there may be up to n^2 demand pairs, making the running time for each phase O(n^4 m). Note that this is an over-estimate. The number of demand pairs could be far less than O(n^2), and the cost of each step in the algorithm depends on the length of the shortest path between node pairs, which is typically far less than the upper bound of n.
Handling ECMP and SRLG Failures
The primal-dual approach above may be generalized to networks where failures are of a more general nature, as well as to the case where traffic between a source and destination is split across multiple equal cost paths. The preceding section planned restoration paths for the case where the network has to be resilient to single link failures.
Shared Risk Link Group Failures
In practice, the network may be subjected to more serious failures, including node failures. In addition, there can be several links in the network that share physical infrastructure. This results in multiple links failing at the same time when the physical infrastructure fails. The term Shared Risk Link Group (SRLG) refers to a set of links that share a risk and can fail at the same time. In the SRLG failure model, each SRLG is specified as a set of links. An SRLG family is a collection of subsets of links that can fail at the same time. One may use F to denote a set of links that fail at the same time. One may use ℱ to denote a collection of subsets of links. In the case of single link failures, the collection ℱ={{e1}, {e2}, . . . , {em}}. Let E(vj) represent the set of links with vj as one of the endpoints. Then node failures can be represented by the collection of sets ℱ={E(v1), E(v2), . . . , E(vn)}. Unlike link failures where the segment routing headers are unchanged, in the case of node failures (or more generally SRLG failures) SR headers may have to be changed.
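The two failure families named above can be built directly from the link list. This is a minimal sketch in which links are directed (u, v) tuples; the function names and the three-node example are illustrative only.

```python
def single_link_family(links):
    """Collection {{e1}, {e2}, ..., {em}}: each link fails on its own."""
    return [frozenset([e]) for e in links]

def node_failure_family(nodes, links):
    """Collection {E(v1), ..., E(vn)}: all links with v as an endpoint fail together."""
    return [frozenset(e for e in links if v in e) for v in nodes]

# Invented three-node ring.
LINKS = [("A", "B"), ("B", "C"), ("C", "A")]
link_family = single_link_family(LINKS)
node_family = node_failure_family(["A", "B", "C"], LINKS)
```

Any other SRLG, such as a group of links sharing one fiber conduit, can be added to the collection as one more frozenset of links.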
Equal Cost Multi-Paths
A routing feature commonly used in networks to distribute load is Equal Cost Multi-Path (ECMP) routing. In ECMP routing, traffic is split evenly across all minimum cost paths. The split is done by hashing on the flow ID of the packet to ensure that packets belonging to the same flow are routed on the same path to avoid packet reordering at the destination. When ECMP is used, it is easy to determine a priori the fraction of traffic between any pair of nodes that will be routed on a given link. This information is enough to formulate the restoration planning problem in networks using ECMP. Let 0≦(e)≦1 denote the fraction of traffic from i to j that goes through link e. In the case of standard shortest path routing, (e)=1 for all e∈Sij and is zero otherwise. For any given source-destination pair, it is easy to compute (e) if the IGP link weights are known. Let (e,F) denote the fraction of the traffic from i to j that goes on link e if there is a failure F in the network. Note that in the SRLG model, F can be multiple links. The primary flow P(e) on link e is
When there is a failure F in the network, the amount of excess flow Δij(e,F) on link e for flows between i and j may be given by:
Δij(e,F)=[(e,F)−(e)]+xij,
P(e)=[(e)+(e)]xijk.
The amount of restoration traffic on link e due to failure F due to the flow xijk may be:
R(e,F)=[Δik(e,F)+Δkj(e,F)]xijk.
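The per-link fractions and the excess Δij(e,F) can be computed as in the sketch below. It follows the description above and splits traffic evenly across all minimum cost paths; deployed routers commonly split per next hop instead, which can give different fractions. The symbol for the fraction is not legible in this text, so the function name `ecmp_fractions` is a hypothetical stand-in, and paths are enumerated by brute force, which is only suitable for small examples.

```python
from fractions import Fraction

def all_paths(weights, src, dst, failed=frozenset()):
    """Enumerate (cost, links) for every simple path from src to dst avoiding failed links."""
    adj = {}
    for (u, v), w in weights.items():
        if (u, v) not in failed:
            adj.setdefault(u, []).append((v, w))
    found = []
    def dfs(node, visited, links, cost):
        if node == dst:
            found.append((cost, list(links)))
            return
        for v, w in adj.get(node, []):
            if v not in visited:
                visited.add(v)
                links.append((node, v))
                dfs(v, visited, links, cost + w)
                links.pop()
                visited.remove(v)
    dfs(src, {src}, [], 0)
    return found

def ecmp_fractions(weights, src, dst, failed=frozenset()):
    """Fraction of src->dst traffic on each link, split evenly over all min-cost paths."""
    paths = all_paths(weights, src, dst, failed)
    if not paths:
        return {}
    best = min(cost for cost, _ in paths)
    shortest = [links for cost, links in paths if cost == best]
    share = Fraction(1, len(shortest))
    frac = {}
    for links in shortest:
        for e in links:
            frac[e] = frac.get(e, 0) + share
    return frac

def excess(weights, src, dst, failed):
    """Positive part of (fraction under failure F) minus (fraction with no failure)."""
    before = ecmp_fractions(weights, src, dst)
    after = ecmp_fractions(weights, src, dst, failed)
    delta = {}
    for e, a in after.items():
        gain = a - before.get(e, 0)
        if gain > 0:
            delta[e] = gain
    return delta

# Invented diamond network with two equal cost paths from A to D.
WEIGHTS = {("A", "B"): 1, ("B", "D"): 1, ("A", "C"): 1, ("C", "D"): 1}
normal = ecmp_fractions(WEIGHTS, "A", "D")
delta = excess(WEIGHTS, "A", "D", frozenset([("A", "B")]))
```

Before the failure each of the four links carries half of the A-to-D traffic; when (A, B) fails, the surviving path carries all of it, so links (A, C) and (C, D) each see an excess of one half.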
One may now formulate the problem of maximizing throughput when the routing uses ECMP and the network may be subjected to SRLG failures as the following linear programming problem:
Note that the values of Δij(e,F) depend only on the network topology, the link IGP metrics, and whether ECMP is used. These values may be precomputed. As in the single link failure case, if one associates dual multipliers π(e,F) with the constraints (1) and θij with constraints (2), one may write the dual to the linear programming problem:
For a given set of π(e,f) values, one may set:
One may set:
as in the single link failure case. The rest of the algorithm follows the same pattern as the algorithm for single link failure. The only difference is in the running time to compute the best intermediate node. In the SRLG failure case, the running time also depends on the number of elements in the set F in addition to the network size.
It should be apparent from the foregoing description that various exemplary embodiments of the invention may be implemented in hardware and/or firmware. Furthermore, various exemplary embodiments may be implemented as instructions stored on a machine-readable storage medium, which may be read and executed by at least one processor to perform the operations described in detail herein. A machine-readable storage medium may include any mechanism for storing information in a form readable by a machine, such as a personal or laptop computer, a server, or other computing device. Thus, a machine-readable storage medium may include read-only memory (ROM), random-access memory (RAM), magnetic disk storage media, optical storage media, flash-memory devices, and similar storage media.
It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative circuitry embodying the principles of the invention. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in machine readable media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
Although the various exemplary embodiments have been described in detail with particular reference to certain exemplary aspects thereof, it should be understood that the invention is capable of other embodiments and its details are capable of modifications in various obvious respects. As is readily apparent to those skilled in the art, variations and modifications may be effected while remaining within the spirit and scope of the invention. Accordingly, the foregoing disclosure, description, and figures are for illustrative purposes only and do not in any way limit the invention, which is defined only by the claims.