The present disclosure relates to routing traffic through a network, and in particular, routing traffic over equal cost paths through a network.
In highly redundant networks there often exist multiple paths between a pair of network elements or nodes. Routing protocols, including link state protocols, can identify these multiple paths and are capable of using equal cost multi-paths for routing packets between such a pair of nodes.
In order to accommodate bandwidth disparity between equal cost paths, the equal cost paths may be supplemented through the use of unequal cost multi-path routing. Other systems simply ignore the bandwidth disparity between the equal cost paths, and therefore, traffic is distributed equally over the equal cost paths. In such cases, traffic forwarding is agnostic to a path's bandwidth capacity.
A plurality of equal cost paths through a network from a source node to a destination node are determined. A maximum bandwidth capacity for each link of each of the plurality of equal cost paths is determined, and a smallest capacity link for each of the plurality of equal cost paths is determined from the maximum bandwidth capacities for each link. An aggregated maximum bandwidth from the source node to the destination node is determined by aggregating the smallest capacity links for each of the plurality of equal cost paths. Traffic is sent from the source node along each of the plurality of equal cost paths according to a value of a capacity for the smallest capacity link for each of the plurality of equal cost paths, wherein a total of the sent traffic does not exceed the aggregated maximum bandwidth, and traffic sent along each of the plurality of equal cost paths does not exceed the smallest maximum bandwidth for respective equal cost paths.
Depicted in FIG. 1 is an example network 100 in which a root node 105, which includes a bandwidth-weighted path computation unit 135, is connected to a destination node 140 through a plurality of equal cost paths.
According to the example of FIG. 1, four equal cost paths, referred to herein as paths A, B, C and D, extend from root 105 to destination 140 over links 145a-h.
Yet, as illustrated by dashed links 145a, 145d and 145f, each of the above described paths may not be able to handle the same amount of traffic. For example, links 145a and 145f may support a bandwidth of 40 GB each, links 145b, 145c, 145h and 145e may support a bandwidth of 20 GB each, and link 145d is only able to support a bandwidth of 10 GB. Bandwidth-weighted path computation unit 135 can use this information to determine an aggregated or unconstrained bandwidth from root 105 to destination 140. This aggregated or unconstrained bandwidth is the maximum amount of traffic that can be sent from root 105 to destination 140 over the above described equal cost paths. In this case, the aggregated or unconstrained bandwidth will be the aggregation of the smallest bandwidth link for each of the equal cost paths. Accordingly, the aggregated or unconstrained bandwidth for traffic between root 105 and destination 140 will be 70 GB (20 GB + 10 GB + 20 GB + 20 GB).
Bandwidth-weighted path computation unit 135 also sends traffic according to the ratio of the lowest bandwidth link in each path. In this example, traffic will be sent over paths A, B, C and D in the ratio of 2:1:2:2. In other words, traffic is sent according to the smallest capacity link of each of the equal cost paths. If root 105 has 70 GB of traffic to send, 20 GB will be sent over path A, 10 GB will be sent over path B, 20 GB will be sent over path C, and 20 GB will be sent over path D. If root 105 has 35 GB of traffic to send, 10 GB will be sent over path A, 5 GB will be sent over path B, 10 GB will be sent over path C, and 10 GB will be sent over path D. By splitting the traffic according to this ratio, root 105 is capable of fully utilizing the resources of network 100 without accumulating dropped packets at an over-taxed network link.
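As a purely illustrative sketch of the weighting logic described above (the assignment of individual links to paths is an assumption made here for illustration; only the bottleneck values are taken from the example of FIG. 1), the computation can be expressed in Python as follows:

# Hedged sketch: each equal cost path is reduced to the list of its
# link capacities (in GB); the link-to-path assignment is assumed.
paths = {
    "A": [40, 20],  # e.g., a 40 GB link followed by a 20 GB link
    "B": [20, 10],  # includes the 10 GB link 145d
    "C": [20, 20],
    "D": [40, 20],
}

# The smallest capacity link of each path is its bottleneck.
bottleneck = {name: min(capacities) for name, capacities in paths.items()}

# The aggregated (unconstrained) bandwidth is the sum of the bottlenecks.
aggregated = sum(bottleneck.values())  # 70 GB

def split(demand):
    """Split demand across the paths in proportion to their bottlenecks,
    capping the total at the aggregated maximum bandwidth."""
    demand = min(demand, aggregated)
    return {name: demand * b / aggregated for name, b in bottleneck.items()}

print(split(70))  # {'A': 20.0, 'B': 10.0, 'C': 20.0, 'D': 20.0}
print(split(35))  # {'A': 10.0, 'B': 5.0, 'C': 10.0, 'D': 10.0}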
Absent bandwidth-weighted path computation unit 135, root 105 may send traffic over network 100 in a manner which results in dropped packets, or which inefficiently utilizes network resources. For example, if root 105 splits 60 GB of traffic equally between each of the paths, packets will likely be dropped by link 145d. Specifically, equally splitting the traffic between the four paths will result in 15 GB being sent over each path. Accordingly, link 145d will be tasked with accommodating 15 GB of data when it only has the bandwidth to accommodate 10 GB. This shortfall in available bandwidth may result in packets being dropped at node 110. Alternatively, if root 105 limits its transmission rate to that of the lowest bandwidth link, namely link 145d, it will underutilize all of the other links in network 100. Specifically, network 100 will be limited to a maximum transmission bandwidth of 40 GB between root node 105 and destination node 140, when it is actually capable of transmitting 70 GB.
With reference now made to FIG. 2, depicted therein is a flowchart illustrating an example process for bandwidth-weighted equal cost multi-path routing. In 205, a plurality of equal cost paths through a network from a source node to a destination node are determined.
In 210, a maximum bandwidth capacity for each link of each of the plurality of equal cost paths is determined. This determination may be made in response to receiving a link state packet (LSP) message from the nodes in the network which comprise the equal cost paths determined in 205. In 215, a smallest capacity link in each equal cost path is determined from the maximum bandwidth capacities determined in 210. In 220, an aggregated maximum bandwidth from the source node to the destination node is determined by aggregating the smallest capacity links for each of the plurality of equal cost paths.
In 225, traffic is sent from the source node along each of the plurality of equal cost paths according to a value of a capacity for the smallest capacity link for each of the plurality of equal cost paths, wherein a total of the sent traffic does not exceed the aggregated maximum bandwidth, and traffic sent along each of the plurality of equal cost paths does not exceed the smallest maximum bandwidth for respective equal cost paths. Specific examples of the determination of the smallest maximum bandwidth link and the sending of the traffic according to the value of the smallest maximum bandwidth link will be described in greater detail with reference to FIGS. 3-6.
With reference now made to FIG. 3, depicted therein is a network 300 comprising a source node 305 and a destination node 310 connected through a plurality of equal cost paths comprising nodes 315, 320, 325, 330 and 335 and links 345a-g. The nodes of network 300 advertise the bandwidth capacities of their links to source node 305.
Upon receiving the bandwidth information, the source node 305, or another device such as a path computation element, will determine the lowest bandwidth capacity link in each path. For example, when the path from node 305 to node 310 through nodes 315, 320 and 325 is evaluated, it will be determined that the lowest bandwidth capacity link in the path is link 345b, which has a bandwidth value of 10 GB. Accordingly, the maximum bandwidth that can be sent over this path is 10 GB. In other words, link 345b is the minimum link, and therefore, limits the traffic through the path.
In order to determine which link has the lowest bandwidth capacity, a flow matrix may be employed. The flow matrix stores values for the total bandwidth that can be sent over a link for a particular destination node, in order to determine the minimum bandwidth path. Through the process that will now be described, initial flow matrix 350a of FIG. 3 is populated to arrive at final flow matrix 350b.
Initial flow matrix 350a is originally populated by creating a matrix that only contains the vertices and edges which are used in the previously determined equal cost paths. The vertices are sorted by hop count, i.e., how far they are from the root or source node. The root vertex or source vertex is given an infinite capacity, while all other vertices or nodes are marked with a capacity of 0. This results in the initial flow matrix 350a. It is noted that the empty spaces in the flow matrix represent links which are not used to reach a particular node. For example, the entry for link 345f is left blank for nodes 315, 320 and 330 because link 345f is not used to send data to these nodes.
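The initialization may be sketched as follows (a hedged illustration; the representation of the flow matrix as a Python dictionary keyed by link and destination, and the mapping of links to destinations, are assumptions, not the disclosed data structure):

import math

# Hedged sketch of initializing the flow matrix. Only (link, destination)
# pairs that lie on an equal cost path receive an entry; all others are
# left blank, as in flow matrix 350a.
links_per_destination = {            # assumed mapping for illustration
    "315": ["345a"],
    "320": ["345a", "345b"],
}

ROOT_CAPACITY = math.inf             # the root vertex has infinite capacity

flow_matrix = {
    (link, dest): 0                  # all non-root entries start at 0
    for dest, links in links_per_destination.items()
    for link in links
}
print(flow_matrix)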
With the initial flow matrix 350a populated, a link with the lowest hop count to the destination is selected. In the simplest case, link 345a is considered with node 315 as the ultimate destination. Here, the value in the flow matrix, shown in this case at entry 365, will be populated according to the following expression:
minimum of (capacity of parent vertex, capacity of link); exp. 1.
In other words, the value will be populated with the lesser of the value at entry 370 or the bandwidth capacity of link 345a. In this case, expression 1 would read:
minimum of (∞, 40);
The 40 GB capacity of link 345a is less than the infinite capacity of root or source node 305, and therefore, in the final flow matrix 350b, entry 355 has a value of 40. Normally, this value would then be back propagated to previous links in the path, but in the present case, only a single link is used to reach node 315.
Taking the slightly more complicated case of using node 320 as the ultimate destination, the process would begin in the same way. First, a value for entry 375 is determined. Since this presents the same scenario as populating entry 365, entry 375 would initially be populated with a value of 40. Once the value of entry 375 is determined, a value for entry 380 will be determined. In this instance, expression 1 for entry 380 would read:
minimum of (40, 10);
This is because the capacity for the parent vertex is 40 GB, and the capacity for the present link is 10 GB. Accordingly, in final flow matrix 350b, entry 385 has a value of 10. In this case, there is a previous link to propagate back through; therefore, in final flow matrix 350b, entry 360 also has a value of 10, as the value of entry 385 is propagated back to entry 360. This process works in an analogous manner for the path from node 305 with node 330 as an ultimate destination, and for the path from node 305 with node 335 as an ultimate destination.
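For illustration, the following is a hedged Python sketch of exp. 1 and the back propagation step for a single, non-branching path; the link names and capacity values follow the example above:

import math

# The links of the path to node 320, in order from source node 305.
path = [("345a", 40), ("345b", 10)]

def populate(path):
    """Apply exp. 1 forward along the path, then back propagate the
    resulting bottleneck to the previous links."""
    value = math.inf                      # the root vertex has infinite capacity
    for _, capacity in path:
        value = min(value, capacity)      # exp. 1
    # Back propagation: every link of the path carries the bottleneck.
    return {name: value for name, _ in path}

print(populate(path))  # {'345a': 10, '345b': 10} -- entries 360 and 385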
The process described above becomes more complicated when node 325 is used as the ultimate destination. Initially, the process would begin in the same way. But, when the value for entry 390 is calculated, expression 1 would read:
minimum of (10, 40);
Here, link 345c can handle 40 GB, but it will be limited to the 10 GB value of link 345b; the capacity for the parent vertex will be 10 GB due to the process described above for populating entry 385. Accordingly, entry 390 is populated with 10, as illustrated by entry 395 in final flow matrix 350b. Yet, this fails to account for the full capacity that may be sent from node 305 to node 325. Specifically, traffic can also be sent from node 305 to node 325 over the path comprising links 345e, 345f and 345g, as illustrated by the values for all of these links in column 396 of final flow matrix 350b. Therefore, node 325 can receive 30 GB of traffic from node 305: 10 GB from the path including link 345c and 20 GB from the path including link 345g.
In other words, when back propagating from node 325, the path splits, with some of the traffic having come from node 335 and some of the traffic having come from node 320. Specifically, the capacities of the parent nodes 320 and 335 are taken into consideration when back propagating. Accordingly, the capacity of node 320 is back propagated along its path, and the capacity of node 335 is back propagated along its path. This ensures that neither link becomes overloaded, while traffic sent to node 325 is still optimized for the total amount of traffic that can be sent over the two paths. The process used to make these determinations can utilize a temporary (temp) variable for each parent node in order to remain aware of the parent capacity.
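A hedged sketch of the merge handling follows: each parent path contributes its own bottleneck, held in a temp variable, and the merge point can receive the sum of those contributions. The individual capacities of links 345e and 345f are assumptions chosen to be consistent with the 20 GB path bottleneck of the example:

# Capacities (GB) along the two paths that merge at node 325 (path
# bottlenecks per the example; capacities of 345e and 345f assumed).
path_via_320 = {"345a": 40, "345b": 10, "345c": 40}
path_via_335 = {"345e": 20, "345f": 20, "345g": 20}

def bottleneck(path):
    """Forward application of exp. 1 yields the path's bottleneck."""
    return min(path.values())

# One temp variable per parent node keeps track of its capacity.
temp_320 = bottleneck(path_via_320)   # 10 GB
temp_335 = bottleneck(path_via_335)   # 20 GB

# The merge point may receive the sum of its parents' contributions.
print(temp_320 + temp_335)            # 30 GB deliverable to node 325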
The process described above also becomes more complicated for a final destination of node 310. This is because link 345d only has a capacity of 25 GB, meaning it can handle less than the 30 GB capacity that can be sent to node 325. In other words, even though the path containing node 315 can send 10 GB, and the path containing node 330 can handle 20 GB, when these two paths merge at node 325, they will be limited by the capacity of the merged link 345d. In order to determine how much traffic should be sent over the path that includes link 345c versus the path that includes link 345g, a water-filling process may be used. Specifically, each of the paths will be "filled" until it reaches its saturation level. By splitting the traffic in this way, 10 GB of traffic would be sent over the path that includes link 345c, and 15 GB would be sent over the path that includes link 345g. In other words, the paths will receive equal amounts of traffic until the path that includes link 345c reaches its limit of 10 GB, and the path that includes link 345g will receive the remainder of the traffic. Accordingly, column 397 of final flow matrix 350b illustrates this split.
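A hedged sketch of the water-filling split follows; the function below is a generic implementation of the behavior described above, not the disclosed code:

def water_fill(demand, capacities):
    """Fill all paths at the same rate; when a path saturates, the
    remaining demand continues to fill the unsaturated paths.
    `capacities` maps a path name to its bottleneck capacity (GB)."""
    allocation = {name: 0.0 for name in capacities}
    headroom = dict(capacities)
    while demand > 1e-9 and headroom:
        active = list(headroom)
        # Raise every unsaturated path by the same increment.
        step = min(min(headroom.values()), demand / len(active))
        for name in active:
            allocation[name] += step
            headroom[name] -= step
            if headroom[name] <= 1e-9:
                del headroom[name]
        demand -= step * len(active)
    return allocation

# 25 GB toward node 310 (the capacity of merged link 345d), split over
# the paths bottlenecked by link 345c (10 GB) and link 345g (20 GB).
print(water_fill(25, {"via_345c": 10, "via_345g": 20}))
# {'via_345c': 10.0, 'via_345g': 15.0}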
Once flow matrix 350b is populated, a final determination is made of how much traffic can be sent to each node, as illustrated in FIG. 3.
Furthermore, when less than the full capacity is to be sent to any of nodes 325 and 310, the amount of traffic sent over each path may be sent in the ratio of the capacities illustrated in final flow matrix 350b. For example, if only 3 GB are to be sent to node 325, 1 GB will be sent over the path containing link 345c, and 2 GB will be sent over the path containing link 345g. This is because the ratio over each path is 1:2 (i.e., 10 GB to 20 GB as illustrated in column 396 of final flow matrix 350b). If 3 GB are to be sent to node 310, 1.2 GB will be sent over the path containing link 345c while 1.8 GB will be sent over the path containing link 345g (i.e., a ratio of 2:3, or 10 GB to 15 GB).
With reference now made to FIG. 4, depicted therein is a network 400 in which the equal cost paths from root node 405 to destination node 410 include a path that splits at node 435 into two paths, one through node 440 and one through node 445. The flow values for the links of network 400 are calculated in flow matrix 455.
In order to appropriately back propagate the correct value for links 450f and 450e, a temporary (temp) variable is used to store the value for intermediate nodes, in this case, 20 GB for node 440 and 10 GB for node 445. Specifically, link 450f is a merged link from which two paths split. When node 435 is reached, the values in the temp variables are added together, and this sum is back propagated along the rest of the path to root node 405. This is illustrated in column 460 of flow matrix 455. As can be seen in flow matrix 455, the links prior to node 435 (in the back propagation direction) have values of 10 and 20 GB, respectively. The links after node 435 (in the back propagation direction) have 30 GB of capacity, the sum of 10 and 20 GB. In other words, even though the capacity of the merged link 450f is greater than or equal to a sum of the capacities of the smallest capacity links for each of the split paths, the traffic sent over link 450f is limited to the sum of the capacities of links 450g and 450i. Accordingly, even though link 450f is a 40 GB link, when traffic is sent to node 410, the traffic sent over link 450f is limited to 30 GB, as indicated in the value for link 450f in column 460 of flow matrix 455.
Once flow matrix 455 is populated, a final determination is made of how much traffic can be sent to each node, as illustrated in FIG. 4.
With reference now made to FIG. 5, depicted therein is the network of FIG. 4 with an additional link 550k, which provides a second path over which traffic may reach node 425.
With regard to the traffic that can be sent to node 425, when node 425 is now the ultimate destination of the traffic, 80 GB of traffic can be sent. Forty GB of the traffic can be sent over links 450a, 450b and 450c, and an additional 40 GB of traffic can be sent over links 450e, 450f and 550k.
With regard to the traffic sent to node 410, node 410 will still be limited to receiving 50 GB of traffic given that link 450d is a 20 GB link, link 450g is a 20 GB link, and link 450i is a 10 GB link. Yet, because traffic can reach node 425 from two paths, and node 435 is along the path for the traffic traversing nodes 425, 440 and 445, the amount of traffic sent through these nodes will be altered. Specifically, because node 435 provides traffic to nodes 425, 440 and 445, the amount of traffic initially sent to node 435 over link 450e is now increased from 30 GB to 40 GB. Similarly, the traffic sent over link 450f is also increased from 30 GB to 40 GB. On the other hand, because the traffic to node 410 is still limited to 50 GB, the traffic sent over links 450a, 450b and 450c is now limited to 10 GB.
With reference now made to FIG. 6, an example is described in which the per-link, per-destination flow values for a network 600 are determined by solving a flow optimization problem. Initially, the equal cost paths from the root node of network 600 to each destination node w are determined.
Next, a flow capacity matrix 650 is formed, according to the following rules:
C[u,v,w] = the bandwidth of the link between u and v, if link <u,v> appears in any equal cost path between the root node and w; otherwise, C[u,v,w] = 0;
where u and v are two nodes connected by an edge or link in network 600, and w is the destination node.
A dummy node called "D" is also added to the matrix, and all nodes except for the root are connected to this dummy node D. The capacity of each of these new links is infinite. A flow matrix populated according to these rules appears in FIG. 6.
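A hedged sketch of populating C[u,v,w] under these rules follows; the topology, node names and capacities below are hypothetical, chosen only to exercise the rules:

import math

# Hypothetical link capacities and ECMP paths (lists of nodes from the
# root "r" to destination "w1"); these are illustrative assumptions.
link_bandwidth = {("r", "a"): 40, ("a", "w1"): 10,
                  ("r", "b"): 20, ("b", "w1"): 20}
ecmp_paths = {"w1": [["r", "a", "w1"], ["r", "b", "w1"]]}

def capacity_matrix(link_bandwidth, ecmp_paths):
    """C[u,v,w] is the bandwidth of link <u,v> if the link appears in
    any ECMP path between the root and w; entries absent from the dict
    are implicitly 0. Every non-root node also connects to the dummy
    node "D" with infinite capacity."""
    C = {}
    for w, paths in ecmp_paths.items():
        nodes = set()
        for path in paths:
            nodes.update(path)
            for u, v in zip(path, path[1:]):
                C[(u, v, w)] = link_bandwidth[(u, v)]
        for n in nodes - {"r"}:
            C[(n, "D", w)] = math.inf
    return C

print(capacity_matrix(link_bandwidth, ecmp_paths))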
Next, a function F(u,v,w) is defined to be the amount of traffic sourced from the root node to destination node w flowing over link <u,v> in the u to v direction. The following constraints are applied to this function:
Capacity Constraints: For all nodes u,v in the graph, and for all destinations w: F(u,v,w) ≤ C[u,v,w];
Flow Conservation: For any node v except for the root and dummy-sink-node D, and for all destinations w, the flows into and out of the node cancel: Σu F(u,v,w) = 0;
Skew Symmetry: For all nodes u,v, and for all destinations w: F(u,v,w) = −F(v,u,w).
With these constraints in place, the function F is optimized so that F(root_node, v, w) is maximized for all destinations w and for each neighbor v of the root node.
Specifically, with the matrix determined, it can be run through a linear programming process, such as the Simplex method, to solve for the flow on each link per destination. This will simultaneously solve for all destinations. Alternatively, the matrix can be run through a standard max-flow process on a per-destination basis, such as the Ford-Fulkerson method.
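As a hedged sketch of the per-destination alternative, the following is a textbook Edmonds-Karp variant of the Ford-Fulkerson method (a generic implementation, not the disclosed one), applied to the hypothetical capacity map above for a single destination:

from collections import deque

def max_flow(capacity, source, sink):
    """Compute a maximum flow with BFS augmenting paths (Edmonds-Karp).
    `capacity` maps (u, v) to link capacity. The returned flow obeys
    the capacity, conservation and skew symmetry constraints above."""
    nodes = {u for u, v in capacity} | {v for u, v in capacity}
    residual = dict(capacity)
    for u, v in list(capacity):
        residual.setdefault((v, u), 0)       # residual reverse edges
    flow = {edge: 0 for edge in residual}
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:  # BFS for an augmenting path
            u = queue.popleft()
            for v in nodes:
                if v not in parent and residual.get((u, v), 0) > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow                      # no augmenting path remains
        path, v = [], sink                   # recover path, find bottleneck
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)
        for u, v in path:                    # augment; keep skew symmetry
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            flow[(u, v)] += bottleneck
            flow[(v, u)] -= bottleneck

# Hypothetical two-path example for one destination "w1".
capacity = {("r", "a"): 40, ("a", "w1"): 10, ("r", "b"): 20, ("b", "w1"): 20}
F = max_flow(capacity, "r", "w1")
print(F[("r", "a")], F[("r", "b")])  # 10 20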
Upon solving the above model, initial flow matrix 650a is transformed into solved flow matrix 650b.
Solving the flow matrix to conform with the above-defined rules gives an optimal per-link, per-destination flow value. From the root node's perspective, flow matrix 650b gives a weighted ratio for traffic sent from a node to its neighbor based on the destination node. This can then be used for bandwidth-weighted ECMP routing.
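For illustration, a hedged sketch of deriving the per-neighbor weights for one destination from a solved flow function F (represented here, by assumption, as a dictionary keyed by (u, v, w)):

def neighbor_weights(F, root, neighbors, w):
    """Turn the solved per-destination flows out of the root into a
    weighted forwarding ratio for destination w."""
    flows = {v: max(F.get((root, v, w), 0), 0) for v in neighbors}
    total = sum(flows.values())
    if total == 0:
        return {}
    return {v: f / total for v, f in flows.items()}

# Continuing the hypothetical example above: flows of 10 and 20 GB out
# of root "r" toward destination "w1" yield a 1:2 forwarding ratio.
print(neighbor_weights({("r", "a", "w1"): 10, ("r", "b", "w1"): 20},
                       "r", ["a", "b"], "w1"))
# {'a': 0.333..., 'b': 0.666...}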
Referring now to FIG. 7, an example block diagram is shown of a device configured to perform the bandwidth-weighted path computation techniques described herein. The device includes a processor 720, a memory 740 in which bandwidth-weighted path computation software 745 is stored, and a network interface unit that enables communication over a network.
Memory 740 may comprise read only memory (ROM), random access memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, or electrical, optical or other physical/tangible (e.g., non-transitory) memory storage devices. Thus, in general, the memory 740 may comprise one or more tangible (non-transitory) computer readable storage media (e.g., a memory device) encoded with software comprising computer executable instructions. When the software, e.g., bandwidth-weighted path computation software 745, is executed by the processor 720, the processor is operable to perform the operations described herein in connection with FIGS. 1-6.
In summary, a method is provided comprising: determining a plurality of equal cost paths through a network from a source node to a destination node; determining a maximum bandwidth capacity for each link of each of the plurality of equal cost paths; determining a smallest capacity link for each of the plurality of equal cost paths from the maximum bandwidth capacities for each link; determining an aggregated maximum bandwidth from the source node to the destination node by aggregating the smallest capacity links for each of the plurality of equal cost paths; and sending traffic from the source node along each of the plurality of equal cost paths according to a value of a capacity for the smallest capacity link for each of the plurality of equal cost paths, wherein a total of the sent traffic does not exceed the aggregated maximum bandwidth, and traffic sent along each of the plurality of equal cost paths does not exceed the smallest maximum bandwidth for respective equal cost paths.
Similarly, an apparatus is provided comprising: a network interface unit to enable communication over a network; and a processor coupled to the network interface unit to: determine a plurality of equal cost paths through the network from a source node to a destination node; determine a maximum bandwidth capacity for each link of each of the plurality of equal cost paths; determine a smallest capacity link for each of the plurality of equal cost paths from the maximum bandwidth capacities for each link; determine an aggregated maximum bandwidth from the source node to the destination node by aggregating the smallest capacity links for each of the plurality of equal cost paths; and cause traffic to be sent from the source node along each of the plurality of equal cost paths according to a value of a capacity for the smallest capacity link for each of the plurality of equal cost paths, wherein a total of the sent traffic does not exceed the aggregated maximum bandwidth, and traffic sent along each of the plurality of equal cost paths does not exceed the smallest maximum bandwidth for respective equal cost paths.
Further still, one or more computer readable storage media are provided, encoded with software comprising computer executable instructions which, when executed, are operable to: determine a plurality of equal cost paths through a network from a source node to a destination node; determine a maximum bandwidth capacity for each link of each of the plurality of equal cost paths; determine a smallest capacity link for each of the plurality of equal cost paths from the maximum bandwidth capacities for each link; determine an aggregated maximum bandwidth from the source node to the destination node by aggregating the smallest capacity links for each of the plurality of equal cost paths; and cause traffic to be sent from the source node along each of the plurality of equal cost paths according to a value of a capacity for the smallest capacity link for each of the plurality of equal cost paths, wherein a total of the sent traffic does not exceed the aggregated maximum bandwidth, and traffic sent along each of the plurality of equal cost paths does not exceed the smallest maximum bandwidth for respective equal cost paths.
The above description is intended by way of example only. Various modifications and structural changes may be made therein without departing from the scope of the concepts described herein and within the scope and range of equivalents of the claims.