1. Field of the Invention
The present invention relates generally to networks. Particularly, the present invention relates to dynamic selection of routing paths.
2. Description of the Related Art
Storage networks can comprise several Fibre Channel switches interconnected in a fabric topology. These switches are interconnected by a number of inter-switch links (ISLs), which carry both data and control information. The control path provides connectivity among the switching elements and presents the fabric as a single entity to the end devices. The data paths provide throughput to the end devices connected to the fabric.
Paths available between a pair of switches within a fabric are determined during system initialization and/or during changes in fabric configuration, such as the addition or removal of a switch. In a typical network, more than one path may be available to transmit frames between a source-destination pair. This can allow the source switch to potentially distribute frame traffic, destined for the same destination switch, over two or more such paths. In some cases, the source switch may distribute traffic over multiple paths that have equal associated cost, e.g., the same number of hops from source to destination. These multiple paths with equal costs may have unequal bandwidths associated with them.
Traditional schemes of distributing traffic over multiple paths rely on the modulo N method, where N is the number of multiple paths. For example, if the source switch selects four multiple paths over which to distribute traffic destined to the same destination switch, the modulo N scheme will equally distribute traffic over the four available paths. Traffic may be distributed on a per packet/frame basis. In such cases, traffic is distributed such that the packets/frames destined for the same destination are evenly distributed over the multiple paths. Traffic may also be distributed on the basis of exchanges or transactions. In such cases, the exchanges/transactions between the source and the destination are evenly distributed over the available multiple paths.
The modulo N scheme is usually implemented by performing a modulo N operation on one parameter or a combination of parameters of a frame and determining the result. Because the set of possible outputs of a modulo N operation, assuming N is an integer, is integers 0 to (N−1), each member of the set can be assigned to a path between the source and destination pair. So, for example, if there are three possible paths between a source and destination, a modulo 3 operation on a frame will result in values of 0, 1, or 2. Therefore, 0 can be assigned to path 1, 1 can be assigned to path 2, and 2 can be assigned to path 3. If the modulo 3 operation on a frame results in 0, that frame is sent via path 1; if the modulo 3 operation on another frame results in 1, then that frame is sent via path 2; and so on.
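The modulo N selection described above can be sketched as follows. This is a minimal illustration only: the function name select_path_modulo and the integer frame_key (standing in for whatever frame parameter or combination of parameters is used) are assumptions made for the example and are not part of the described switch.

```python
def select_path_modulo(frame_key: int, num_paths: int) -> int:
    """Return a 1-based path number chosen by the modulo N scheme.

    frame_key is a hypothetical integer derived from one or more
    parameters of the frame; num_paths is N.
    """
    if num_paths <= 0:
        raise ValueError("num_paths must be a positive integer")
    # The remainder lies in 0..N-1; add 1 to match the path numbering
    # used in the text (result 0 -> path 1, result 1 -> path 2, ...).
    return (frame_key % num_paths) + 1


if __name__ == "__main__":
    for key in (9, 10, 11, 12):
        print(f"key {key} -> path {select_path_modulo(key, 3)}")
```

With three paths, keys whose remainders are 0, 1, and 2 are sent via paths 1, 2, and 3, respectively, so each path receives roughly one third of the keys regardless of its bandwidth.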
However, some traditional methods tend to treat each path equally, regardless of the paths' respective bandwidths, latencies, congestion levels, etc. For example, if three paths are available between a source and a destination, and the effective bandwidths of the three paths are 17 Gbps, 8 Gbps, and 1 Gbps, the modulo N scheme (modulo 3, in this example) will distribute frames evenly over the three paths. Therefore, the links with higher bandwidths (17 Gbps and 8 Gbps) may remain underutilized, or the lower bandwidth link (1 Gbps) may become overutilized. Other traditional methods take bandwidth into account, but the granularity in distribution of traffic is still undesirable.
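The mismatch in the 17/8/1 Gbps example can be quantified with a short calculation. The sketch below assumes a simplified model in which traffic divides into exactly equal shares per path; the helper name even_split_ceiling_gbps is hypothetical and introduced only for illustration.

```python
def even_split_ceiling_gbps(bandwidths_gbps):
    """Largest total load (Gbps) that an even split can carry before the
    slowest path saturates, under an idealized equal-share model."""
    if not bandwidths_gbps:
        raise ValueError("at least one path is required")
    # Each path receives total/N, so the slowest path limits the total.
    return min(bandwidths_gbps) * len(bandwidths_gbps)


paths = [17, 8, 1]                         # effective bandwidths from the example
aggregate = sum(paths)                     # combined capacity of all three paths
ceiling = even_split_ceiling_gbps(paths)   # load at which the 1 Gbps path is full

if __name__ == "__main__":
    print(f"aggregate capacity: {aggregate} Gbps")
    print(f"even split saturates the slowest path at: {ceiling} Gbps")
```

Under this model, the 1 Gbps path is fully loaded once the total offered load reaches 3 Gbps, although the three paths together could carry 26 Gbps.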
Thus a path selection scheme is needed that can more flexibly distribute traffic over multiple paths between a source-destination pair.
Fabrics and networks having multiple interconnected switches can provide multiple paths between a source and destination device pair. A switch or router can carry out load balancing by distributing traffic over the available multiple paths. Each of the multiple paths can have different bandwidths, latencies, congestions, etc. associated with them. Furthermore, two or more of the multiple paths can have equal costs associated with them.
The switch uses a load balancer circuit to distribute frames over multiple paths. The load balancer circuit includes a hash function that generates a hash value of parameters of a given frame. For example, the hash function generates a hash value of a concatenation of the source ID, destination ID, and exchange ID of the frame. The hash function can be a CRC, cryptographic hash function, randomizing function, etc. The generated hash value can be input to a plurality of range comparators. Each range comparator has a range of values associated with it. The ranges associated with the range comparators can be non-overlapping. If the generated hash value falls within the range of values associated with a range comparator, that range comparator generates an “in-range” signal. The total number of range comparators can be equal to a number of multiple paths between the source and destination pair.
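The hashing and range comparison just described might be modeled in software as shown below. This is a sketch under stated assumptions, not the circuit itself: it uses CRC-32 from Python's zlib module as the hash function, packs the source ID and destination ID as 24-bit fields and the exchange ID as a 16-bit field, and uses four equal ranges over the 32-bit hash space. The names frame_hash, in_range_signals, and QUARTER_RANGES are hypothetical.

```python
import zlib


def frame_hash(sid: int, did: int, oxid: int) -> int:
    """Hash the concatenation of source ID, destination ID, and exchange ID.

    Field widths (3, 3, and 2 bytes) are assumptions for this example.
    """
    key = sid.to_bytes(3, "big") + did.to_bytes(3, "big") + oxid.to_bytes(2, "big")
    return zlib.crc32(key)  # unsigned 32-bit value


def in_range_signals(hash_value: int, ranges):
    """Model one comparator per range; each returns True when in range."""
    return [low <= hash_value <= high for (low, high) in ranges]


# Four non-overlapping inclusive ranges that together cover 0..2**32-1.
QUARTER_RANGES = [(i * 2**30, (i + 1) * 2**30 - 1) for i in range(4)]

if __name__ == "__main__":
    h = frame_hash(0x010203, 0x0A0B0C, 0x1234)
    print(hex(h), in_range_signals(h, QUARTER_RANGES))
```

Because the ranges do not overlap and cover the whole hash space, exactly one comparator reports in-range for any frame, and frames of the same exchange always hash to the same value.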
Outputs of all range comparators are fed to a path selection module. The path selection module includes information associating the range comparators with the multiple paths available between the source and destination pair. If a range comparator returns an in-range signal, the path selection module determines the associated path. The frame is then output from the appropriate output port, which transmits the frame via the determined path.
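One way to picture the path selection step is as a decoder that maps the single asserted comparator output to an output port. The following sketch assumes the comparator outputs arrive as a list of booleans; the function name select_path and the port labels are hypothetical placeholders, not identifiers from the described switch.

```python
def select_path(in_range_signals, comparator_to_port):
    """Return the output port associated with the one asserted comparator.

    in_range_signals: list of booleans, one per range comparator.
    comparator_to_port: list giving the port for each comparator index.
    """
    asserted = [i for i, signal in enumerate(in_range_signals) if signal]
    if len(asserted) != 1:
        # Non-overlapping ranges covering the hash space yield exactly one.
        raise ValueError(f"expected one in-range signal, got {len(asserted)}")
    return comparator_to_port[asserted[0]]


# Hypothetical association of four comparators with four output ports.
PORTS = ["port_a", "port_b", "port_c", "port_d"]

if __name__ == "__main__":
    print(select_path([False, True, False, False], PORTS))
```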
The sizes of the ranges associated with the range comparators can be unequal. A relatively larger size of the range of a range comparator associated with a path will result in relatively more traffic transmitted via that path. This can allow the router to asymmetrically distribute traffic over a given set of multiple paths by varying the size of the ranges of the range comparators. Thus the router can balance the load over the multiple paths based on various criteria such as available bandwidth, latencies, congestions, etc., associated with each path. Further, if repetitive network traffic results in relatively close hash values, the ranges can be altered to provide a desired balance.
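As an illustration of unequal ranges, the sketch below sizes each range in proportion to a per-path weight such as bandwidth. Proportional sizing is only one possible policy and is an assumption of this example; the function name build_ranges and the hash space of 2600 values are likewise chosen for illustration.

```python
def build_ranges(weights, hash_space):
    """Split 0..hash_space-1 into inclusive, non-overlapping ranges whose
    sizes are proportional to the given positive integer weights."""
    total = sum(weights)
    if not weights or any(w <= 0 for w in weights):
        raise ValueError("weights must be a non-empty list of positive integers")
    if hash_space < total:
        raise ValueError("hash_space too small to give every path a range")
    ranges, cumulative, low = [], 0, 0
    for weight in weights:
        cumulative += weight
        # Integer boundary keeps the ranges contiguous with no gaps.
        boundary = cumulative * hash_space // total
        ranges.append((low, boundary - 1))
        low = boundary
    return ranges


if __name__ == "__main__":
    # Weights taken from the 17, 8, and 1 Gbps example paths.
    print(build_ranges([17, 8, 1], 2600))
```

With weights 17, 8, and 1 and a hash space of 2600 values, the ranges are 0-1699, 1700-2499, and 2500-2599, so a uniformly distributed hash sends about 17/26, 8/26, and 1/26 of the traffic over the respective paths.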
The present invention has other advantages and features which will be more readily apparent from the following detailed description of the invention and the appended claims, when taken in conjunction with the accompanying drawings, in which:
A variety of devices can be connected to the fabric 102. A Fibre Channel fabric supports both point-to-point and loop device connections. A point-to-point connection is a direct connection between a device and the fabric. A loop connection is a single fabric connection that supports one or more devices in an “arbitrated loop” configuration, wherein signals travel around the loop through each of the loop devices. Hubs, bridges, and other configurations may be added to enhance the connections within an arbitrated loop.
On the fabric side, devices are coupled to the fabric via fabric ports. A fabric port (F_Port) supports a point-to-point fabric attachment. A fabric loop port (FL_Port) supports a fabric loop attachment. Both F_Ports and FL_Ports may be referred to generically as Fx_Ports. Typically, ports connecting one switch to another switch are referred to as expansion ports (E_Ports). In addition, generic ports may also be employed for fabric attachments. For example, G_Ports, which may function as either E_Ports or F_Ports, and GL_Ports, which may function as either E_Ports or Fx_Ports, may be used.
On the device side, each device coupled to a fabric constitutes a node. Each device includes a node port by which it is coupled to the fabric. A port on a device coupled in a point-to-point topology is a node port (N_Port). A port on a device coupled in a loop topology is a node loop port (NL_Port). Both N_Ports and NL_Ports may be referred to generically as Nx_Ports. The label N_Port or NL_Port may be used to identify a device, such as a computer or a peripheral, which is coupled to the fabric.
In the embodiment shown in
Switches S1 110, S2 112, S3 114, S4 116, S5 118, S6 120, and S7 122 are connected with one or more inter-switch links (ISLs). Switch S1 110 can be connected to switches S2 112, S7 122, and S6 120 via ISLs 180a, 180b, and 180c, respectively. Switch S2 112 can be connected to switches S3 114 and S7 122 by ISLs 180d and 180e, respectively. Switch S3 114 can be connected to switches S4 116 and S5 118 via ISLs 180f and 180g, respectively. Switch S4 116 can be connected to switch S5 118 via ISL 180h. Switch S5 118 can be connected to switches S6 120 and S7 122 via ISLs 180i and 180j, respectively. Note that although only single links between various switches have been shown, links between any two switches can include multiple ISLs. The fabric can use link aggregation or trunking to form single logical links comprising multiple ISLs between two switches. For example, if link 180a comprised three 2 Gbps ISLs, the three ISLs can be aggregated into a single logical link between switches S1 110 and S2 112 with a bandwidth equal to the sum of the bandwidths of the individual ISLs, i.e., 6 Gbps. It is also conceivable to have more than one logical link between two switches, where each logical link is composed of one or more trunks. The fabric 102 with multiple switches interconnected with ISLs can provide multiple paths with multiple bandwidths for devices to communicate with each other.
For example, referring to
The switch 200 can also include a router module 215 for routing incoming frames at input ports to appropriate output ports. The router can also include a load balancer circuit or module 219 that determines a single path from a selected set of multiple paths that a frame can take from the switch 200 to a destination switch in the fabric. The router module 215 can control the switch construct 213 such that frames are routed to the appropriate output interface.
The frame ID 305 is input to a hashing module 307. Hashing functions are well known in the art, and hashing module 307 can include any one of the well known hash functions, e.g., cryptographic hash functions, CRC, randomizing functions, etc. In
The computed hash value 309 can be fed to range comparators 310a-d. Each range comparator can represent a range that corresponds to a single path among multiple paths from the source switch to the destination switch. For example, range comparators 310a-d can represent multiple paths between switches S1 110 and S5 118 (
Additionally, if only a limited number of items are used to develop the frame ID 305, such as just the SID and DID, and if the majority of the traffic is between a limited number of source-destination pairs, in certain cases the hash values may differ only by small amounts. If evenly spaced ranges were used, this could result in an imbalanced load. With the flexibility to set the ranges, this situation can be resolved by narrowing the ranges until the loads are balanced to a desired degree.
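The effect of closely spaced hash values can be illustrated with a small count of frames per range. All values below are hypothetical: eight clustered hash values in a 16-bit hash space, one set of equal ranges, and one hand-picked set of narrowed ranges. The helper name loads_per_range is an assumption of this sketch.

```python
def loads_per_range(hash_values, ranges):
    """Count how many hash values fall into each inclusive range."""
    counts = [0] * len(ranges)
    for value in hash_values:
        for index, (low, high) in enumerate(ranges):
            if low <= value <= high:
                counts[index] += 1
                break
    return counts


# Eight hash values that differ only by small amounts: 1000, 1004, ..., 1028.
clustered = list(range(1000, 1032, 4))

# Equal quarters of a 16-bit hash space versus ranges narrowed around the cluster.
equal_ranges = [(0, 16383), (16384, 32767), (32768, 49151), (49152, 65535)]
narrowed_ranges = [(0, 1007), (1008, 1015), (1016, 1023), (1024, 65535)]

if __name__ == "__main__":
    print("equal:   ", loads_per_range(clustered, equal_ranges))
    print("narrowed:", loads_per_range(clustered, narrowed_ranges))
```

With the equal ranges, all eight values land in the first range; with the narrowed ranges, each range receives two of them.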
Because the ranges of range comparators 310a-d do not overlap, the hash value will fall within the range of only one range comparator. For example, assuming the exemplar ranges of the range comparators stated above, and assuming the exemplar hash value 309 is 400, range comparator 310a will be the only range comparator that will generate an “in-range” signal indicating that the hash value 309 is within its range. The path selection module 313 detects which one of the range comparators 310a-d has generated an in-range signal. The path selection module 313 maintains information matching each of the range comparators 310a-d outputs to its associated path. Therefore, if range comparator 310a generates an in-range signal, the path selection module 313 will determine the associated path, and send a signal to the switch construct 213 (
The load balancing module 219 shown in
Note that because the subtracters 353 and 355 are signed integer subtracters, the MSB of the output of each subtracter will be ‘1’ whenever the result of the subtraction is negative and ‘0’ whenever the result is zero or positive. Therefore, the NOR gate 361 of the range comparator 351 will output a ‘1’ if and only if the hash value 309 lies within the range L-H (including values L and H). This ‘1’ signal can represent an in-range signal that is fed to the path selection module 313. For all other range comparators, for which the hash value 309 lies outside the specified range, at least one of the inputs to the respective NOR gate will be a ‘1’, resulting in a ‘0’ NOR gate output.
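The sign-bit behavior of the NOR-based comparator can be checked with a small bit-level model. The sketch assumes 16-bit two's-complement subtracters and operands small enough (below 2**15) that the differences do not overflow; the width and the names msb_of_difference and nor_range_comparator are assumptions for illustration, with one subtracter computing H minus the hash value and the other computing the hash value minus L.

```python
WIDTH = 16                 # assumed subtracter width in bits
MASK = (1 << WIDTH) - 1


def msb_of_difference(a: int, b: int) -> int:
    """Sign bit of (a - b) in WIDTH-bit two's complement.

    Valid while 0 <= a, b < 2**(WIDTH - 1), so the result cannot overflow.
    """
    return ((a - b) & MASK) >> (WIDTH - 1)


def nor_range_comparator(hash_value: int, low: int, high: int) -> int:
    """Return 1 when low <= hash_value <= high, modeled with sign bits."""
    msb_upper = msb_of_difference(high, hash_value)  # 1 if hash_value > high
    msb_lower = msb_of_difference(hash_value, low)   # 1 if hash_value < low
    return int(not (msb_upper or msb_lower))         # NOR of the two sign bits


if __name__ == "__main__":
    for value in (99, 100, 150, 200, 201):
        print(value, nor_range_comparator(value, 100, 200))
```

For a range with L=100 and H=200, the model outputs 1 for 100, 150, and 200 and outputs 0 for 99 and 201, matching the inclusive behavior described above.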
In a second embodiment, the range comparator 351 can include an XOR gate instead of the NOR gate 361, and the subtracter 353 can subtract H from the hash value 309 (instead of subtracting the hash value 309 from H). In this embodiment, it should be recognized that if the hash value 309 is equal to the upper limit value H of a range, then the MSB values of both the subtracters will be ‘0’. For example, if the range for a comparator has been defined as L=100 and H=200, and the hash value 309 is 200, then both the subtracters (hash value − H, and hash value − L) will have the values of their MSBs equal to ‘0’. Therefore, the output of the XOR gate will also be ‘0’ despite the hash value being within the range of the range comparator. However, this can be easily mitigated by setting the upper limit one increment higher than 200, i.e., 201. This value can be equal to the L value of the subsequent range. For example, the ranges can be 0-101, 101-201, 201-301, and so on. This ensures that even if the hash value is equal to the H value of one range (e.g., 101), at least one range comparator (e.g., the one with range 101-201) will have its XOR gate output as ‘1’.
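The XOR-based variant can be modeled the same way to confirm that it behaves as a range that includes L but excludes H. As before, this is a sketch assuming 16-bit two's-complement subtracters and operands below 2**15; the names sign_bit and xor_range_comparator are hypothetical.

```python
WIDTH = 16                 # assumed subtracter width in bits
MASK = (1 << WIDTH) - 1


def sign_bit(a: int, b: int) -> int:
    """Sign bit of (a - b) in WIDTH-bit two's complement (no overflow
    while 0 <= a, b < 2**(WIDTH - 1))."""
    return ((a - b) & MASK) >> (WIDTH - 1)


def xor_range_comparator(hash_value: int, low: int, high: int) -> int:
    """Return 1 when low <= hash_value < high, using XOR of the sign bits
    of (hash_value - high) and (hash_value - low)."""
    return sign_bit(hash_value, high) ^ sign_bit(hash_value, low)


# Shared boundaries as in the text: each H equals the next range's L.
RANGES = [(0, 101), (101, 201), (201, 301)]

if __name__ == "__main__":
    for value in (100, 101, 200, 201):
        print(value, [xor_range_comparator(value, lo, hi) for lo, hi in RANGES])
```

With the ranges 0-101, 101-201, and 201-301, a hash value of 101 asserts only the 101-201 comparator, and every value from 0 through 300 asserts exactly one comparator.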
In the preferred embodiment there are sixteen range comparators, that value selected as a balance between logic size and the probable number of paths available. If fewer than sixteen range comparators are needed, the unneeded units can be disabled.
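A bank of sixteen comparators with unused units disabled might be modeled as below. The class name ComparatorBank, its methods, and the use of None to mark a disabled unit are assumptions made for this sketch; the original does not specify how disabling is implemented.

```python
NUM_COMPARATORS = 16  # number of comparator units in the preferred embodiment


class ComparatorBank:
    """Fixed-size bank of range comparators; None marks a disabled unit."""

    def __init__(self):
        self.slots = [None] * NUM_COMPARATORS

    def configure(self, ranges):
        """Load inclusive (low, high) ranges; leftover units are disabled."""
        if len(ranges) > NUM_COMPARATORS:
            raise ValueError("more paths than available comparators")
        self.slots = list(ranges) + [None] * (NUM_COMPARATORS - len(ranges))

    def signals(self, hash_value):
        """Return one boolean per unit; a disabled unit never asserts."""
        return [
            slot is not None and slot[0] <= hash_value <= slot[1]
            for slot in self.slots
        ]


if __name__ == "__main__":
    bank = ComparatorBank()
    bank.configure([(0, 99), (100, 199), (200, 299)])  # three paths in use
    print(bank.signals(150))
```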
Alternatively, the range comparators 310a-d and/or the path selection module 313 can be implemented as look up tables stored within memory 211 (
Although the router module 215 and the path selector 217 have been shown as separate modules, their functionality can be combined within a single entity. For example, the functionality of path selector 217 can be included in the router module 215. Furthermore, the path selector 217 and router module 215 can share functionality. For example, the load balancer can be part of the path selector 217 while the router module 215 uses the output of the load balancer to determine the appropriate output port for the frame.
Information regarding load balancing can be distributed to one or more switches within the fabric or network. For example, all of the switches S1-S7 can include identical load balancing information, such as the ranges of the range comparators. By distributing this information, intermediate switches can configure their routing tables in accordance with the paths specified. Thus, a path selected for a frame at the source switch can be consistently selected by every intermediate switch when the intermediate switch encounters the same frame.
The above description is illustrative and not restrictive. Many variations of the invention will become apparent to those skilled in the art upon review of this disclosure. The scope of the invention should therefore be determined not with reference to the above description, but instead with reference to the appended claims along with their full scope of equivalents.