The present invention relates generally to the data processing field, and more particularly, relates to a method, apparatus and computer program product for implementing InfiniBand (IB) network topology simplification.
Input/output (I/O) networks, such as system buses, can be used for the processor of a computer to communicate with peripherals such as network adapters. However, constraints in the architectures of common I/O networks, such as the Peripheral Component Interconnect (PCI) bus, limit the overall performance of computers. As a result, new types of I/O networks have been introduced.
One new type of I/O network is known and referred to as the InfiniBand (IB) network. The InfiniBand network replaces the PCI or other bus currently found in computers with a packet-switched network, complete with zero or more routers. A host channel adapter (HCA) couples the processor to a subnet, and target channel adapters (TCAs) couple the peripherals to the subnet. The subnet typically includes at least one switch, and links that connect the HCA and the TCAs to the switches. For example, a simple InfiniBand network may have one switch, to which the HCA and the TCAs connect through links.
For an InfiniBand (IB) subnet, the Subnet Manager (SM) is responsible for initial discovery and configuration of the subnet. Tightly coupled with the SM is another InfiniBand component known as the Subnet Administrator (SA). The SA provides services to members of the subnet including access to configuration and routing information determined by the SM.
The capabilities of the SM and SA can be sophisticated: the SM and SA resolve all potential paths from all nodes with deadlock avoidance, the SM and SA support many optional features of the InfiniBand Architecture (IBA), the SM and SA provide quality of service (QOS) support, and the like.
Alternatively, capabilities of the SM and SA may be simplistic: the SM and SA only resolve simple shortest paths between nodes, only implement mandatory IBA functions, and provide no QOS support.
In an open heterogeneous environment with multiple vendors attached to the same subnet with little or no restriction on which vendors participate, or in a closed homogeneous environment restricted to a limited, controlled number of vendors, there is often a need to support SMs and SAs from different vendors with different levels of sophistication. In order to support a wide variety of SM and SA capabilities, a subnet configuration should present to the SM and SA a simple or trivial subnet configuration.
Some hardware implementations by their nature create a nontrivial subnet. This is often because of requirements to reduce the number of external cables in a subnet, to preserve legacy implementations and existing software/firmware support, to provide additional fan-out behind a switch, to provide additional RAS capability, and the like.
One pervasive RAS requirement for the enterprise computing space is the requirement to provide redundant independent paths from one node in a fabric to another node to allow failover from one path to another. In addition, it is generally expected the failover will be fast and nondisruptive to the upper layers of a system.
Fast nondisruptive failover is provided by InfiniBand through a capability known as Auto Path Migration (APM). Because of hardware requirements for features such as fast nondisruptive failover with redundant independent paths, often provided in combination with other requirements listed above, the SM and SA must provide advanced and optional features and potentially require application specific customization. Hardware implementations that create nontrivial subnets, and therefore require a sophisticated, potentially customized SM and SA, significantly reduce their market opportunities.
A need exists for an effective mechanism for implementing InfiniBand (IB) network topology simplification.
Principal aspects of the present invention are to provide a method, apparatus and computer program product for implementing InfiniBand (IB) network topology simplification. Other important aspects of the present invention are to provide such method, apparatus and computer program product for implementing InfiniBand (IB) network topology simplification substantially without negative effect and that overcome many of the disadvantages of prior art arrangements.
In brief, a method, apparatus and computer program product are provided for implementing InfiniBand (IB) network topology simplification. A Subnet Manager (SM) of an IB subnet sends a subnet discovery request to a switch requesting the number of ports that are attached to the switch. Each of the switches and target channel adapters (TCAs) includes a Subnet Management Agent (SMA). The receiving switch Subnet Management Agent (SMA) responds to the SM indicating a sufficient number of ports on the switch to support at least one port for each TCA within the subnet. Each TCA supports at least two local IDs (LIDs).
In accordance with features of the invention, the SM assigns at least two local IDs (LIDs) to each TCA. The SMA updates physical TCA hardware with the assigned LIDs for the TCA ports.
The present invention together with the above and other objects and advantages may best be understood from the following detailed description of the preferred embodiments of the invention illustrated in the drawings, wherein:
In an InfiniBand (IB) subnet, a Subnet Manager (SM) is responsible for initial discovery and configuration of the subnet. Another InfiniBand component is known as the Subnet Administrator (SA) that provides services to members of the subnet including access to configuration and routing information determined by the SM. As used in the following specification and claims, the term Subnet Manager (SM) should be understood to include the Subnet Administrator (SA).
In accordance with features of the preferred embodiments, methods are provided for implementing InfiniBand (IB) network topology simplification. This invention takes what would be a complex subnet and presents it to the SM as a simple subnet.
Having reference now to the drawings, in
IB subnet 200 includes a host channel adapter (HCA A) 202 with a pair of IB ports W, X, 204, an external switch (switch B) 206, and a pair of IB ports Y, Z, 208, a plurality of embedded switches (switches C, D, E) 210 and a plurality of target channel adapters (TCAs F, G, H) 212 within an enclosure or drawer I, 214. The host channel adapter (HCA A) 202 couples a processor (not shown) to the IB subnet 200. The target channel adapters, (TCAs F, G, H) 212 within the drawer I, 214, couple peripherals (not shown) to the IB subnet 200.
It should be understood that the present invention is not limited to the switches and TCAs arranged within an enclosure as shown in accordance with the preferred embodiment; various other implementations are possible where the SMAs for the switches and TCAs are able to coordinate the processing of SM subnet discovery and configuration requests.
A first pair of point-to-point links, LINK 1, LINK 2 connects respective IB ports W, X 204 with the external switch B, 206. A second pair of point-to-point links, LINK 3, LINK 4 connects respective IB ports Y, Z, 208 with the external switch B, 206. Each of the embedded switches C, D, E, 210 is at least a three port switch.
Each of the switches C, D, E, 210, and TCAs F, G, H, 212 within the drawer includes a Subnet Management Agent (SMA) arranged for implementing InfiniBand (IB) network topology simplification in accordance with the preferred embodiment.
Redundant independent paths are needed within IB subnet 200. For example, with the configuration of IB subnet 200 as shown in
A significant problem with this configuration typically results because a simple SM will only configure the shortest paths between two node ports. For the configuration in
In accordance with features of the preferred embodiments, key elements include the following:
The SMA components for the nodes in the drawer, that is, the switches and TCAs, coordinate their responses to the SM in order to present a representation of the drawer topology that is different from what is physically inside the drawer.
The simple switches in Drawer I, 214, such as the illustrated Switches C, D, E, 210 in
The TCAs must support at least two LIDs.
In
Referring to
Referring now to
Next as indicated in a block 406, the SM assigns LIDs to the TCA ports attached to Switch C, 210. Then the SMAs coordinate and update the physical TCA hardware with the appropriate LIDs as indicated in a block 408. As a result the physical routing works even though the actual physical hardware does not match the SM's view of the subnet topology.
With the appropriate LIDs assigned, when a packet arrives at switch C, 210 for LID 300 in
Then the exemplary steps are repeated when the next switch SMA is identified as indicated in a decision block 410. After the SM performs subnet discovery for each switch SMA, then the sequential operations return as indicated in a block 412.
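The discovery and configuration steps described above can be illustrated with a minimal Python sketch. It is not the implementation of the preferred embodiment. The class and function names (`Tca`, `SwitchSma`, `discover`), the LID numbering starting at 1, the single uplink port, and the choice to present two switches to the SM so that each TCA ends up with two LIDs are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Tca:
    name: str
    # LIDs written into the physical TCA hardware (hypothetical model).
    lids: list = field(default_factory=list)


class SwitchSma:
    """Hypothetical SMA for one switch as it is presented to the SM."""

    def __init__(self, name, drawer_tcas, uplink_ports=1):
        self.name = name
        self.drawer_tcas = drawer_tcas
        self.uplink_ports = uplink_ports  # assumed: one port toward the SM

    def report_port_count(self):
        # Report enough ports for one port per TCA in the drawer,
        # regardless of how many ports the physical switch has.
        return self.uplink_ports + len(self.drawer_tcas)

    def apply_lids(self, lids):
        # Coordinate with the TCA SMAs: record each assigned LID on its TCA.
        for tca, lid in zip(self.drawer_tcas, lids):
            tca.lids.append(lid)


def discover(switch_smas, first_lid=1):
    """Simple SM loop: query each switch SMA, then assign one LID per TCA port."""
    next_lid = count(first_lid)
    view = {}
    for sma in switch_smas:
        ports = sma.report_port_count()
        tca_ports = ports - sma.uplink_ports
        lids = [next(next_lid) for _ in range(tca_ports)]
        sma.apply_lids(lids)
        view[sma.name] = {"ports": ports, "tca_lids": lids}
    return view


tcas = [Tca("F"), Tca("G"), Tca("H")]
# Assumed: the drawer is presented as two switches, one per redundant path.
smas = [SwitchSma("view-1", tcas), SwitchSma("view-2", tcas)]
view = discover(smas)
```

In this sketch the SM only ever sees a switch with TCAs attached directly to it, while the SMAs are responsible for mapping the assigned LIDs onto the cascaded hardware.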
Referring now to
If the switch in the drawer is an InfiniBand Architecture compliant switch with linear forwarding table support, or any very simple switch, the switch checks a packet received on a port against the two Local IDs (LIDs) assigned by the SM to the TCA directly attached to the switch as indicated in a decision block 504. If a match is found with one of the TCA's LIDs, the switch routes the packet to the TCA as indicated in a block 506. If the packet LID does not match one of the TCA's LIDs, the packet is sent out the other switch port to the next switch as indicated in a block 508. After the packet is routed to the TCA at block 506, or sent out the other switch port at block 508, then the sequential operations return as indicated in a block 510.
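The routing decision of blocks 504, 506 and 508 can be sketched in a few lines of Python. This is an illustrative model only: the names `Action`, `forward` and `deliver`, the LID values, and the assumption that each of switches C, D and E has exactly one directly attached TCA are hypothetical and are not taken from the drawings.

```python
from enum import Enum


class Action(Enum):
    TO_TCA = "to_tca"                   # block 506
    TO_NEXT_SWITCH = "to_next_switch"   # block 508


def forward(dlid, attached_tca_lids):
    """Decision block 504: compare the packet's destination LID
    with the LIDs assigned to the directly attached TCA."""
    if dlid in attached_tca_lids:
        return Action.TO_TCA
    return Action.TO_NEXT_SWITCH


def deliver(dlid, chain):
    """Walk a cascade of (switch_name, tca_lids) pairs and return the name
    of the switch that hands the packet to its TCA, or None if none match."""
    for name, tca_lids in chain:
        if forward(dlid, tca_lids) is Action.TO_TCA:
            return name
    return None


# Hypothetical LID assignment: two LIDs per TCA, one TCA per switch.
chain = [("C", {1, 4}), ("D", {2, 5}), ("E", {3, 6})]
delivered_by = deliver(5, chain)
```

Because each switch only compares against the LIDs of its own TCA, no switch in the cascade needs knowledge of the rest of the drawer, which is what allows a very simple switch to be used.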
In brief, a significant advantage of the method of the invention is that a very simple switch can be embedded within a TCA chip and multiple TCA chips can be cascaded in a drawer, requiring fewer physical cables and fewer expensive external switches, without overly complicating the SM's view of the subnet while maintaining architecture compliance. This ability to manipulate the view presented to the SM allows for greater flexibility in hardware designs, permitting optimizations in performance and reliability without complicating the topology as viewed by the SM.
Referring now to
A sequence of program instructions or a logical assembly of one or more interrelated modules defined by the recorded program means 604, 606, 608, 610, directs the IB subnet 200 for implementing InfiniBand (IB) network topology simplification of the preferred embodiment.
While the present invention has been described with reference to the details of the embodiments of the invention shown in the drawing, these details are not intended to limit the scope of the invention as claimed in the appended claims.