The invention relates to connectors for cables, to cables including such connectors and to systems having cable interconnects.
In a system having cable interconnects, the identification of local and remote cable connectivity is important for configuration and diagnosis of the cable-based interconnect system. This can include the determination of whether a cable is connected locally, whether it is connected at the remote end or ends (remote end for a single cable or remote ends for a split- or multi-link cable) and where the remote end(s) is (are) connected.
For example, with standard InfiniBand cables, the above issues can only be resolved if the link is able to train, and thereby to allow in-band packet traffic between the associated end-points. Hence, if the link is not able to train, then all the states associated with all the above aspects are in principle un-defined.
In order to address this, various techniques for local cable/connector presence detection, cable connectors with electrically readable FRUID information (i.e., serial-number etc.), combined with side-band and/or out-of-band communication can be implemented. However, even if these techniques are applied, it is still an issue that establishing the relevant state and connectivity information requires an active, “intelligent” entity (e.g., some kind of basic service processor with relevant firmware) associated with all the end points to which the cable is connected. Hence, inherently, this also implies that the end-points must be operating in at least a minimal power mode.
In some cases, it may not be possible to include an intelligent entity in the end-point design (e.g., a line-card implementation with no “side-band” access from any chassis/system service processor to the cable-connectors on the line-card, or an un-intelligent repeater module used to connect two individual cables together). In such cases, none of the desired information would be available until the link(s) associated with the cable connectors had been made operational and/or it might not be possible to determine the complete physical connectivity information.
Accordingly, the invention has been made, at least in part, in consideration of problems and drawbacks of conventional systems.
An embodiment of the invention can provide a cable connector for attaching a cable, the cable connector comprising a storage device operable to store an identifier that identifies a cable end-point. An embodiment of the invention can also provide a cable comprising such a connector at a first end thereof. An embodiment of the invention can also provide a computer system comprising a plurality of system components that have component connectors, and at least one such cable that interconnects system components. The identifier can be a field replaceable unit (FRU) identifier (FRU-ID) that can uniquely identify the cable end-point.
In an embodiment of the invention, in order to monitor connectivity status associated with an interconnect cable from the end-points to which either end of the cable is attached, a field replaceable unit identifier that uniquely identifies a cable end-point and is stored in such a storage device can be accessed to determine the connectivity status.
Although various aspects of the invention are set out in the accompanying independent and dependent claims, other aspects of the invention include any combination of features from the described embodiments and/or the accompanying dependent claims, possibly with the features of the independent claims, and not solely the combinations explicitly set out in the accompanying claims.
Specific embodiments are described by way of example only with reference to the accompanying Figures in which:
While the invention is susceptible to various modifications and alternative forms, specific embodiments are shown by way of example in the drawings and are herein described in detail. It should be understood, however, that drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the invention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present invention.
An example embodiment of a 3456-port InfiniBand 4×DDR switch in a custom rack chassis is described, with the switch architecture being based upon a 5-stage CLOS fabric. The rack chassis can form a switch enclosure.
The CLOS network, first described by Charles Clos in 1953, is a multi-stage fabric built from smaller individual switch elements that provides full bisectional bandwidth for all end points, assuming effective dispersive routing.
Given that an external connection (copper or fiber) costs several times more per port than the silicon cost, the key to making large CLOS networks practical is to minimize the number of external cables required and to maximize the number of internal interconnections. This reduces the cost and increases the reliability. For example, a 5-stage fabric constructed with switching elements of size (n) ports supports (n*n/2*n/2) edge points, using (5*n/2*n/2) switch elements with a total of (3*n*n/2*n/2) connections. The ratio of total to external connections is 5:1, i.e. 80% of all connections can be kept internal. The switch elements (switch chips) in the described example can be implemented using a device with 24 4×DDR ports.
An example switch uses a connector that supports three 4× ports per connector, which can further minimize the number of cables needed. This can provide a further 3:1 reduction in the number of cables. In a described example, only 1152 cables (1/3*n*n/2*n/2) are required.
In contrast, if prior commercially available 288-port switches and 24-port switches were used to create a 3456-port fabric, a total of 6912 cables (2*n*n/2*n/2) would be required.
The example switch can provide a single chassis that can implement a 5-stage CLOS fabric with 3456 4×DDR ports. High density external interfaces can be provided, including fiber, shielded copper, and twisted pair copper. The amount of cabling can be reduced by 84.4% when compared to building a 3456-port fabric with commercially available 24-port and 288-port switches. In the present example, an orthogonal midplane design can be provided that is capable of DDR data rates.
An example switch can address a full range of HPC cluster computing from a few hundred to many thousands of nodes with a reliable and cost-effective solution that uses fewer chassis and cables than prior solutions.
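The sizing expressions quoted above can be checked with a short calculation. The following Python sketch is illustrative only: the function name `clos5_sizing` and its dictionary keys are hypothetical, and the code simply evaluates the expressions given in the text for a chosen switch element size n.

```python
def clos5_sizing(n: int, links_per_cable: int = 3) -> dict:
    """Evaluate the 5-stage Clos sizing expressions from the text.

    n is the port count of one switch element. links_per_cable is the
    number of 4x links carried by one cable connector (3 in the example).
    """
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even number")
    half = n // 2
    edge_points = n * half * half            # n*n/2*n/2
    switch_elements = 5 * half * half        # 5*n/2*n/2
    total_connections = 3 * edge_points      # 3*n*n/2*n/2
    # One cable per group of links_per_cable edge points, rounded up.
    cables_single_chassis = -(-edge_points // links_per_cable)
    # Cable count quoted for a fabric built from separate switches.
    cables_separate_switches = 2 * edge_points   # 2*n*n/2*n/2
    return {
        "edge_points": edge_points,
        "switch_elements": switch_elements,
        "total_connections": total_connections,
        "cables_single_chassis": cables_single_chassis,
        "cables_separate_switches": cables_separate_switches,
    }


if __name__ == "__main__":
    for key, value in clos5_sizing(24).items():
        print(f"{key}: {value}")
```

With n = 24 the sketch yields 3456 edge points, 720 switch elements, 1152 cables for the single chassis and 6912 cables for the fabric built from separate switches.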
In the present example, up to 18 fabric cards (FC0 to FC17) 12,
In the present example, up to 24 line cards (LC0 to LC23) 14,
Up to 16 hot-pluggable power supply units (PS0-PS15) 16,
Two hot-pluggable Chassis Management Controllers (CMCs) 18,
The power distribution board is a passive power distribution board that supports up to 16 power supply unit DC connectors and 2 chassis management controller slot connectors. The power distribution board connects to the midplane through ribbon cables that carry low-speed signals.
In the present example, up to 144 fan modules (Fan#0-Fan#143) 20 are provided, with 8 fan modules per fabric card 12 in the present instance. Cooling airflow is controlled to be from the front to the rear, using redundant fans on the fabric cards to pull the air from the line cards 14 through openings (not shown in
The midplane 30 is represented schematically to show an array of midplane connector pairs 32 as black squares with ventilation openings shown as white rectangles. Each midplane connector pair 32 comprises a pair of connectors (to be explained in more detail later) with one connector on a first face of the midplane and a second connector on the other face of the midplane, the first and second connectors being electrically interconnected by way of pass-through vias (not shown in
In an example described herein, each of the first connectors of the respective midplane connector pairs 32 of a column 31 of midplane connector pairs 32 can be connected to one fabric card 12. This can be repeated column by column for successive fabric cards 12. In an example described herein, each of the second connectors of the respective midplane connector pairs 32 of a row 33 of midplane connector pairs 32 can be connected to one line card 14. This can be repeated row by row for successive line cards 14. As a result, the midplane can be populated by vertically oriented fabric cards 12 on the first side of the midplane and horizontally oriented line cards 14 on the second side of the midplane 30.
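The row and column arrangement described above can be modelled as a simple grid. The following Python sketch is a minimal illustration that assumes a fully populated chassis with 24 line cards (rows) and 18 fabric cards (columns); the function name `midplane_pairs` and the labels `LC<n>` and `FC<n>` are used here only for illustration.

```python
from itertools import product

NUM_LINE_CARDS = 24    # one row of midplane connector pairs per line card
NUM_FABRIC_CARDS = 18  # one column of midplane connector pairs per fabric card


def midplane_pairs() -> dict:
    """Map each (row, column) connector pair to the two cards it joins."""
    return {
        (row, col): (f"LC{row}", f"FC{col}")
        for row, col in product(range(NUM_LINE_CARDS), range(NUM_FABRIC_CARDS))
    }


if __name__ == "__main__":
    pairs = midplane_pairs()
    print(len(pairs), "connector pairs")
    print("pair at row 5, column 7 joins", pairs[(5, 7)])
```

In this model every line card meets every fabric card at exactly one connector pair, which is the property that lets signals pass straight through the midplane.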
In the present example the midplane 30 provides orthogonal connectivity between fabric cards 12 and the line cards 14 using orthogonal connector pairs. Each orthogonal connector pair provides 64 differential signal pairs, which is sufficient to carry the high-speed signals needed as well as a number of low-speed signals. The orthogonal connector pairs are not shown in
The midplane 30 is also configured to provide 3.3 VDC standby power distribution to all cards and to provide I2C/System Management Bus connections for all fabric cards 12 and line cards 14.
Another function of the midplane 30 is to provide thermal openings for a front-to-rear airflow. The white holes in
The fabric cards 12 each support 24 connectors and the line cards 14 each support 18 connectors.
As previously mentioned, a 5-stage Clos fabric has a size n*n/2*n/2 in which n is the size of the switch element. The example switch element in
There are 18 midplane connectors 32 per line card 14. Each midplane connector 32 provides one physical connection to one fabric card 12. Each midplane connector 32 can accommodate 8 4× links (there are 8 differential pairs per 4× link and a total of 64 differential pairs provided by the orthogonal connector).
12 ports of each of the switch chips 35 in the second row 38 of the line card 14 are connected to 2 line card connectors 40 that are used to connect the line card 14 to the midplane connectors 32 and thereby with the fabric cards 12 through the orthogonally oriented midplane connector pair. Of the 12 ports per switch chip 35, eight ports are connected to one line card connector 40, and the remaining four ports are connected to another line card connector 40 as represented by the numbers 8 and 4 adjacent the two left hand switch chips 35 in the second row 38. 2 switch chips are thereby connected to a group of 3 line card connectors 40 and hence to a group of three midplane connector pairs 32.
The remaining 12 ports of each switch chip 35 in the second row 38 of the line card 14 are connected to each of the 12 switch chips 35 in the first row 36 of the line card 14.
At the fabric card 12 all links through an orthogonally oriented midplane connector pair 32 are connected to one line card 14. A single orthogonal connector 46 carries 8 links. These links are connected to one switch element 44 each at the fabric card 12.
Also shown in
There has been described a system with 24 line cards with 144 ports each, realized through 48 physical cable connectors that each carry 3 links. The switch fabric structure of each line card 14 is fully connected, so the line card 14 itself can be viewed as a fully non-blocking 144-port switch. In addition, each line card 14 has 144 links that are connected to 18 fabric cards. The 18 fabric cards then connect all the line cards 14 together in a 5-stage non-blocking Clos topology.
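The port and link counts given for the line cards and fabric cards can be cross-checked arithmetically. The following Python sketch is illustrative: the constant names and the function `link_budget` are hypothetical, and the count of 12 second-row switch chips per line card is inferred from the 24 switch chips per line card and the 12 first-row chips mentioned in the description.

```python
LINE_CARDS = 24
FABRIC_CARDS = 18
CABLE_CONNECTORS_PER_LINE_CARD = 48
LINKS_PER_CABLE_CONNECTOR = 3
MIDPLANE_CONNECTORS_PER_LINE_CARD = 18
MIDPLANE_CONNECTORS_PER_FABRIC_CARD = 24
LINKS_PER_MIDPLANE_CONNECTOR = 8
SECOND_ROW_CHIPS_PER_LINE_CARD = 12      # inferred: 24 chips, 12 in the first row
MIDPLANE_PORTS_PER_SECOND_ROW_CHIP = 12  # split 8 + 4 over two connectors


def link_budget() -> dict:
    """Return the link counts implied by the figures in the description."""
    external = CABLE_CONNECTORS_PER_LINE_CARD * LINKS_PER_CABLE_CONNECTOR
    to_midplane = MIDPLANE_CONNECTORS_PER_LINE_CARD * LINKS_PER_MIDPLANE_CONNECTOR
    chip_uplinks = SECOND_ROW_CHIPS_PER_LINE_CARD * MIDPLANE_PORTS_PER_SECOND_ROW_CHIP
    per_fabric_card = MIDPLANE_CONNECTORS_PER_FABRIC_CARD * LINKS_PER_MIDPLANE_CONNECTOR
    return {
        "external_ports_per_line_card": external,
        "midplane_links_per_line_card": to_midplane,
        "second_row_chip_uplinks": chip_uplinks,
        "links_per_fabric_card": per_fabric_card,
        "total_external_ports": LINE_CARDS * external,
        "total_line_card_side_links": LINE_CARDS * to_midplane,
        "total_fabric_card_side_links": FABRIC_CARDS * per_fabric_card,
    }


if __name__ == "__main__":
    for name, count in link_budget().items():
        print(f"{name}: {count}")
```

The external ports per line card, the midplane links per line card and the second-row uplinks all come to 144, and the line card side and fabric card side totals agree at 3456 links.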
In the present example the midplane 30 is a passive printed circuit board that has dimensions of 1066.8 mm (42″)×908.05 mm (35.75″)×7.1 mm (0.280″). The active area is 40″×34″. 864 8×8 midplane connectors (432 midplane connectors per side) are provided. There is a ribbon cable connection to the power distribution board 22 and a 3.3V standby copper bar to the power distribution board 22.
In the present example a fabric card 12 comprises a printed circuit board with dimensions of 254 mm (10″)×1016 mm (40″)×4.5 mm (0.177″). It comprises 24 8×8 fabric card connectors 46, one power connector 39, 8 fan module connectors and 8 switch chips 44.
In the present example a line card 14 comprises a printed circuit board with dimensions of 317.5 mm (12.5″)×965.2 mm (38″)×4.5 mm (0.177″). It comprises 24 stacked cable 168-circuit connectors 42, 18 8×8 card connectors 40, 1 busbar connector and 24 switch chips 35.
In the present example a power distribution board 22 comprises a printed circuit board, 16 power supply DC connectors, 14 6×6 card connectors (7 connectors per chassis management card 18), ribbon cable connectors for low-speed connectivity to the midplane 30, and a 3.3V standby copper bar to the midplane 30.
In the present example a chassis management card 18 comprises 14 6×6 card connectors (7 connectors per chassis management card), two RJ45 connectors for Ethernet available on a chassis management card panel, two RJ45 connectors for serial available at the chassis management card panel, three RJ45 connectors for line card/fabric card debug console access at the chassis management card panel, three HEX rotary switches used to select which line card/fabric card debug console is connected to the three RJ45 connectors above, and a 220-pin connector for the mezzanine.
In the present example a mezzanine has dimensions of 92.0 mm×50.8 mm and comprises 4 mounting screw holes with either a 5 mm or an 8 mm standoff from the chassis management card board, and a 220-pin connector for connectivity to the chassis management board.
It will be noted that the second connector 64 of the midplane connector pair 32 is rotated through substantially 90 degrees with respect to the first connector 62. The first connector 62 is configured to connect to a corresponding fabric card connector 46 of a fabric card 12. The second connector 64 is configured to connect to a corresponding line card connector 40 of a line card 14. Through the orientation of the second connector 64 of the midplane connector pair 32 substantially orthogonally to the orientation of the first connector 62, it can be seen that the line card 14 is mounted substantially orthogonally to the fabric card 12. In the present example the line card 14 is mounted substantially horizontally and the fabric card 12 is mounted substantially vertically.
Each of the contact pins on the connector 62 is electrically connectable to a corresponding contact of the fabric card connector 46. Each of the contact pins on the connector 64 is electrically connectable to a corresponding contact of the line card connector 40. The connector pins of the respective connectors 62 and 64 are connected by means of pass-through vias in the midplane 30 as will now be described in more detail.
As can be seen in
By comparing
The first midplane connector 62 (fabric card side connector) of the midplane connector pair 32 has substantially the same form as the second midplane connector 64 of the midplane connector pair 32, except that it is oriented at substantially 90 degrees to the second midplane connector 64. In this example the first midplane connector 62 comprises a substantially U-shaped support frame 75 including a substantially planar base and first and second substantially planar walls that extend at substantially 90 degrees from the base. The inside edges of the first and second substantially planar walls are provided with ridges and grooves that provide guides for the fabric card connector 46. The fabric card connector 46 has the same basic structure as that of the line card connector 40 in the present instance. Thus, in the same way as for the line card connector, each of a plurality of contact planes of the fabric card connector 46 can be entered into a respective one of the grooves so that connectors of the fabric card connector 46 can then engage with contact pins of the first connector 62. The orientation of the first connector 62 and the grooves therein means that the fabric card 12 is supported in a substantially vertical orientation.
In the example illustrated in
As mentioned above, the contact pins of the first and second midplane connectors 62 and 64 of a midplane connector pair 32 are connected by means of pass through vias in the midplane.
In use, the other midplane connector (e.g., the first midplane connector 62) of the midplane connector pair would be inserted into the pass through vias in the other side of the midplane 30 in the orthogonal orientation as discussed previously.
The examples of the midplane connectors described with reference to
It will be appreciated that in other examples the first and second midplane connectors could have different shapes and/or configurations appropriate for the connections for the cards to be connected thereto.
The array of midplane connector pairs 32 as described above provides outstanding performance in excess of 10 Gbps over a conventional FR4 midplane because the orthogonal connector arrangements allow signals to pass directly from the line card to the fabric card without requiring any signal traces on the midplane itself. The orthogonal arrangements of the cards that can result from the use of the array of orthogonally arranged connector pairs also avoid the problem of needing to route a large number of signals on the midplane to interconnect line and fabric cards, minimizing the number of layers required. This provides a major simplification compared to existing fabric switches. Thus, by providing an array of such orthogonal connectors, each of a set of horizontally arranged line cards 14 can be connected to each of a set of vertically aligned fabric cards 12 without needing intermediate wiring.
The air inlet is via perforations at the line card 14 front panel. Fans 20 at the fabric cards 12 pull air across the line cards, through the openings 34 in the vertical midplane 30 and across the fabric cards 12.
Line card cooling is naturally redundant since the fabric cards are oriented orthogonally to the line cards. In other words, the cooling air over each line card results from the combined effect of the fans of all the fabric cards arranged along that line card, owing to the orthogonal alignment. In the case that a fabric card fails or is removed, a portion of the cooling capacity is lost. However, as the cooling is naturally redundant, the line cards will continue to operate and be cooled by the remaining fabric cards. Each fan is internally redundant and the fans on the fabric cards 12 can be individually hot swappable without removing the fabric card 12 itself. The fabric card 12 and line card 14 slots can be provided with blockers to inhibit reverse airflow when a card is removed. Empty line card 14 and fabric card 12 slots can be loaded with filler panels that prevent air bypass.
Each power supply has an internal fan that provides cooling for that power supply. Fans at the power supplies pull air through chassis perforations at the rear, across the chassis management cards 18, and through the power supply units 16. Chassis management card cooling is naturally redundant as multiple power supply units cool a single chassis management card.
It will be appreciated that changes and modifications to the above described examples are possible. For example, although in the present example cooling is provided by drawing air from the front to the rear, in another example cooling could be from the rear to the front.
Also, although in the above described examples the fabric cards and the line cards are described as being orthogonal to each other, they do not need to be exactly orthogonal to each other. Indeed, in an alternative example they could simply be angled with respect to each other without being exactly orthogonal.
Also, although in the above described examples the midplane connector pairs 32 are configured as first and second connectors 62 and 64, in another example they could be configured as a single connector that is assembled in the midplane. For example, through connectors could be provided that extend through the midplane vias. The through connectors could be manufactured to be integral with a first connector frame (e.g., a U-shaped frame or a box-shaped frame as in
An example cable-based switch chassis can provide a very large single switch chassis having, for example, one or more of the following advantages, namely a 3456-port non-blocking Clos (or Fat Tree) fabric, a 110 Terabit/sec bandwidth, major improvements in reliability, a 6:1 reduction in interconnect cables versus leaf and core switches, a new connector with superior mechanical design, major improvement in manageability, and a single centralized switch with known topology that provides a 300:1 reduction in entities that need to be managed.
In such a cable-based interconnect system, the identification of local and remote cable connectivity is important for configuration and diagnosis of the cable-based interconnect system. This can include the determination of whether a cable is connected locally, whether it is connected at the remote end or ends (remote end for a single cable or remote ends for a split- or multi-link cable) and where the remote end(s) is (are) connected.
In an example embodiment of the invention, in order to avoid the need for any active and intelligent entity on the remote side of a cable, access can be provided to information storage on the remote side via a low-power side-band channel associated with the cable. In one example, in which a side-band channel implements an I2C type link, dedicated I2C based storage devices can be implemented with each cable connector, and then these devices can be accessed from the remote end of the cable. The I2C based storage devices are arranged to contain relevant serial number (etc.) information that can allow unique identification of the end-point. Such an example can provide an asymmetric scheme, whereby one side represents an active, intelligent entity whereas the other side represents only the (un-intelligent) storage device. In operation of this example, the “un-intelligent” side of the connection is set in a power state that allows access to the I2C devices. Hence, if no power is available at the un-intelligent end of the cable, no information is available.
In another example, in order to overcome the remote power issue, power for the I2C device can be supplied over the cable from the active side. In this case, the I2C device is implemented in a way that allows it to record more state about the local side. For example, a local power mode (e.g., “off”, “aux” and “full”) can be provided. This information can be determined using a PLD with status input reflecting the corresponding dynamic state in addition to the static (e.g., factory defined) serial number info.
In one example, to provide for both sides to be active, dual-master support can be provided on the I2C link. In addition, each side can have control logic that in the default (power-off) mode operates as above, but in addition indicates whether “active” is supported.
In such an example, either side can be active, and each side can determine if the remote end of the cable is connected, determine the ID of the other end, and also determine the state of the other end (including if the other end is in a mode where link training can be expected). Also, if “active” is indicated and the remote side is in at least an “aux” state, then more elaborate duplex communication can take place via the I2C link (e.g., using IPMI type messaging protocols).
In such an example where either side can be active (an active-active scheme), both sides can detect if the other end supplies power, and determine to what extent the local logic is powered locally or from remote.
In a situation where a remote power supply is detected but repeated attempts to read the remote state information fail, this should be considered a special case of failure that indicates broken interface logic at the remote end.
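The decisions described for the side-band scheme can be summarized as a small classification routine. The following Python sketch is a minimal illustration and not a definitive implementation: the names `PowerMode`, `RemoteRecord`, `Assessment` and `assess_remote_end`, the retry count, and the `read_remote` callable that stands in for an I2C read over the cable are all assumptions introduced here.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum
from typing import Callable, Optional


class PowerMode(IntEnum):
    OFF = 0
    AUX = 1
    FULL = 2


class RemoteStatus(Enum):
    NO_INFORMATION = "no information available from the remote end"
    IDENTIFIED = "remote end identified"
    BROKEN_INTERFACE = "remote power detected but state unreadable"


@dataclass(frozen=True)
class RemoteRecord:
    end_point_id: str        # e.g. a serial number held in the storage device
    power_mode: PowerMode    # power mode recorded for the remote side
    active_supported: bool   # whether the remote side indicates "active"


@dataclass(frozen=True)
class Assessment:
    status: RemoteStatus
    record: Optional[RemoteRecord] = None
    duplex_messaging_possible: bool = False


def assess_remote_end(
    read_remote: Callable[[], Optional[RemoteRecord]],
    remote_power_detected: bool,
    max_attempts: int = 3,   # hypothetical retry count
) -> Assessment:
    """Classify the remote cable end from side-band reads."""
    for _ in range(max_attempts):
        record = read_remote()   # returns None when the read fails
        if record is not None:
            # Richer duplex messaging needs "active" support and at least "aux".
            duplex = record.active_supported and record.power_mode >= PowerMode.AUX
            return Assessment(RemoteStatus.IDENTIFIED, record, duplex)
    if remote_power_detected:
        # Power is seen from the remote side, yet every read failed.
        return Assessment(RemoteStatus.BROKEN_INTERFACE)
    return Assessment(RemoteStatus.NO_INFORMATION)
```

A caller would supply its own `read_remote` function. In this sketch a remote end that supplies power but never answers is reported as `BROKEN_INTERFACE`, matching the special failure case described above.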
Supplying power to a remote board that also has its own internal power supply system can increase the complexity of the board design, and can also increase cost. In view of this, in an example embodiment, powering of storage devices in a remote board is not supported and only powering of a remote cable connector is supported. If, in this example, writeable “FRU info” storage is provided in a cable connector, it is possible dynamically to record information about an attached chassis in the cable connector and to have this be persistent and accessible to the remote side even if the local side is no longer operational. In other words, in this example, the cable connector can contain information about the time and identity of the last connectivity to an operational chassis. In this way, it is possible to verify connectivity where not all involved chassis instances are powered up and operational.
In the chassis component 420, a service processor 424 and field replaceable unit identifier (FRUID) storage 426 are selectively connected to the management signal line connection of the connector logic 434 via a switch 430 (e.g., a MOSFET) whereby the service processor 424 and the logic 434 have selective access to the FRUID information. As described above, in order to address situations where the chassis component may or may not be powered from the local power source 428, the FRUID can receive power either from the local power source when this is active, or from the power line 414 of the cable 406 through a switch (e.g., a MOSFET) 432. In this way, both remote power supply and I2C access can be achieved via a cable and connector with power and access for dedicated devices on a (remote) FRU being achieved without mixing power domains. The switches (e.g., the MOSFETs) 430 and 432 can be used to control whether the local or the remote power domain is active. If powered from remote, then the I2C link can also be electrically isolated from the rest of the local FRU. The internal connector logic 434 can be used to control if power is supplied to the FRU from the cable.
In another example embodiment, remote I2C type side-band communication is not supported, but instead all information is exchanged either in-band, or via an (out-of-band) management network infrastructure. Also in this example it is desirable to be able to detect if the remote cable connector is plugged into a chassis connector, and to detect the power state of the remote chassis. This can provide state information where both end-points have not yet been identified, and allow both sides to detect transient or permanent state changes. This functionality does not need to depend on remote I2C type access to sensor devices, but instead it can be implemented by a “loopback signal” where the electrical behavior (as observed from the local side) is a function of whether the remote cable connector is plugged in, the local power mode, and also the power mode of the remote chassis. By having this information monitored, for example by a hardware-level sensor mechanism that records the state of the signal continuously, transient state changes can be recorded without depending on any real-time poll frequency, which a pure remote I2C sensor based scheme would require.
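The idea of recording the last connectivity in writeable connector storage can be illustrated with a small record type. The following Python sketch is hypothetical throughout: the field names, the JSON encoding and the functions `record_attachment`, `to_bytes` and `from_bytes` are assumptions made for illustration, and the description does not specify any storage layout.

```python
import json
from dataclasses import asdict, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ConnectorFruInfo:
    connector_id: str                       # static identity of this cable end
    last_chassis_id: Optional[str] = None   # identity of the last operational chassis
    last_seen_utc: Optional[str] = None     # time of that connectivity (ISO 8601 text)


def record_attachment(info: ConnectorFruInfo, chassis_id: str,
                      timestamp_utc: str) -> ConnectorFruInfo:
    """Return a copy updated with the chassis currently attached."""
    return replace(info, last_chassis_id=chassis_id, last_seen_utc=timestamp_utc)


def to_bytes(info: ConnectorFruInfo) -> bytes:
    """Serialize the record as it might be written to connector storage."""
    return json.dumps(asdict(info), sort_keys=True).encode("utf-8")


def from_bytes(raw: bytes) -> ConnectorFruInfo:
    """Rebuild the record, for example when read from the remote side."""
    return ConnectorFruInfo(**json.loads(raw.decode("utf-8")))


if __name__ == "__main__":
    blank = ConnectorFruInfo(connector_id="CABLE-END-A")
    stored = to_bytes(record_attachment(blank, "CHASSIS-7", "2007-06-22T12:00:00Z"))
    # Later, possibly with the local chassis powered down, the remote side reads:
    print(from_bytes(stored))
```

Because the record lives in the connector rather than in the chassis, the remote side could in principle still read the last attachment after the local chassis has lost power, provided the connector storage itself is powered over the cable.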
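The selection between the local and the remote power domain can be expressed as a small truth table. The following Python sketch is a simplified reading of the arrangement described above, under stated assumptions: that the local power source takes priority when both sources are available, that switch 430 is open when the FRUID is powered from the cable so that the I2C link is isolated from the rest of the FRU, and that the connector logic can veto cable power. None of these details is confirmed by the description, and the names used are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class PowerDomain(Enum):
    NONE = "unpowered"
    LOCAL = "local power source"
    REMOTE = "cable power line"


@dataclass(frozen=True)
class SwitchState:
    domain: PowerDomain
    switch_430_closed: bool  # service processor side joined to the management line
    switch_432_closed: bool  # cable power line feeding the FRUID storage


def select_power_domain(local_power_on: bool,
                        cable_power_present: bool,
                        connector_allows_cable_power: bool = True) -> SwitchState:
    """Choose the FRUID power domain; the two domains are never mixed."""
    if local_power_on:
        # Assumption: local power takes priority when it is available.
        return SwitchState(PowerDomain.LOCAL, True, False)
    if cable_power_present and connector_allows_cable_power:
        # Remote-powered: the link is isolated from the rest of the local FRU.
        return SwitchState(PowerDomain.REMOTE, False, True)
    return SwitchState(PowerDomain.NONE, False, False)


if __name__ == "__main__":
    for local in (False, True):
        for cable in (False, True):
            print(local, cable, select_power_domain(local, cable))
```

In every row of this table at most one of the two switches is closed, which is one way of expressing the requirement that the power domains are not mixed.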
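The benefit of continuous monitoring over polling can be shown with a small model that latches every change of the loopback signal. The following Python sketch is illustrative only: the three `LoopbackState` values are a hypothetical encoding of the electrical behavior, and the `LoopbackMonitor` class stands in for a hardware-level sensor mechanism.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class LoopbackState(Enum):
    UNPLUGGED = "remote connector not plugged in"
    REMOTE_OFF = "remote connector plugged in, remote chassis unpowered"
    REMOTE_POWERED = "remote connector plugged in, remote chassis powered"


Transition = Tuple[LoopbackState, LoopbackState]


@dataclass
class LoopbackMonitor:
    """Latch every state change so that transient changes are not lost."""
    state: LoopbackState = LoopbackState.UNPLUGGED
    latched: List[Transition] = field(default_factory=list)

    def observe(self, new_state: LoopbackState) -> None:
        """Called on every observation of the signal (continuously in hardware)."""
        if new_state is not self.state:
            self.latched.append((self.state, new_state))
            self.state = new_state

    def drain(self) -> List[Transition]:
        """Return and clear the transitions recorded since the last call."""
        events, self.latched = self.latched, []
        return events


if __name__ == "__main__":
    monitor = LoopbackMonitor(state=LoopbackState.REMOTE_POWERED)
    # A brief unplug and replug between two software polls:
    monitor.observe(LoopbackState.UNPLUGGED)
    monitor.observe(LoopbackState.REMOTE_POWERED)
    print("current:", monitor.state.name)
    print("transitions:", [(a.name, b.name) for a, b in monitor.drain()])
```

A poll that looked only at the current state would see no change in this run, whereas the latched transitions show that the cable was briefly disconnected.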
An example embodiment of the invention can provide the ability to observe all major aspects of the connectivity associated with an interconnect cable from the end-points to which either end of the cable is attached without any dependency on whether the remote end is connected or not, and irrespective of the state of the remote end-point.
An example embodiment can provide for the identification of local and remote cable connectivity, whereby the complexity associated with side-band based information can be traded against the use of out-of-band mechanisms in ways that still allow all relevant information to be handled.
An example embodiment can combine cable connector based active logic and (low-power) power supply via the cable with chassis side logic in order to allow observation of remote end-point ID and state without depending on any minimal operational/power state for the remote end-point.
Dynamic single and dual master cable side-band can be combined with local or remote low-power support in order to provide remote cable end-point ID and state independently of the operational state of the remote system.
Both remote power supply and I2C access can be achieved via a cable and connector. Power and access for dedicated devices on a (remote) FRU can be achieved without mixing power domains, but with reduced ability of access from the FRU (local) side.
MOSFET switches can be used to control whether a local or a remote power domain is active. If powered from remote, then an I2C link can also be electrically isolated from the rest of the (local) FRU. The connector may have internal logic also to control if power is supplied to the FRU.
With reference to
Accordingly, in an embodiment of the invention, in order to monitor connectivity status associated with an interconnect cable from the end-points to which either end of the cable is attached, a storage device storing a field replaceable unit identifier can be provided to uniquely identify a cable end-point, which identifier can then be accessed to determine the connectivity status.
Although the embodiments above have been described in considerable detail, numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated.
This application hereby claims priority under 35 U.S.C. §119 to U.S. Provisional Patent Application No. 60/945,778, filed on 22 Jun. 2007, entitled “CABLE INTERCONNECT SYSTEMS”, by inventor(s) Bjorn Johnsen et al. The present application hereby incorporates by reference the above-referenced provisional patent application.
Number | Date | Country
---|---|---
60945778 | Jun 2007 | US