In networking environments such as those used in telecommunication and/or data centers, a switch fabric is utilized to rapidly move data. Typically a switch fabric provides a communication medium that includes one or more point-to-point communication links interconnecting one or more nodes (e.g., endpoints, switches, modules, blades, boards, etc.). The switch fabric may operate in compliance with industry standards and/or proprietary specifications. One example of an industry standard is the Advanced Switching Interconnect Core Architecture Specification, Rev. 1.1, published November 2004, or later version of the specification (“the ASI standard”).
Typically a switch fabric includes a switch fabric management architecture to maintain a highly available communication medium and to facilitate the movement of data through the switch fabric. One part of the fabric management architecture is to manage/control the configuration of each node coupled to the edge of the switch fabric (e.g., an endpoint) or a node coupled within the switch fabric (e.g., a switch). As part of a typical fabric management architecture, an active and a standby fabric manager manage/control at least a portion of each node's switch fabric configuration as well as the communication links that may interconnect the nodes coupled to the switch fabric.
In one example, one or more fabric managers are selected/elected for a switch fabric. Once elected, a fabric manager gains ownership of a spanning tree (ST) path. The ST path may include a particular route or path through which an owning fabric manager forwards instructions to other nodes coupled to the switch fabric. Ownership may grant the fabric managers privileged access to the configuration registers for these nodes to configure the nodes to operate on the switch fabric. Thus, a node receiving a configuration request ignores the request if the request was not routed via the ST path associated with an owning fabric manager.
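The acceptance rule in the last sentence above can be sketched as follows. This is a minimal illustration, not the mechanism defined by the ASI standard: the `Node` class, the `owner_paths` set, the register name, and the string path identifiers are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node that records the ST paths of owning fabric managers."""
    owner_paths: set = field(default_factory=set)
    config: dict = field(default_factory=dict)

    def handle_config_request(self, arrival_path: str, register: str, value: int) -> bool:
        # Ignore the request unless it was routed via an owning fabric manager's ST path.
        if arrival_path not in self.owner_paths:
            return False
        self.config[register] = value
        return True


node = Node(owner_paths={"st_path_primary"})
accepted = node.handle_config_request("st_path_primary", "reg0", 1)
ignored = node.handle_config_request("some_other_path", "reg0", 2)
```

In this sketch the second request leaves the node's configuration unchanged because it did not arrive on a path associated with an owning fabric manager.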
a-e are example illustrations of elements of a switch fabric to include paths to send heartbeat messages between active and standby fabric managers;
As mentioned in the background, a typical switch fabric may include an active and a standby fabric manager. In general, a fabric manager is logically associated with or responsive to an endpoint for the switch fabric. The endpoint may include resources (e.g., processing power, memory, etc.) to support the fabric manager. In one example, a fabric manager may be initiated by instructions included in a memory accessible by a processor or control logic on the endpoint. The instructions may also enable the endpoint's control logic to determine whether it will support an active and/or a standby fabric manager for the switch fabric.
In one implementation, an active and a standby fabric manager may monitor the health of each other. For example, each fabric manager may send a message (e.g., heartbeat message) that provides a status (health) of the respective fabric manager. In one example, heartbeat messages are packet-based and indicate the operating or functional status of a fabric manager (e.g., fully or adequately operational). These heartbeat messages may be sent via paths through the switch fabric. When a fabric manager fails to receive a heartbeat message from the other fabric manager, the fabric manager may assume the other fabric manager has failed, e.g., no longer fully operational or coupled to the switch fabric. The fabric manager may then take corrective actions, e.g., failover to become the active fabric manager, select a new standby fabric manager, reset the switch fabric, etc.
In one example, failure of a fabric manager to detect a heartbeat message from another fabric manager may occur even if the other fabric manager has not failed. In this example, failure to detect a heartbeat message from the other fabric manager may be caused by a failed communication link or node. The failed communication link or node may fall along the path in the switch fabric that is used by the other fabric manager to send its heartbeat message. Since a fabric manager may take corrective actions that assume the other fabric manager has failed, this may cause the switch fabric to become unstable as both fabric managers may vie to be the active fabric manager and/or each may select additional fabric managers to replace the supposedly failed other fabric manager. This unstable fabric is problematic in networking systems where high availability and reliability are important and tolerance for an unstable fabric is low.
In one implementation, an endpoint node for a switch fabric includes a fabric manager. This fabric manager may be an active fabric manager for the switch fabric. The endpoint node may also include failover logic responsive to the fabric manager to detect a heartbeat message from a standby fabric manager for the switch fabric. The heartbeat message is to be sent from the standby fabric manager via a path in the switch fabric.
The failover logic may set a timer for a duration and reset the timer based on detection of the heartbeat message from the standby fabric manager. If the heartbeat message is not detected after the timer has expired, then the failover logic may obtain a topology of the switch fabric. Based at least in part on the topology, the failover logic may determine whether the standby fabric manager has failed. If the standby fabric manager has failed, the failover logic may failover to another standby fabric manager. If the standby fabric manager has not failed, no failover occurs and the failover logic sends a message from the active fabric manager to the standby fabric manager. The message may indicate another path in the switch fabric for the standby fabric manager to send another heartbeat message to the active fabric manager.
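The decision sequence described above can be sketched as a single function. This is a simplified illustration under stated assumptions: the `Action` enum, the `on_timer_check` function, and the representation of a topology as a set of node identifiers returned by a `get_topology` callable are hypothetical and are not taken from the disclosure or the ASI standard.

```python
from enum import Enum


class Action(Enum):
    KEEP_WAITING = "keep_waiting"
    REROUTE_HEARTBEAT = "reroute_heartbeat"
    FAILOVER_STANDBY = "failover_standby"


def on_timer_check(heartbeat_seen, timer_expired, get_topology, standby_id):
    """Decide what the active side's failover logic does at one check."""
    if heartbeat_seen or not timer_expired:
        # Heartbeat detected (the timer would be reset) or the timer is still running.
        return Action.KEEP_WAITING
    # Timer expired without a heartbeat: only now is a topology obtained.
    topology = get_topology()
    if standby_id in topology:
        # Standby is still part of the fabric, so indicate another path to it.
        return Action.REROUTE_HEARTBEAT
    # Standby is gone, so fail over to another standby fabric manager.
    return Action.FAILOVER_STANDBY
```

Obtaining the topology lazily, through a callable, mirrors the text: discovery is only performed after the timer has expired without a heartbeat.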
a is an example illustration of elements of switch fabric 100. As shown in
In one example, switch fabric 100 is operated in compliance with the ASI standard, although this disclosure is not limited to only switch fabrics that operate in compliance with the ASI standard. As depicted in
In one implementation, endpoints 110, 111, 113 and 116 may have indicated an ability to support or willingness to allocate the resources to support a fabric manager. This indication may occur, for example, during initialization of switch fabric 100. Based on each node's indicated ability to support a fabric manager, ASI compliant switch fabric 100 may follow a process described in the ASI standard to elect/select a primary or active fabric manager and a secondary or standby fabric manager. In one example, as depicted in
In one example, as depicted in
In one implementation, the active fabric manager 101 in endpoint 110 and the standby fabric manager 101 in endpoint 116 may communicate their status or health to each other by sending packet-based heartbeat messages to each other. These heartbeat messages may be routed via one or more paths within an ASI compliant switch fabric 100. In one example, these paths may be based on the topology of switch fabric 100. This topology, in one example, is determined/obtained by the primary or active fabric manager following election of that active fabric manager. To obtain the topology, the active fabric manager may complete an enumeration/discovery process described in the ASI standard.
In one example, active fabric manager 101 in endpoint 110 may have obtained a topology of switch fabric 100 that is depicted in
As described in more detail below, each endpoint's fabric manager 101 may detect heartbeat messages sent from one fabric manager 101 to another fabric manager 101 via one or more paths in switch fabric 100 (e.g., paths 140 and 141). In one example, based on a lack of detection of a heartbeat message, a fabric manager may take corrective actions that include using another path to receive or send heartbeat messages to another fabric manager, or failing over to another endpoint that has indicated the resources to support a fabric manager.
Failure to detect a heartbeat message sent along a path in a switch fabric may be the result of a broken path. Causes of a broken path may include, but are not limited to, an element (e.g., switch, endpoint, communication link) failing, malfunctioning or being removed from the fabric. Intermittent failures that may not be detected when an updated topology is obtained by a fabric manager may also lead to a failure to detect a heartbeat message. In one example, based on a failover policy, if an obtained topology reflects no failed elements in a given path, a subsequent failure to detect a heartbeat sent via the given path may indicate an intermittent failure. This intermittent failure may cause the fabric manager to select a different path to send heartbeat messages.
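One way such a failover policy might distinguish a broken path from an intermittent failure is sketched below. The function name, the string labels, and the idea of counting misses observed after a clean topology are assumptions made for illustration; the disclosure does not prescribe this form.

```python
def classify_missed_heartbeat(path, failed_elements, misses_since_clean_topology):
    """Classify why a heartbeat on `path` was missed (hypothetical policy).

    path: identifiers of the elements (nodes and links) along the path.
    failed_elements: elements the obtained topology reports as failed or missing.
    misses_since_clean_topology: misses seen after a topology showed the path intact.
    """
    if any(element in failed_elements for element in path):
        return "broken_path"
    if misses_since_clean_topology >= 1:
        # The topology showed no failed elements, yet heartbeats are still missed.
        return "intermittent_failure"
    return "unknown"
```

Under this sketch, either a `broken_path` or an `intermittent_failure` result would prompt selection of a different path for heartbeat messages.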
In one example, switch 103 of switch fabric 100 may fail or be removed. As a result, path 141 used by the standby fabric manager 101 in endpoint 116 is broken. So active fabric manager 101 in endpoint 110 is unable to detect heartbeat messages from standby fabric manager 101 in endpoint 116 via path 141. Based on not detecting the heartbeat, according to one example, active fabric manager 101 obtains a topology of switch fabric 100 to determine the operating status of the nodes or communication links in switch fabric 100. That obtained topology, in one example, is illustrated in
As depicted in
In one implementation, active fabric manager 101 in endpoint 110 may indicate to standby fabric manager 101 in endpoint 116 to send heartbeat messages via path 142 instead of the broken path 141. The standby fabric manager 101 may then stop using the broken path 141 and start to use path 142.
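Selecting a replacement path from an obtained topology can be sketched as a graph search that skips failed nodes. The example below is an assumption-laden illustration: the disclosure does not state which search is used, breadth-first search is swapped in here for simplicity, and the node names form a hypothetical topology that is not the topology of switch fabric 100.

```python
from collections import deque


def find_path(links, source, target, failed=frozenset()):
    """Breadth-first search for a shortest path that avoids failed nodes.

    links: dict mapping each node to the nodes it has a communication link to.
    Returns a list of nodes from source to target, or None if no path exists.
    """
    if source in failed or target in failed:
        return None
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in links.get(node, ()):
            if neighbor not in previous and neighbor not in failed:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical topology with two disjoint routes between the endpoints.
links = {
    "standby_ep": ["sw_a", "sw_b"],
    "sw_a": ["standby_ep", "sw_c"],
    "sw_b": ["standby_ep", "sw_d"],
    "sw_c": ["sw_a", "active_ep"],
    "sw_d": ["sw_b", "active_ep"],
    "active_ep": ["sw_c", "sw_d"],
}
original = find_path(links, "standby_ep", "active_ep")
rerouted = find_path(links, "standby_ep", "active_ep", failed={"sw_a"})
```

When `sw_a` is marked as failed, the search returns the route through `sw_b` and `sw_d`, which plays the role of the replacement path that the active fabric manager indicates to the standby fabric manager.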
In another example, active fabric manager 101 in endpoint 110 may fail to detect a heartbeat message and, after obtaining a topology of switch fabric 100, find that endpoint 116 has failed or was removed. That obtained topology, in one example, is portrayed in
As depicted in
Referring back to
In one example, as depicted in
In one implementation, the duration has elapsed without receiving any messages and/or another heartbeat message from active fabric manager 101 in endpoint 110. So standby fabric manager 101 in endpoint 116 may obtain a topology of switch fabric 100. That topology, in one example, is depicted in
In one example, the new active fabric manager 101 in endpoint 116 may then select an endpoint to include the new standby fabric manager for switch fabric 100. As shown in
As briefly mentioned above, a fabric manager may be initiated by instructions included in a memory (not shown) accessible to an endpoint's control logic. The elements portrayed in
In
In one example, failover logic 210 may represent a portion of the resources allocated by an endpoint to support fabric manager 101. Thus, failover logic 210 may include an endpoint's microprocessor, network processor, microcontroller, field programmable gate array (FPGA), application specific integrated circuit (ASIC), or executable content to implement detect feature 212, timer feature 214, topology feature 216 and select feature 218.
According to one example, memory 230 may be a portion of an endpoint's memory (not shown). Memory 230 may be used by failover logic 210 to temporarily store information, for example, information related to the selection of paths to route heartbeat messages or to the selection of fabric managers on a switch fabric. Memory 230 may also include encoding/decoding information to facilitate or enable the detection of packet-based heartbeat messages and the communication of a path change or a failover based on an obtained topology following a failure to detect one or more heartbeat messages.
I/O interfaces 240 may provide a communications interface via a communication medium or link between fabric manager 101 and a node or an electronic system. As a result, I/O interfaces 240 may enable control logic 220 or failover logic 210 to receive a series of instructions from application software external to the elements allocated to support fabric manager 101. The series of instructions may activate control logic 220 or failover logic 210 to implement one or more features of fabric manager 101.
In one example, fabric manager 101 includes one or more applications 250 to provide internal instructions to control logic 220 or other resources allocated to support fabric manager 101 (e.g., failover logic 210). Such applications 250 may be activated to generate a user interface, e.g., a graphical user interface (GUI), to enable administrative features, and the like. For example, a GUI may provide a user access to memory 230 to modify or update information to facilitate the detection of a heartbeat message and communicating a path change or a failover based on an obtained topology following a failure to detect the heartbeat message.
In another example, applications 250 may include one or more application interfaces to enable external applications to provide instructions to control logic 220 or failover logic 210. One such external application could be a GUI as described above.
In one implementation, ASI compliant switch fabric 100 has completed its initialization and both active and standby fabric managers have been elected as depicted by the topology in
In block 310, according to one example, failover logic 210 for active fabric manager 101 in endpoint 110 activates detect feature 212. Detect feature 212 may monitor path 141 for heartbeats from standby fabric manager 101 in endpoint 116. Failover logic 210 also activates timer feature 214 to set a timer for a duration. If the timer expires before detect feature 212 detects a heartbeat message from the standby fabric manager 101 in endpoint 116, the process moves to block 320. But if a heartbeat message is detected by detect feature 212, the process moves to block 315.
In one example, the timer duration may be based on one or more factors that may include, but are not limited to, the availability and reliability requirements of switch fabric 100. As a result, a requirement for very high availability and reliability may result in a low tolerance for periods of instability possibly encountered as a fabric manager takes corrective actions following failure to detect a heartbeat. So a short timer duration may be needed to minimize periods of instability. Additionally, the dependability or capability of elements of a switch fabric that may fail (e.g., endpoints, switches, communication links) may also influence the timer duration. For example, elements that tend to fail more often need a shorter timer duration than elements that rarely fail. Elements that are relatively slow to failover may also need a shorter timer duration as compared to elements that are relatively fast to failover.
In one example, the timer duration may be a configurable duration that may be configured at the time switch fabric 100 is initialized. The timer duration may also be modified by a user (e.g., via I/O interfaces 240 or via applications 250's application interfaces) or dynamically configured based on past operating characteristics of switch fabric 100. For a dynamically configured timer duration, for example, if elements of switch fabric 100 show an increasing trend of failing, the timer duration may be shortened to account for this trend.
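A dynamically configured timer duration might be computed along the following lines. This is only a sketch: the function name, the per-window failure counts, the `shrink` factor of 0.8, and the `min_s` floor are hypothetical parameters chosen for illustration and are not specified in this disclosure.

```python
def adjust_timer_duration(current_s, failures_per_window, min_s=0.1, shrink=0.8):
    """Shorten the timer duration when per-window failure counts keep rising.

    failures_per_window: failure counts for consecutive observation windows,
    oldest first.
    """
    if len(failures_per_window) < 2:
        return current_s
    pairs = zip(failures_per_window, failures_per_window[1:])
    rising = all(later > earlier for earlier, later in pairs)
    if rising:
        # Increasing trend of failures: shorten the duration, but not below the floor.
        return max(min_s, current_s * shrink)
    return current_s
```

The floor keeps repeated shortening from driving the duration toward zero, which would cause spurious timer expirations.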
In block 315, detect feature 212 has detected the heartbeat message from standby fabric manager 101 in endpoint 116. Based on the detection, timer feature 214 then resets the timer for the duration and the process returns to block 310.
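The set-and-reset behavior of blocks 310 and 315 can be sketched with a small timer class. The class name `HeartbeatTimer` and its injectable `clock` parameter are assumptions made for illustration; the clock is injectable only so the behavior can be exercised without waiting in real time.

```python
import time


class HeartbeatTimer:
    """Timer that is reset each time a heartbeat message is detected."""

    def __init__(self, duration_s, clock=time.monotonic):
        self.duration_s = duration_s
        self._clock = clock
        self._deadline = clock() + duration_s

    def reset(self):
        # Called on detection of a heartbeat (block 315 in the description above).
        self._deadline = self._clock() + self.duration_s

    def expired(self):
        # True once the duration has elapsed with no reset.
        return self._clock() >= self._deadline


# Usage with a fake clock so the example runs instantly.
now = [0.0]
timer = HeartbeatTimer(5.0, clock=lambda: now[0])
now[0] = 4.0
timer.reset()            # heartbeat detected at t = 4
now[0] = 8.0
still_running = not timer.expired()
now[0] = 9.0
has_expired = timer.expired()
```

A monotonic clock is used by default so that wall-clock adjustments do not cause false expirations.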
In block 320, detect feature 212 has not detected the heartbeat message from standby fabric manager 101 in endpoint 116. In one example, failover logic 210 activates topology feature 216 to obtain an updated topology of switch fabric 100. As mentioned above, the updated topology may be obtained through an enumeration/discovery process such as described, for example, in the ASI standard. Topology feature 216 may temporarily store information associated with the updated topology, e.g., in memory 230.
In block 325, in one example, failover logic 210 activates select feature 218. Select feature 218 may access the updated topology temporarily stored by topology feature 216 to determine the status of standby fabric manager 101 in endpoint 116. If the updated topology shows that standby fabric manager 101 in endpoint 116 is still a functioning part of switch fabric 100, the process moves to block 330. If not, the process moves to block 355.
In block 330, in one example, the updated topology shows that standby fabric manager 101 in endpoint 116 is still a part of switch fabric 100's topology. Thus, it is likely that an element of switch fabric 100 has malfunctioned, failed, or has been removed. In one example, the topology depicted in
In block 335, in one implementation, active fabric manager 101 in endpoint 110 sends a message to standby fabric manager 101 in endpoint 116. The message indicates path 142 to send heartbeat messages. Standby fabric manager 101 in endpoint 116 then uses path 142 to send subsequent heartbeat messages.
In block 340, in one example, select feature 218 may also determine whether path 140 is broken. Path 140, as portrayed in
In block 345, select feature 218 may select another path in switch fabric 100 for active fabric manager 101 in endpoint 110 to send heartbeat messages to standby fabric manager 101 in endpoint 116. For example, select feature 218 may determine, based on the updated topology, that communication link 130k has failed or is malfunctioning. Thus, select feature 218 may select a path through switch fabric 100 that does not include communication link 130k.
In block 350, active fabric manager 101 in endpoint 110 uses the other path to send heartbeat messages to standby fabric manager 101 in endpoint 116. In one example, this other path is portrayed in
In block 355, in one example, select feature 218 has determined that standby fabric manager 101 in endpoint 116 is no longer part of switch fabric 100's topology. In one implementation, select feature 218 may determine whether there exists at least one other endpoint in the topology that indicates the ability to support a fabric manager. As depicted in
In block 360, in one example, select feature 218 selects paths to send heartbeat messages between the active fabric manager 101 in endpoint 110 and the failed over standby fabric manager 101 in endpoint 113. These paths may follow the paths as portrayed in
In block 410, in one example, failover logic 210 for standby fabric manager 101 in endpoint 116 activates detect feature 212. Detect feature 212 may monitor path 142 for heartbeat messages from active fabric manager 101 in endpoint 110. Failover logic 210 also activates timer feature 214 to set a timer for a duration. If the timer expires before detect feature 212 detects a heartbeat message from the active fabric manager 101 in endpoint 110, the process moves to block 420. But if a heartbeat message is detected before the timer expires, the process moves to block 415.
In block 415, in one example, based on the detection of the heartbeat message by detect feature 212, timer feature 214 resets the timer for duration “x”.
In block 420, in one example, based on detect feature 212 not detecting a heartbeat message, timer feature 214 may reset the timer for another duration portrayed as “y” in block 420. In one example, this other duration “y” may be determined based on the expected amount of time it may take active fabric manager 101 in endpoint 110 to send another heartbeat message via another path. This other duration “y” may be equal to or different than duration “x” described for block 415.
In one example, the other duration “y” in block 420 may also be based on the amount of time it may take a message to propagate through switch fabric 100. Duration “y” may also be based on the amount of time it may take the active fabric manager to obtain an updated topology and determine an alternative path to send a heartbeat message.
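As a worked illustration, duration “y” could be estimated by adding the contributing delays. The disclosure only says that “y” may be based on these factors; treating them as a simple sum, and the function name, parameter names, and optional safety margin, are assumptions made here.

```python
def duration_y(propagation_s, topology_discovery_s, path_selection_s, margin_s=0.0):
    """Estimate duration "y" as the sum of the delays the standby must allow for.

    propagation_s: time for a message to propagate through the switch fabric.
    topology_discovery_s: time for the active fabric manager to obtain an updated topology.
    path_selection_s: time for the active fabric manager to determine an alternative path.
    margin_s: hypothetical extra allowance added on top of the estimate.
    """
    return propagation_s + topology_discovery_s + path_selection_s + margin_s


# Example with made-up values, in seconds.
y = duration_y(propagation_s=0.5, topology_discovery_s=2.0, path_selection_s=0.25)
```

With these made-up values the estimate is 2.75 seconds; real values would depend on the size and characteristics of the particular switch fabric.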
In one implementation, standby fabric manager 101 in endpoint 116 may receive a message from active fabric manager 101 in endpoint 110 that indicates it is still a part of switch fabric 100 and to expect another heartbeat message via an alternate given path. For example, the topology depicted in
In block 430, in one example, standby fabric manager 101 in endpoint 116 based on the timer set in block 420 expiring without receiving the other heartbeat, begins failover activities to become the active fabric manager for switch fabric 100. Thus, in this example, failover logic 210 for standby fabric manager 101 in endpoint 116 activates topology feature 216 to obtain a topology of switch fabric 100. Topology feature 216 may temporarily store information associated with the obtained topology in a memory, e.g., memory 230.
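The standby-side progression from monitoring, through the second timer, to failover can be sketched as a small state machine. The state names `monitor`, `grace`, and `failover` and the function `standby_step` are hypothetical labels introduced for this illustration; they do not appear in the disclosure.

```python
def standby_step(state, heartbeat_seen, timer_expired):
    """Return the next state of a hypothetical standby-side state machine.

    States: "monitor" (timer for duration x running),
            "grace" (timer for duration y running),
            "failover" (standby begins to become the active fabric manager).
    """
    if state == "failover":
        return "failover"
    if heartbeat_seen:
        # Heartbeat detected: return to timing duration x.
        return "monitor"
    if not timer_expired:
        return state
    # Timer expired with no heartbeat: first allow duration y, then fail over.
    return "grace" if state == "monitor" else "failover"
```

Only after both timers expire without a heartbeat does the sketch reach `failover`, at which point the standby would obtain a topology as described for block 430.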
In block 435, in one example, failover logic 210 activates select feature 218. Select feature 218 may access the obtained topology to determine whether there exists at least one other endpoint in the topology that indicates the ability to support a fabric manager. As depicted in
In block 440, in one example, select feature 218 selects paths to send heartbeat messages between the failed over active fabric manager 101 in endpoint 116 and the newly selected standby fabric manager 101 in endpoint 111. These paths may follow the paths portrayed by paths 146 and 147 in
Referring again to switch fabric 100 in
In one implementation, switch fabric 100 may be part of a modular platform system operated in compliance with industry standards such as the PCI Industrial Computer Manufacturers Group (PICMG), Advanced Telecommunications Computing Architecture (AdvancedTCA) Base Specification, PICMG 3.0 Rev. 1.0, published Dec. 30, 2002, or later versions of the specification (“the AdvancedTCA standard”). This disclosure, however, is not limited to only AdvancedTCA compliant modular platform systems but may also include systems operated in compliance with other industry standards such as Peripheral Component Interconnect (PCI), Compact Peripheral Component Interface (cPCI), VersaModular Eurocard (VME), or other types of industry standards governing the design and operation of systems that may include a switch fabric.
In one example, elements of switch fabric 100 are designed to operate in compliance with and to forward data using one or more communication protocols described by sub-set specifications to the AdvancedTCA specification. These sub-set specifications are typically referred to as the “PICMG 3.x specifications.” The PICMG 3.x specifications include, but are not limited to, Ethernet/Fibre Channel (PICMG 3.1), InfiniBand (PICMG 3.2), StarFabric (PICMG 3.3), PCI-Express/Advanced Switching Interconnect (PICMG 3.4), Advanced Fabric Interconnect/S-RapidIO (PICMG 3.5) and Packet Routing Switch (PICMG 3.6).
Referring again to memory 230 in
References made in the specification to the term “responsive to” are not limited to responsiveness to only a particular feature and/or structure. A feature may also be “responsive to” another feature and/or structure and also be located within that feature and/or structure. Additionally, the term “responsive to” may also be synonymous with other terms such as “communicatively coupled to” or “operatively coupled to,” although the term is not limited in this regard.
In the previous descriptions, for the purpose of explanation, numerous specific details were set forth in order to provide an understanding of this disclosure. It will be apparent that the disclosure can be practiced without these specific details. In other instances, structures and devices were shown in block diagram form in order to avoid obscuring the disclosure.