This description relates to event delivery in switched fabric networks.
Advanced Switching Interconnect (ASI) is a technology based on the Peripheral Component Interconnect Express (PCI Express) architecture that enables standardization of various backplanes. The Advanced Switching Interconnect Special Interest Group (ASI-SIG) is a collaborative trade organization chartered with providing a switching fabric interconnect standard. The ASI-SIG provides the specifications of that standard, including the Advanced Switching Core Architecture Specification, Revision 1.1, November 2004 (available from the ASI-SIG at www.asi-sig.com), to its members.
ASI utilizes a packet-based transaction layer protocol that operates over the PCI Express physical and data link layers. The ASI architecture provides a number of features common to multi-host, peer-to-peer communication devices such as blade servers, clusters, storage arrays, telecom routers, and switches. These features include support for hot adding and removal of boards, redundant pathways, and fabric management fail-over.
The ASI architecture defines an event notification protocol that enables an ASI device (e.g., an ASI endpoint, switch, or bridge) to notify an agent of a condition that has been detected by the device. Such conditions include conditions associated with requests at the packet origin, packets flowing through a switch, packet delivery at the destination, or a change in device hardware state (i.e., an error condition). The number of conditions varies from device to device.
Generally, when an ASI device detects a condition that warrants sending an event, the event is sent to an event handler identified in the ASI Event Capability Structure of the device or, if the event is related to a problem with a specific forward routed packet and an event table is so configured, to the packet origin. In the former case, each event (or class of events) is associated with a path (“event path”) specified in a register of the device's ASI Event Capability Structure. The register defines path information that is used by the device to build an event packet to be sent to the event handler. There may be instances in which two ASI devices are configured with event paths that both route events over the link connecting the two ASI devices. Problems arise when that device-connecting link fails or is removed for any reason, because the events generated by the two ASI devices are routed through the removed/failed link and cannot be delivered. Consequently, the event handler remains unaware of the detected condition and does not take any corrective action. This may result in an instability in the fabric, which is detrimental to its operation.
Referring to
Each ASI device 102, 104 has an ASI interface that is part of the ASI architecture defined by the Advanced Switching Core Architecture Specification (“ASI Specification”). The ASI architecture utilizes a packet-based transaction layer protocol 202 that operates over the PCI Express physical and data link layers 204, 206, as shown in
ASI uses a path-defined routing methodology in which the source of a packet provides all information required by a switch (or switches) to route the packet to the desired destination.
The PI field 308 in the ASI route header 302 determines the format of the encapsulated packet 304. The PI field 308 is inserted by the ASI end point 104 that originates the ASI packet and is used by the ASI end point 104 that terminates the packet to correctly interpret the packet contents. The separation of routing information from the remainder of the packet enables an ASI fabric to tunnel packets of any protocol.
PIs represent fabric management and application-level interfaces to the switched fabric network 100. Table 1 provides a list of PIs currently supported by the ASI Specification.
PIs 0-7 are used for various fabric management tasks, and PIs 8-126 are application-level interfaces.
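The PI ranges stated above can be illustrated with a minimal sketch. The function name `classify_pi` and the returned labels are hypothetical, introduced only for illustration; values outside the ranges given in the text are left unclassified rather than guessed.

```python
def classify_pi(pi: int) -> str:
    """Classify a protocol interface (PI) number using the ranges stated in the text."""
    if 0 <= pi <= 7:
        # PIs 0-7: fabric management tasks.
        return "fabric-management"
    if 8 <= pi <= 126:
        # PIs 8-126: application-level interfaces.
        return "application-level"
    # The text does not describe other values, so no meaning is assigned here.
    return "unspecified"
```

For example, a node configuration packet using PI-4 falls in the fabric management range.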
The ASI architecture supports the implementation of an ASI Configuration Space in each ASI device 102, 104 of the network. The ASI Configuration Space is a storage area that includes fields to specify device characteristics as well as fields used to control the ASI device. The ASI Configuration Space includes up to 16 apertures where configuration information can be stored. Each aperture includes up to 4 Gbytes of storage and is 32-bit addressable. The configuration information is presented in the form of capability structures and other storage structures, such as tables and a set of registers. One of the capability structures defined by the ASI Specification and stored in aperture 0 of the ASI Configuration Space is the ASI Event Capability Structure. The ASI Event Capability Structure can be accessed through node configuration packets, e.g., PI-4 packets, as described in more detail below.
Referring to
Once a fabric manager declares ownership, it has privileged access to the ASI Configuration Space of each of its ASI devices 102, 104. The fabric manager utilizes its ability to read and write to the ASI Configuration Space of each of its ASI devices 102, 104 to perform (502) a fabric discovery process, in which the fabric manager records which ASI devices 102, 104 are connected, collects information about each ASI device 102, 104 in the network, and constructs a topology of the fabric. The fabric manager then uses a spanning tree algorithm to determine a spanning tree of the fabric.
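The description does not name a particular spanning tree algorithm. The following is a minimal sketch that assumes a breadth-first traversal of the discovered topology; the adjacency-map representation, the function name `spanning_tree`, and the device labels are hypothetical and are not taken from the ASI Specification.

```python
from collections import deque


def spanning_tree(topology, root):
    """Return a {device: parent} map describing a breadth-first spanning tree.

    `topology` maps each device label to the labels of its directly
    connected neighbors, as recorded during fabric discovery.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Sorting makes the resulting tree deterministic for a given topology.
        for neighbor in sorted(topology[node]):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return parent


# Hypothetical five-device fabric: a fabric manager, two switches, two end points.
example_topology = {
    "fm": {"sw_a"},
    "sw_a": {"fm", "sw_b", "ep_1"},
    "sw_b": {"sw_a", "ep_2"},
    "ep_1": {"sw_a"},
    "ep_2": {"sw_b"},
}
example_tree = spanning_tree(example_topology, "fm")
```

Each entry of the returned map records the neighbor through which a device was first reached, so following parent entries from any device leads back to the root.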
For each ASI device 102, 104 in the network 400, the fabric manager uses the spanning tree to determine (506) a shortest path between the ASI device 102, 104 and an ASI end point that has been designated as an event handler for the fabric. Generally, any ASI end point 104 that has event handler software 404b in its memory 460 can be designated as the event handler for the fabric. The fabric manager then builds (508) a PI-4 write packet having a packet header that specifies an aperture number and address corresponding to a register of the ASI device's ASI Event Capability Structure, and a payload that specifies path information defined by the shortest path between the ASI device 102, 104 and the event handler. The PI-4 write packet is then sent (510) by the fabric manager to the ASI device 102, 104.
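Steps 506 and 508 can be sketched as follows. This is a simplified illustration under stated assumptions: the shortest path is found by a breadth-first search over the topology directly (fewest hops) rather than from a previously computed spanning tree, the path is encoded as a tuple of device labels rather than in the path format of the ASI Specification, and `Pi4Write`, `EVENT_PATH_REGISTER`, and the register address `0x100` are hypothetical. Aperture 0 is used because the text places the ASI Event Capability Structure there.

```python
from collections import deque
from dataclasses import dataclass

EVENT_PATH_REGISTER = 0x100  # hypothetical register address, for illustration only


@dataclass(frozen=True)
class Pi4Write:
    """Simplified stand-in for a PI-4 write packet (not the real wire format)."""
    aperture: int   # header: aperture number
    address: int    # header: register address within the aperture
    payload: tuple  # path information for the event path


def shortest_path(topology, source, target):
    """Return the fewest-hop path from source to target as a tuple of labels."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbor in sorted(topology[node]):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    if target not in previous:
        raise ValueError(f"no path from {source} to {target}")
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = previous[node]
    return tuple(reversed(path))


def build_event_path_write(topology, device, handler):
    """Build the write that would configure `device` with its event path."""
    return Pi4Write(
        aperture=0,
        address=EVENT_PATH_REGISTER,
        payload=shortest_path(topology, device, handler),
    )
```

With a hypothetical topology in which two switches are linked to each other and each is also linked to the event handler, `build_event_path_write(topology, "sw_b", "handler")` yields a payload of `("sw_b", "handler")`, a path that does not cross the link between the switches.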
Upon receipt (512) of the PI-4 write packet, the ASI device 102, 104 processes (514) a write command to write data extracted from the payload of the PI-4 write packet to the register specified in the PI-4 packet header. In so doing, the event path specified (516) in the register of the ASI Event Capability Structure is defined by the shortest path information.
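The device-side handling in steps 512 through 516 can be sketched in a similar way. The class `AsiDevice`, its methods, and the dictionary-backed storage are hypothetical simplifications; the only constraints taken from the text are the limit of 16 apertures and the 32-bit address range, and the sketch assumes that "32-bit addressable" means addresses fit in 32 bits.

```python
class AsiDevice:
    """Toy model of a device's configuration space (illustrative only)."""

    MAX_APERTURES = 16      # the text allows up to 16 apertures
    ADDRESS_LIMIT = 2 ** 32  # assumed: addresses fit in 32 bits

    def __init__(self):
        # One sparse register map per aperture.
        self.config_space = {n: {} for n in range(self.MAX_APERTURES)}

    def handle_pi4_write(self, aperture, address, payload):
        """Write the packet payload to the register named in the packet header."""
        if aperture not in self.config_space:
            raise ValueError(f"invalid aperture: {aperture}")
        if not 0 <= address < self.ADDRESS_LIMIT:
            raise ValueError(f"address out of range: {address}")
        self.config_space[aperture][address] = payload

    def event_path(self, aperture, address):
        """Return the event path stored in the given register, or None if unset."""
        return self.config_space[aperture].get(address)
```

After a call such as `device.handle_pi4_write(0, 0x100, ("sw_b", "handler"))`, the stored value is what the device would consult when building an event packet; the address `0x100` is again a placeholder.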
Two event paths 410a, 410b are depicted in the illustrated example of
In a scenario in which the device-connecting link 406c fails or is removed for any reason, each of the ASI switch elements 402a, 402b will, independently of the other, detect the link failure/removal condition, generate a corresponding link event, and attempt to send the link event to the event handler for processing. Because the two ASI devices 402a, 402b are configured such that the event paths 410a, 410b do not both include the device-connecting link 406c, a link event generated by at least one ASI device (in this case, the ASI switch element 402b) is guaranteed to be delivered successfully to the event handler. Corrective action can then be taken by the event handler, thus preserving the stability of the fabric.
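The configuration constraint described here reduces to a simple check, sketched below. Representing each event path as a collection of link labels is an assumption made for illustration, and the function name and the labels other than `406c` are hypothetical.

```python
def paths_survive_link_loss(path_a, path_b, connecting_link):
    """Return True if at least one of the two event paths avoids the connecting link.

    Each path is a collection of link labels traversed on the way to the
    event handler.
    """
    return not (connecting_link in path_a and connecting_link in path_b)


# One path crosses the connecting link, the other reaches the handler directly.
acceptable = paths_survive_link_loss(["406c", "link_y"], ["link_y"], "406c")

# Both paths cross the connecting link, so neither event could be delivered.
problematic = paths_survive_link_loss(["406c", "link_y"], ["406c", "link_x"], "406c")
```

A fabric manager could run such a check on each pair of directly connected devices before writing the event paths, and recompute one of the paths when the check fails.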
The techniques of one embodiment of the invention can be performed by one or more programmable processors executing a computer program to perform functions of the embodiment by operating on input data and generating output. An apparatus of one embodiment of the invention can be implemented as special purpose logic circuitry, e.g., one or more FPGAs (field programmable gate arrays) and/or one or more ASICs (application specific integrated circuits).
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a memory (e.g., memory 450, 460 of
The invention has been described in terms of particular embodiments. Other embodiments are within the scope of the following claims. For example, the steps of an implementation of the invention can be performed in a different order and still achieve desirable results.