The present disclosure relates to Fibre (Fiber) Channel fabrics.
The Fibre (Fiber) Channel (FC) standard addresses the general need in networking for fast transfers of large amounts of information. FC networks utilize an arrangement of switches, referred to as an FC fabric, to connect various computing devices (e.g., storage arrays, servers, etc.). This approach reduces the overhead associated with network traffic, since computing devices with FC ports only manage a point-to-point connection between those FC ports and the FC fabric.
Presented herein are priority route based techniques for mitigation of slow drain devices in a Fibre Channel (FC) fabric that comprises a plurality of FC switches. In accordance with examples presented herein, a first FC switch in an FC fabric receives an indication that a first computing device attached to the FC fabric has entered a slow drain condition. The first FC switch is configured to prepare and install a priority route for packet flows directed to the first computing device.
The FC fabric 50 includes one or more Inter-Switch Links (ISLs) extending between the FC switches 25(1) and 25(2). For ease of illustration, only two Inter-Switch Links, referred to herein as Inter-Switch Links 30(1) and 30(2), are shown in
In one specific example of
An FC network, such as network 10, is a no-drop network that operates on a credit-based flow control mechanism for communication between any pair of ports. A buffer-to-buffer (“B2B”) credit number for a peer port tracks the number of packet buffers available on that peer port for packets transmitted toward it. As such, an FC packet may be transmitted by a port only if its B2B credit for the peer port is greater than zero. In operation, a packet transmitted from a port decrements the port's associated B2B credit counter. An acknowledgement of completion of processing of a packet takes the form of a Receiver Ready (“R_RDY”) primitive signal from the peer port, which increments the B2B credit counter. That is, the R_RDY primitive signal is used to inform the transmitter that a new buffer has become available at the receiver. The R_RDY primitive contains only information indicating that a buffer is available at the port sending the R_RDY (i.e., that a new buffer is available for the link) and, in general, no other information.
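The credit accounting described above can be illustrated with a minimal Python sketch. The class name `B2BPort` and its method names are hypothetical and chosen only for illustration; they are not part of the FC standard or of any switch implementation. The sketch models only the counter behavior described in the preceding paragraph.

```python
class B2BPort:
    """Transmit side of one FC link, tracking B2B credit toward the peer port.

    Hypothetical illustration: names and structure are assumptions.
    """

    def __init__(self, initial_credits: int) -> None:
        if initial_credits < 0:
            raise ValueError("credits must be non-negative")
        self.max_credits = initial_credits  # buffers advertised by the peer
        self.credits = initial_credits      # buffers currently believed free

    def can_send(self) -> bool:
        # A frame may be transmitted only while B2B credit is greater than zero.
        return self.credits > 0

    def send_frame(self) -> bool:
        """Transmit one frame if credit is available; return whether it was sent."""
        if not self.can_send():
            return False
        self.credits -= 1  # each transmitted frame consumes one credit
        return True

    def receive_r_rdy(self) -> None:
        """The peer signalled that one buffer was freed; restore one credit."""
        if self.credits < self.max_credits:
            self.credits += 1
```

In this sketch, a port created with two credits can send two frames, is then blocked, and can send again only after an R_RDY is received.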
Virtual linking is a mechanism in certain FC switches where the interface credits are carved per virtual link (VL) and flow control is managed by Extended R_RDY (ER_RDY), which works at a VL level. The ER_RDY primitive is a Cisco® proprietary FCMAC IP implementation, which is used as an alternative to R_RDY. Cisco® is a registered trademark of Cisco Systems, Inc. In general, the ER_RDY primitive divides the physical link into multiple priority/virtual links (VLs). VLs are individually credited and flow controlled, and ER_RDY carries credit information on a per-VL basis.
A “slow drain device” is a device that is draining packets slower than its link speed (i.e., is in a slow drain condition). That is, a device in a slow drain condition does not accept frames at the rate generated by a source (i.e., the R_RDY signals are delayed in response to the frames). A “stuck device” is an example of a severe slow drain condition in which the device completely stops accepting frames from a source (i.e., the R_RDY signals are not returned in response to frames). A slow drain condition can lead to head-of-line (HOL) blocking when multiple flows (e.g., several good flows and a slow flow) share the same ISL. The slow flow could use all of the link credits, thus causing the good flows to slow down and thereby causing link underutilization. As used herein, a “slow drain device” (i.e., a device in a slow drain condition) includes both slow drain and stuck devices.
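As a rough illustration of per-VL crediting, the following Python sketch keeps an independent credit counter for each VL on a single physical link. The class `VirtualLinkPort`, its method names, and the VL numbering are assumptions made for this example; the actual ER_RDY encoding is not described here and is not modeled.

```python
from typing import Dict


class VirtualLinkPort:
    """One physical link whose credits are carved per virtual link (VL).

    Hypothetical illustration: names and VL numbering are assumptions.
    """

    def __init__(self, credits_per_vl: Dict[int, int]) -> None:
        self.limits = dict(credits_per_vl)   # credits carved for each VL
        self.credits = dict(credits_per_vl)  # credits currently available

    def send_frame(self, vl: int) -> bool:
        """Send one frame on the given VL if that VL still has credit."""
        if self.credits[vl] == 0:
            return False
        self.credits[vl] -= 1
        return True

    def receive_er_rdy(self, vl: int) -> None:
        """Return one credit to the VL named in the (modeled) ER_RDY."""
        if self.credits[vl] < self.limits[vl]:
            self.credits[vl] += 1
```

Because each VL is credited separately in this sketch, exhausting the credits of one VL does not prevent frames from being sent on another VL of the same link.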
A slow drain condition can arise as a result of a number of different issues, such as problems in the device Operating System (“OS”) or host bus adapters (“HBAs”), storage issues, switch misconfigurations (e.g., speed mismatches), application issues, etc. Many slow drain device conditions are due to devices being overwhelmed by large chunks of data being received from a storage device. This may be particularly prevalent in large SAN installations (e.g., 25-30 slow drain devices per day).
In the presence of slow drain devices, FC networks are likely to run out of switch packet buffers, resulting in switch port credit starvation and potential choking of Inter-Switch Links. Due to head-of-line blocking, an Inter-Switch Link running out of B2B credits results in traffic flows that are unrelated to the slow drain device being impacted. That is, in an FC network, the Inter-Switch Links (i.e., ISL ports) are shared to carry traffic for multiple flows with different source identifiers (SIDs) and destination identifiers (DIDs) (i.e., different SID-DID flows) and, as a result, slowness in one device in the network can impact other devices in the network, even if those devices do not communicate with the slow drain device.
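The head-of-line blocking effect can be sketched with a deliberately simplified Python simulation. It assumes two flows that take turns on one shared ISL, a stuck destination that never returns credits, and a healthy destination that returns each credit immediately; the function name and these scheduling assumptions are illustrative only and do not reflect actual switch arbitration.

```python
from typing import Dict


def simulate_shared_isl(isl_credits: int, rounds: int) -> Dict[str, int]:
    """Count frames delivered per flow on one ISL with a shared credit pool.

    Simplifying assumptions (not from the source): strict alternation between
    the two flows, a stuck device that never frees its buffers, and a good
    device that frees each buffer immediately.
    """
    credits = isl_credits
    delivered = {"good": 0, "slow": 0}
    for _ in range(rounds):
        for flow in ("slow", "good"):
            if credits == 0:
                continue  # no B2B credit left on the ISL: the frame must wait
            credits -= 1
            delivered[flow] += 1
            if flow == "good":
                credits += 1  # buffer released, credit returned
    return delivered
```

Under these assumptions, the good flow stops making progress as soon as frames for the stuck device have consumed every ISL credit, even though the good flow's own destination is healthy.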
For example, referring specifically to the arrangement of
A problem may arise where the target device 20(2) reaches a point where its ingress queue is filled, meaning that the target device 20(2) is unable to send R_RDY signals (i.e., T2's transmission B2B=0). In other words, the target device 20(2) enters a slow drain condition and, accordingly, is referred to as a slow drain device. As noted, the slowness of target device 20(2) can impact the traffic running between other devices in the FC fabric 50 which share the Inter-Switch Links 30(1) and 30(2). Presented herein are techniques to mitigate the effect that a slow drain device has on other devices in a network using priority routing for the flows from the slow drain device. As described further below, the techniques presented herein have two primary aspects, namely the propagation of information about the slow drain device across the FC fabric, and subsequently mitigating the impact of the slow drain device via priority routing.
Referring first to the propagation of slow drain information, certain FC switches include mechanisms (e.g., hardware and/or software mechanisms) for identifying slow drain devices. Such mechanisms may include, for example, tracking an amount of time spent waiting for credits (B2B=0) on a port with a configurable timeout threshold (e.g., 100 milliseconds (ms)). Employing this specific mechanism, once the wait time for a frame on a port exceeds the designated threshold, the connected device is deemed to be a slow drain device (or stuck). Once the slow drain device is identified, the techniques presented herein use, for example, the Name Server infrastructure to propagate this information to all other switches in the fabric.
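The timeout-based detection mechanism described above might be sketched as follows in Python. The class `CreditWaitMonitor` and its interface are hypothetical; timestamps are passed in explicitly so that the sketch does not depend on a real clock, and the 100 ms default mirrors the example threshold given above.

```python
from typing import Optional


class CreditWaitMonitor:
    """Flags a port whose B2B credit has stayed at zero longer than a threshold.

    Hypothetical illustration: the interface is an assumption.
    """

    def __init__(self, threshold_ms: float = 100.0) -> None:
        self.threshold_ms = threshold_ms
        self._zero_since_ms: Optional[float] = None  # when credit first hit zero

    def update(self, now_ms: float, b2b_credits: int) -> bool:
        """Record the current credit count; return True if the device is slow."""
        if b2b_credits > 0:
            self._zero_since_ms = None  # credit is available, stop the timer
            return False
        if self._zero_since_ms is None:
            self._zero_since_ms = now_ms  # start timing the wait for credit
        return (now_ms - self._zero_since_ms) > self.threshold_ms
```

The monitor resets whenever credit becomes available again, so only a continuous wait at B2B=0 that exceeds the threshold marks the attached device as a slow drain device.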
Slow drain device information may include, for example, a list of slow drain devices (i.e., one or more devices determined to be experiencing a slow drain condition) and/or a list of devices that have transitioned from a slow drain condition within a previous time period. There are, in general, two scenarios for distribution of slow drain device information, which include use of a Switch Internal Link Services (SW_ILS) command or use of the Name Server infrastructure. The SW_ILS command is used when a slow drain condition is detected and this condition has to be communicated to the fabric, whereas the Name Server distribution is utilized when a new switch joins the fabric (i.e., to inform the new switch of the existing slow drain condition). For example, when a new domain joins an FC fabric to which a slow drain device is already connected (i.e., a slow drain condition already exists in the fabric), the slow drain device information may be carried within the Get Entries based on Port Type (GE_PT) payload. Typically, GE_PT pulls the details of all the devices connected to a domain, but there are available bits in the device information field. The techniques presented herein may make use of these available bits in the device information field of the GE_PT payload to indicate that a particular device is experiencing a slow drain condition. As such, both the SW_ILS command and the Name Server mechanisms are utilized within the fabric for slow drain information, but each is used under different circumstances.
Once the information about the slow drain device is distributed to all the domains in the fabric, FC switches directly connected to devices zoned with (i.e., communicating with) the slow drain device are configured to install a priority static route (priority route) for the slow drain device. Similarly, whenever a device comes out of the slow drain condition, the FC switches directly connected to devices zoned with the slow drain device are informed using the same mechanism and are configured to remove/uninstall the priority route (i.e., communication is restored to normal).
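The use of a spare bit in a device information field to signal a slow drain condition can be illustrated with a small Python sketch. The bit position `SLOW_DRAIN_FLAG` and the helper names are purely hypothetical; the text above does not state which bits of the GE_PT device information field are available, and the real payload layout is not modeled.

```python
# Hypothetical bit position; the actual available bit is not specified above.
SLOW_DRAIN_FLAG = 0x01


def mark_slow_drain(device_info: int) -> int:
    """Return the device information value with the slow drain bit set."""
    return device_info | SLOW_DRAIN_FLAG


def clear_slow_drain(device_info: int) -> int:
    """Return the device information value with the slow drain bit cleared."""
    return device_info & ~SLOW_DRAIN_FLAG


def is_slow_drain(device_info: int) -> bool:
    """Report whether the slow drain bit is set, ignoring all other bits."""
    return bool(device_info & SLOW_DRAIN_FLAG)
```

Setting and clearing a single flag bit leaves the remaining bits of the field untouched, which is why an otherwise unused bit can carry the slow drain indication without disturbing existing device details.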
In certain arrangements, some of the switches in a fabric may not support zoning or other features. Therefore, in certain examples, the priority routes can be installed on all the supported switches in the fabric without applying any zoning rules. This enables congestion prevention wherever possible in case of mixed topologies.
More specifically, FC forwarding has a concept of priority for each route, which acts like a weight. Whenever multiple routes are available for the same destination, the priority route is used for forwarding and the other routes are ignored. Devices generally have an associated default route which is based on a Fabric Shortest Path First (FSPF) route computation. However, this route can be overridden with a priority route which, once installed, takes precedence over the predetermined FSPF routes. That is, the installed priority route will be selected over the predetermined FSPF routes.
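The precedence of an installed priority route over the default FSPF route can be sketched in Python as a simple selection among candidate routes. The `Route` fields, the convention that a larger priority value wins, and the function name `select_route` are all assumptions made for illustration and do not describe an actual forwarding table format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    destination: str    # destination identifier (e.g., a DID), as a string here
    out_interface: str  # egress interface for the route
    priority: int       # assumption: a larger value takes precedence
    source: str         # "fspf" for the default route, "priority" if installed


def select_route(routes: List[Route], destination: str) -> Optional[Route]:
    """Pick the highest-priority route for a destination, or None if absent."""
    candidates = [r for r in routes if r.destination == destination]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.priority)
```

In this sketch, uninstalling the priority route simply removes it from the list, after which the FSPF route is selected again for the same destination.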
In an FC fabric, routes have a rewrite adjacency field where the desired fields can be overwritten once the destination interface is derived. In accordance with the techniques presented herein, a rewrite adjacency for the priority route is provided, where the priority field is overwritten with the desired priority, which is later used to decide the Virtual Link (VL) for the flows destined to the slow drain device. This priority also overrides the access control list (ACL) driven priority if zone quality of service (QoS) is in place. As noted above, in certain examples in which the priority field is preserved over trunked ISL links, the priority route is installed only on the edge switch (i.e., the priority route is required only on the edge switches, and is not required on other hops along the path). However, in other examples where one or more switches do not preserve priority over trunked ISL links, the priority route is installed on all switches in the fabric.
As noted, when a device comes out of a slow drain condition, the priority route is removed so that routing is restored to the normal level (i.e., back to the FSPF selected route) and the traffic resumes on the FSPF route. Also as noted, certain FC switches utilize a mechanism where the interface credits are carved per VL and flow control is managed by ER_RDY. Using ER_RDY, a slow flow can be routed to a specified VL (called a low priority VL) such that other good/normal flows will not be affected by the slow device. That is, in accordance with examples presented herein, the slow drain device mitigation techniques allocate fewer buffers for a VL which will be used for slow drain flows. In this way, slow drain devices are starved more, but do not affect other flows. The installed priority route ensures that traffic toward the slow drain device flows through this lower priority VL.
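A rewrite adjacency that overwrites a frame's priority, followed by a priority-to-VL lookup, might look like the following Python sketch. The `Frame` and `RewriteAdjacency` types, the constant values, and the two-VL mapping are hypothetical; the actual field widths and VL assignments are not given above.

```python
from dataclasses import dataclass, replace

# Hypothetical values chosen only for this example.
SLOW_PRIORITY = 0  # priority written by the slow drain rewrite adjacency
SLOW_VL = 0        # low-priority VL reserved for slow drain flows
NORMAL_VL = 1      # VL used by all other priorities in this sketch


@dataclass(frozen=True)
class Frame:
    did: str       # destination identifier
    priority: int  # priority field carried with the frame


@dataclass(frozen=True)
class RewriteAdjacency:
    priority: int  # value that replaces the frame's priority field


def apply_rewrite(frame: Frame, adjacency: RewriteAdjacency) -> Frame:
    """Return a copy of the frame with its priority field overwritten."""
    return replace(frame, priority=adjacency.priority)


def select_vl(frame: Frame) -> int:
    """Map the (possibly rewritten) priority to a virtual link."""
    return SLOW_VL if frame.priority == SLOW_PRIORITY else NORMAL_VL
```

The rewritten priority replaces whatever priority the frame previously carried, so in this sketch any earlier classification is superseded before the VL is chosen.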
Method 60 begins at 62 where the FC switch 25(2) identifies the target device 20(2) as a slow drain device using, for example, one of the mechanisms described above. At 64, the FC switch 25(2) marks the target device 20(2) as “slow” and propagates this slow drain device information to other switches in the FC fabric (e.g., FC switch 25(1)). In one example, the FC switch 25(2) marks the target device 20(2) as slow by adding the device to a list of determined slow drain devices. The slow drain device information may then be propagated using one of the mechanisms described above and this propagation is generally represented in
At 66, the FC switch 25(2) polls the target device 20(2) to determine whether the target device 20(2) remains in a slow drain device condition. If the target device 20(2) remains in a slow drain device condition, the FC switch 25(2) will wait a period of time and re-poll the target device 20(2). This continues until the FC switch 25(2) determines that the target device 20(2) is no longer in a slow drain condition. In one example, the FC switch 25(2) polls the target device 20(2) as described above (e.g., tracking an amount of time spent waiting for credits (B2B=0) on a port with a configurable timeout threshold). However, the criteria (e.g., configurable timeout threshold) used to determine that the device has transitioned out of slow drain condition may be different (e.g., doubled) and can be tuned as per a selected configuration.
Once the FC switch 25(2) determines that the target device 20(2) is no longer in a slow drain condition, method 60 proceeds to 68 where the FC switch 25(2) marks the target device 20(2) as “normal.” The FC switch 25(2) then propagates additional slow drain device information to other switches in the FC fabric (e.g., to FC switch 25(1)) using the SW_ILS (e.g., SW_ILS 0x73 will carry the information of all the devices moving back to normal mode). SW_ILS 0x73 is a Cisco® vendor-specific SW_ILS command. This additional slow drain device information is represented in
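The flow of method 60 after a device has been identified as slow (mark and notify, poll until recovery, then mark normal and notify again) can be sketched in Python as follows. The function `monitor_slow_device` and its callback parameters are hypothetical; the detection check, the fabric notification, and the wait between polls are passed in as callables so that the sketch stays independent of any real switch software.

```python
from typing import Callable


def monitor_slow_device(
    device_id: str,
    still_slow: Callable[[str], bool],
    notify: Callable[[str, str], None],
    wait: Callable[[], None],
) -> int:
    """Track one slow drain device until it recovers; return the poll count.

    Hypothetical illustration: all callbacks are assumptions.
    """
    notify(device_id, "slow")  # step 64: mark as slow and propagate
    polls = 0
    while True:
        polls += 1
        if not still_slow(device_id):  # step 66: poll the device
            break
        wait()  # wait a period of time before re-polling
    notify(device_id, "normal")  # step 68: mark as normal and propagate
    return polls
```

Passing the poll criterion in as a callable also leaves room for the exit criterion to differ from the entry criterion, as the description of step 66 allows.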
Method 70 begins at 72 where the FC switch 25(1) receives slow drain device information (e.g., a slow drain device notification 52 or 54) from source FC switch 25(2). At 74, the FC switch 25(1) determines whether the slow drain device information indicates that an identified device has entered a slow drain device condition or that a previously-identified slow drain device has returned to normal operation (i.e., no longer slow draining). If it is determined at 74 that a previously-identified slow drain device has returned to normal operation, then FC switch 25(1) uninstalls a priority route that, as described further below, was previously prepared and installed for the identified device.
If, at 74, the FC switch 25(1) determines that the slow drain device information indicates that an identified device has entered a slow drain condition, then at 78 the FC switch 25(1) determines whether all switches support the ability to perform slow device mitigation. As noted elsewhere herein, some of the switches in a fabric may not run firmware that has this slow device mitigation feature and/or may not have the hardware support for this feature. If not all switches support the ability to perform slow device mitigation, then method 70 proceeds to 84, which is described in greater detail below. However, if all switches do support the ability to perform slow device mitigation, then a determination is made at 80 whether any directly attached computing device interacts with the slow drain device (i.e., with target device 20(2)). If no attached device interacts with the slow drain device, then the method 70 ends at 82 without performing any additional operations.
Returning to 80, if FC switch 25(1) determines that one or more devices directly attached to the FC switch interact with the slow drain device 20(2), then at 84 the FC switch 25(1) prepares a priority route for the slow drain device 20(2) in order to map the input/output flow to a low-priority VL. The FC switch 25(1) prepares a priority route by identifying a slow VL (i.e., a VL associated with the slow drain device 20(2)), allocating a rewrite adjacency entry, and passing this entry to the hardware. The operations at 84 are also executed when, as noted above, the FC switch 25(1) determines that not all switches support the ability to perform slow device mitigation. After preparation of the priority route at 84, the FC switch 25(1) installs the priority route and the method 70 ends at 82.
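The decision flow of method 70 at the receiving switch can be summarized in a short Python sketch. The `Action` enumeration, the function `handle_notification`, and its boolean parameters are hypothetical names introduced for illustration; the branches follow the description of steps 74, 78, 80, and 84 above.

```python
from enum import Enum


class Action(Enum):
    UNINSTALL_PRIORITY_ROUTE = "uninstall"
    INSTALL_PRIORITY_ROUTE = "install"
    NONE = "none"


def handle_notification(
    entered_slow_drain: bool,
    all_switches_support_mitigation: bool,
    attached_device_interacts_with_slow_device: bool,
) -> Action:
    """Decide what a receiving switch does with slow drain device information.

    Hypothetical illustration: parameter names are assumptions.
    """
    if not entered_slow_drain:
        # Step 74: the device returned to normal, so remove its priority route.
        return Action.UNINSTALL_PRIORITY_ROUTE
    if not all_switches_support_mitigation:
        # Step 78: mixed fabric, so proceed to 84 without the check at 80.
        return Action.INSTALL_PRIORITY_ROUTE
    if attached_device_interacts_with_slow_device:
        # Step 80: a directly attached device talks to the slow drain device.
        return Action.INSTALL_PRIORITY_ROUTE
    return Action.NONE  # nothing attached is affected; method ends at 82
```

Expressing the flow as a pure function of three inputs makes each branch easy to check against the description, including the mixed-fabric case in which the priority route is installed regardless of zoning.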
In summary,
The memory 145 may be read only memory (ROM), random access memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, electrical, optical, or other physical/tangible memory storage devices. Thus, in general, the memory 145 may comprise one or more tangible (non-transitory) computer readable storage media (e.g., a memory device) encoded with software comprising computer executable instructions and when the software is executed (by the processor 130) it is operable to perform the operations described herein. In other words, the FC switch 125 performs the operations described above in connection with
The FC switch hardware 135 comprises digital logic and other circuitry configured to perform the FC switching operations in an FC fabric. The FC switch hardware 135 may be implemented by one or more application specific integrated circuits (ASICs). The network interface(s) 140 include suitable FC interfaces, such as ports, for connection to an FC network and also to any other network for control/command functions associated with FC switch 125.
As noted above, the techniques presented herein include two primary aspects, namely propagation of slow drain device information through an FC fabric and programming the hardware to divert traffic destined to a slow drain device onto a priority route. In certain examples, the slow drain device information may be propagated using the Name Server infrastructure, which converges relatively quickly compared with other types of communications, such as Zone Server communications, particularly in large customer deployments where the zone configuration may be extensive. In addition, a slow drain device may be identified when a zone change is already in progress. As such, there could be a deadlock situation if multiple switches try to initiate the distribution of slow drain information. The use of the Name Server infrastructure to distribute slow drain device information is advantageous in such scenarios since the Name Server infrastructure has the ability to handle these events independently. In one example of the techniques presented herein, an FC switch can install one entry for a slow drain device independent of the number of devices with which it is zoned. The rewrite adjacency can be in an ACL, but the number of entries to be reprogrammed will be directly proportional to the number of members zoned with the slow drain device.
In one form, the techniques presented herein provide a computer-implemented method of, at a first Fibre Channel (FC) switch in an FC fabric comprising a plurality of FC switches, receiving an indication that a first computing device attached to the FC fabric has entered a slow drain condition; preparing, at the first FC switch, a priority route for packet flows directed to the first computing device; and installing the priority route at the first FC switch.
In another form, an apparatus is provided. The apparatus comprises one or more network interfaces configured for communication over a Fibre Channel (FC) fabric comprising a plurality of FC switches, a memory, and a processor. The processor is configured to determine, based on a received indication, that a first computing device attached to the FC fabric has entered a slow drain condition, prepare a priority route for packet flows directed to the first computing device, and install the priority route at the apparatus.
In another form, one or more non-transitory computer readable storage media are provided. The computer readable storage media are encoded with software comprising computer executable instructions that, when executed, are operable to: at a first Fibre Channel (FC) switch in an FC fabric comprising a plurality of FC switches, receive an indication that a first computing device attached to the FC fabric has entered a slow drain condition; prepare, at the first FC switch, a priority route for packet flows directed to the first computing device; and install the priority route at the first FC switch.
The above description is intended by way of example only. Although the techniques are illustrated and described herein as embodied in one or more specific examples, it is nevertheless not intended to be limited to the details shown, since various modifications and structural changes may be made within the scope and range of equivalents of the claims.
Number | Date | Country | |
---|---|---|---|
20180063004 A1 | Mar 2018 | US |