The present invention relates generally to communication networks, and particularly to methods and systems for updating of network device software.
Network devices used in communication networks, such as packet switches and routers, commonly comprise processors or other programmable devices that run software and/or firmware. The software and/or firmware in network devices may, inter alia, control hardware devices such as Field-Programmable Gate Arrays (FPGAs) and Application-Specific Integrated Circuits (ASICs). It is highly desirable to update the software or firmware without having to disrupt the normal operation of the network device, at least in the data plane. A software or firmware update of this sort is referred to as “In-Service Software Update” (ISSU).
An embodiment that is described herein provides a controller including one or more ports and a processor. The one or more ports are to communicate with a network that includes multiple network devices. The processor is to receive, from a network device in the network, a request to perform a software update in the network device, to evaluate a permission condition in response to the request, to send to the network device a response granting the request when the permission condition is met, and to at least temporarily deny the request when the permission condition is not met.
In some embodiments, the network is a Software-Defined Network (SDN). In an embodiment, the network is an InfiniBand (IB) network, and the controller is a Subnet Manager (SM) in the IB network. In an embodiment, the software update is an In-Service Software Update (ISSU). In some embodiments, the processor is to at least temporarily deny the request by (i) sending to the network device a notification denying the request, or (ii) temporarily refraining from granting the request, thereby deferring granting the request to a later time.
In a disclosed embodiment, in accordance with the permission condition, the processor is to temporarily deny the request when a temporary interruption in the control-plane operation is intolerable, and to grant the request when the temporary interruption in the control-plane operation is tolerable. In another embodiment, in accordance with the permission condition, the processor is to temporarily deny the request upon finding that the network device provides backup to another network device.
In yet another embodiment, in accordance with the permission condition, the processor is to decide whether or not to grant the request based at least on a state of at least one other network device in the network. In still another embodiment, in accordance with the permission condition, the processor is to temporarily deny the request when more than a specified number of other network devices in the network are currently undergoing software updates. In an example embodiment, in accordance with the permission condition, the processor is to temporarily deny the request when a bandwidth degradation, over the network or over a region of the network, exceeds a permitted bandwidth degradation.
There is additionally provided, in accordance with an embodiment that is described herein, a network device including multiple ports to send and receive packets over a network, packet processing circuitry to process the packets, and a processor. The processor is to receive an instruction to perform a software update in the network device, to send, in response to the instruction, a request to a controller of the network, requesting permission to perform the software update, and to defer performing the software update until receiving a response from the controller granting the request.
In some embodiments, the processor is to receive the instruction to perform the software update from a Network Management System (NMS) that is separate from the SDN controller.
There is further provided, in accordance with an embodiment that is described herein, a method for software updating. The method includes, in a controller, which controls a network that includes multiple network devices, receiving a request, from a network device in the network, to perform a software update in the network device. A permission condition is evaluated in the controller in response to the request. A response granting the request is sent from the controller to the network device when the permission condition is met. The request is at least temporarily denied when the permission condition is not met.
There is also provided, in accordance with an embodiment that is described herein, a method for software updating. The method includes, in a network device that sends and receives packets over a network and processes the packets, receiving an instruction to perform a software update in the network device. A request is sent to a controller of the network, in response to the instruction, requesting permission to perform the software update. Performing the software update is deferred until receiving a response from the controller granting the request.
The present invention will be more fully understood from the following detailed description of the embodiments thereof, taken together with the drawings in which:
Overview
Embodiments of the present invention that are described herein provide improved techniques for performing and controlling software updating in network devices. The embodiments described herein refer mainly to Software-Defined Networks (SDNs), but the disclosed techniques can be used in various other network types. The embodiments described herein refer mainly to In-Service Software Updating (ISSU) in InfiniBand™ (IB) networks, by way of example. The disclosed techniques, however, are also useful for controlling any other type of software update in any other type of network.
In the present context, the term “software” refers to any type of programmable code that runs in a network device, such as software running on a processor or firmware programmed in a programmable logic device. The term “software update”, in its various grammatical forms, refers to any type of change in the software, including, for example, upgrading or downgrading of software versions, installation of software patches, and the like. The term “In-Service Software Update” (ISSU) refers to a software update that does not involve resetting the network device or otherwise disrupting the network device's data-plane operations. ISSU typically does allow a temporary disruption of control-plane operations.
A typical SDN is controlled by an SDN controller and managed by a Network Management System (NMS). The operations handled by the SDN controller are referred to as “control-plane” operations, and operations handled by the NMS are referred to as “management-plane” operations. Software updates are conventionally considered management-plane operations and are therefore managed by the NMS.
When using the traditional “division of labor” between the NMS and the SDN controller, the information available to the NMS may not be sufficient for optimally deciding when to update software (and/or when to avoid updating software) in a given network device. Time periods in which software updating is best avoided include, for example, periods in which the network device serves as a backup for another network device, or periods in which multiple other network devices are already undergoing software update. Information of this sort is typically available to the SDN controller. The SDN controller, however, is conventionally not involved in software updating.
In some embodiments that are described herein, both the NMS and the SDN controller participate in deciding when to update software in a given network device. In a disclosed embodiment, upon receiving an instruction from the NMS to perform a software update, the network device does not immediately perform the update as instructed. Instead, the network device sends a request to the SDN controller, requesting permission to perform the software update initiated by the NMS. The SDN controller may grant or deny the request, at least temporarily. Denying the request can be performed by sending an explicit denial notification to the network device, or by refraining from granting the request, thereby effectively deferring the grant to a later time.
In various embodiments, the SDN controller may evaluate various conditions (referred to herein as “permission conditions”) for deciding whether and/or when to permit a network device to perform software update. Several non-limiting examples of permission conditions are described herein.
The techniques described herein exploit the information available to the SDN controller to reduce the performance degradation caused by software updates in network devices. At the same time, the disclosed techniques maintain the traditional division of responsibilities between the SDN controller and the NMS, and are therefore simple to implement in existing and emerging SDN protocols.
System Description
Network 28 comprises multiple network devices 32, in the present example InfiniBand (IB) switches. As noted above, IB is regarded herein as a non-limiting example of a Software-Defined Network (SDN). In alternative embodiments, network 28 and network devices 32 may operate in accordance with any other suitable SDN protocol, e.g., a protocol defined over Ethernet. Network devices 32 may alternatively comprise routers, bridges, gateways, or any other suitable type of network devices.
A given network device 32 typically comprises multiple ports 36, packet processing circuitry 40, a processor 44, and a memory 48. (The internal structure is depicted in the figure only for one of network devices 32, for the sake of clarity. The other network devices typically have a similar internal structure.) Ports 36 are used for sending and receiving packets to and from network 28. Packet processing circuitry 40 processes the packets, e.g., forwards each incoming packet to a suitable egress port. Processor 44 configures, manages and controls the operation of network device 32.
Memory 48 is used for storing any relevant information used by the network device. Among other data, memory 48 stores software code (SW) 52 of processor 44. Among other tasks, processor 44 updates SW 52 using ISSU techniques that are disclosed herein. More generally, the software being updated may comprise, for example, any software and/or firmware running in network device 32, e.g., in processor 44 and/or processing circuitry 40.
System 20 further comprises an IB Subnet Manager (SM) 56, which controls the operation of network devices 32 and of network 28 in general. As noted above, SM 56 is regarded herein as a non-limiting example of an SDN controller. In the present example, SM 56 comprises one or more ports 60 for communicating with network 28 (e.g., with network devices 32), and a processor 64 that carries out the various computing tasks of SM 56. Among other tasks, processor 64 participates in the disclosed ISSU processes, as will be described in detail below.
System 20 additionally comprises a Network Management System (NMS) 68. NMS 68 is separate from SM 56 and has separate responsibilities and tasks. Among other tasks, NMS 68 initiates ISSU processes in network devices 32, and provides the network devices with the software to be updated. NMS 68 may use any suitable criterion or policy for choosing which network devices to update using ISSU and when. In one non-limiting example, the NMS may identify (e.g., by communicating with other management systems that implement service delivery) a group of network devices, which serve a certain job or customer that is currently idle. The NMS may prefer to perform ISSU in this group of network devices.
The configurations of system 20, including the internal configurations of network devices 32 and SM 56, as shown in
The various elements of system 20, including network devices 32 and SM 56, may be implemented in hardware, e.g., in one or more Application-Specific Integrated Circuits (ASICs) or FPGAs, in software, or using a combination of hardware and software elements. In some embodiments, processor 44 and/or processor 64 may be implemented, in part or in full, using one or more general-purpose processors, which are programmed in software to carry out the functions described herein. The software may be downloaded to any of the processors in electronic form, over a network, for example, or it may, alternatively or additionally, be provided and/or stored on non-transitory tangible media, such as magnetic, optical, or electronic memory.
ISSU Controlled by SM
The method begins with processor 44 receiving (e.g., via one of ports 36) an instruction from NMS 68 to perform ISSU, at an instruction reception stage 80. The instruction may be accompanied by the actual updated software version (e.g., version of SW 52) to be installed. Alternatively, the updated software version may be downloaded to network device 32 in advance.
At a requesting stage 84, processor 44 sends (e.g., via one of ports 36) a request to SM 56, requesting permission to perform the ISSU. At a grant checking stage 88, processor 44 checks whether the request is granted or not. Processor 44 does not proceed with the ISSU process until receiving a grant from SM 56. Once the request is granted by SM 56 (e.g., via one of ports 36), processor 44 updates the software as instructed, at an ISSU stage 92.
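The device-side flow of stages 80 through 92 can be illustrated with a minimal sketch. This is not the implementation of processor 44; the class `DeviceUpdater`, the `Reply` values, and the two callables are hypothetical names introduced only for illustration. The sketch shows the one property the text requires: the update is applied only after an explicit grant, and both an explicit denial and the absence of a response leave the update pending.

```python
from enum import Enum, auto


class Reply(Enum):
    """Hypothetical controller replies as seen by the network device."""
    GRANT = auto()
    DENY = auto()   # explicit denial: the update is on hold
    NONE = auto()   # no response yet (implicit denial)


class DeviceUpdater:
    """Illustrative device-side logic for stages 80-92 (names are assumptions)."""

    def __init__(self, send_request, apply_update):
        self.send_request = send_request  # callable: sends a permission request to the controller
        self.apply_update = apply_update  # callable: performs the actual software update
        self.pending = False
        self.updated = False

    def on_instruction(self):
        # Stage 80 then stage 84: do not update yet, ask the controller first.
        self.pending = True
        self.send_request()

    def on_reply(self, reply):
        # Stage 88: only a grant releases the update (stage 92).
        if self.pending and reply is Reply.GRANT:
            self.apply_update()
            self.pending = False
            self.updated = True
```

In this sketch a denial does not clear the pending state, so a later grant still triggers the update, which matches the "temporarily on hold" behavior described for the controller.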
The method begins with processor 64 receiving (via a port 60) a request from a given network device 32 to perform ISSU, at a request reception stage 100. At a permission checking stage 104, processor 64 evaluates a permission condition for deciding whether to permit or (at least temporarily) deny the request.
In various embodiments, processor 64 may evaluate various permission conditions. Some permission conditions may depend only on the state of the requesting network device. Other permission conditions may depend on the states of one or more other network devices 32 in network 28. Several non-limiting examples of permission conditions include the following:
Additionally or alternatively, processor 64 may evaluate any other suitable permission condition.
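As a hedged illustration, three of the example conditions named earlier in this description (the requesting device serving as backup for another device, too many other devices already being updated, and bandwidth degradation above a permitted level) can be combined into a single predicate. The `NetworkView` structure, its field names, and the choice to express degradation as a fraction are assumptions made for this sketch only; the description does not prescribe how the controller represents this state.

```python
from dataclasses import dataclass


@dataclass
class NetworkView:
    """Hypothetical snapshot of the network state known to the controller."""
    backup_providers: frozenset   # devices currently serving as backup for another device
    updating: frozenset           # devices currently undergoing a software update
    max_other_updates: int        # specified number of other devices allowed to be updating
    bandwidth_degradation: float  # current degradation, as a fraction (assumed unit)
    permitted_degradation: float  # permitted degradation, same unit


def permission_met(device, view):
    """Return True when none of the example denial conditions holds."""
    # Deny while the requesting device provides backup to another device.
    if device in view.backup_providers:
        return False
    # Deny when more than the specified number of other devices are updating.
    if len(view.updating - {device}) > view.max_other_updates:
        return False
    # Deny when bandwidth degradation exceeds the permitted degradation.
    if view.bandwidth_degradation > view.permitted_degradation:
        return False
    return True
```

Evaluating the conditions in sequence and denying on the first match is a design choice of the sketch; any other combination of conditions could be substituted.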
At a permission checking stage 108, processor 64 checks whether the requested ISSU should be permitted in accordance with the applicable permission condition. If the permission condition indicates that permission is to be granted, processor 64 sends (via a port 60) a grant notification to the requesting network device 32, at a granting stage 112.
If the permission condition indicates that permission is to be denied, processor 64 denies the request, at a denial stage 116. In various embodiments, denial may be explicit or implicit. In an explicit denial, processor 64 sends (via a port 60) a denial notification, informing the requesting network device 32 that ISSU is temporarily on hold. In an implicit denial, processor 64 does not send any response to the requesting network device, thereby forcing the network device to put the ISSU on hold until receiving a grant.
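The controller-side flow of stages 100 through 116 can be sketched as follows. The class `Controller`, the string messages, the `deferred` queue, and the `reevaluate` method are hypothetical; in particular, the description states only that a denied request is deferred to a later time, so the re-evaluation mechanism shown here is an assumption about how that deferral might be realized.

```python
from collections import deque


class Controller:
    """Illustrative controller-side handling of update requests (names are assumptions)."""

    def __init__(self, condition, send, explicit_denial=True):
        self.condition = condition            # callable(device) -> bool, the permission condition
        self.send = send                      # callable(device, message), sends via a port
        self.explicit_denial = explicit_denial
        self.deferred = deque()               # denied requests awaiting a later grant

    def on_request(self, device):
        if self.condition(device):
            self.send(device, "GRANT")        # granting stage
            return
        # Denial stage: explicit denial sends a notification,
        # implicit denial sends nothing at all.
        if self.explicit_denial:
            self.send(device, "DENY")
        if device not in self.deferred:
            self.deferred.append(device)

    def reevaluate(self):
        """Retry each deferred request once, e.g., after the network state changes."""
        for _ in range(len(self.deferred)):
            device = self.deferred.popleft()
            if self.condition(device):
                self.send(device, "GRANT")
            else:
                self.deferred.append(device)
```

With `explicit_denial=False` the requesting device receives no message until the grant arrives, which corresponds to the implicit denial described above.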
The flows of
In some embodiments, a software update needs to be performed across the entire system 20, including SM 56. In these embodiments, the software of SM 56 (e.g., software of processor 64) is updated first, and only then are the software updates initiated in network devices 32.
In some embodiments, the functionality of SM 56 is hosted on one of network devices 32. To perform ISSU on the hosting network device, SM 56 is typically relocated first (e.g., to another network device, using a suitable SDN High-Availability (HA) process). Only then is ISSU performed in the network device.
Typically, a network device 32 that performs ISSU notifies SM 56 upon beginning and upon completing the ISSU process. SM 56 typically maintains a list of “excluded network devices”, i.e., network devices that are currently undergoing ISSU. The network devices on this list are excluded from processes such as re-routing, construction of multicast trees, traffic optimization operations such as traffic reduction or aggregation, etc. Such operations are addressed, for example, in U.S. Pat. Nos. 10,284,383, 10,419,329 and 11,252,027 and U.S. Patent Application Publication 2020/0106828. After ISSU is completed, SM 56 removes the network device from the “excluded” list. At this point SM 56 may also update the configuration of the network device in question, e.g., update new network device capabilities.
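The bookkeeping of the “excluded” list can be sketched in a few lines. The class and method names below are hypothetical and serve only to illustrate the begin and complete notifications and the filtering of devices for control-plane processes such as re-routing.

```python
class ExcludedList:
    """Illustrative tracking of devices currently undergoing ISSU (names are assumptions)."""

    def __init__(self):
        self._excluded = set()

    def on_issu_begin(self, device):
        # The device notified the controller that its ISSU has begun.
        self._excluded.add(device)

    def on_issu_complete(self, device):
        # The device notified the controller that its ISSU has completed.
        self._excluded.discard(device)

    def eligible(self, devices):
        """Return the devices that may take part in re-routing and similar processes."""
        return [d for d in devices if d not in self._excluded]
```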
Although the embodiments described herein mainly address software updating in data centers and HPCs, the methods and systems described herein can also be used in other networks, SDNs or otherwise, such as in mobile networks and Virtual Private Networks (VPNs).
It will thus be appreciated that the embodiments described above are cited by way of example, and that the present invention is not limited to what has been particularly shown and described hereinabove. Rather, the scope of the present invention includes both combinations and sub-combinations of the various features described hereinabove, as well as variations and modifications thereof which would occur to persons skilled in the art upon reading the foregoing description and which are not disclosed in the prior art. Documents incorporated by reference in the present patent application are to be considered an integral part of the application except that to the extent any terms are defined in these incorporated documents in a manner that conflicts with the definitions made explicitly or implicitly in the present specification, only the definitions in the present specification should be considered.
| Number | Name | Date | Kind |
|---|---|---|---|
| 6070012 | Eitner et al. | May 2000 | A |
| 6397385 | Kravitz | May 2002 | B1 |
| 6535924 | Kwok et al. | Mar 2003 | B1 |
| 6640334 | Rasmussen | Oct 2003 | B1 |
| 7609617 | Appanna et al. | Oct 2009 | B2 |
| 7661025 | Banks et al. | Feb 2010 | B2 |
| 7774438 | Zilbershtein | Aug 2010 | B2 |
| 8027248 | Balasubramanian | Sep 2011 | B2 |
| 8068409 | Kumaresan | Nov 2011 | B2 |
| 8190720 | Yellai et al. | May 2012 | B1 |
| 8194642 | Rosenberg et al. | Jun 2012 | B2 |
| 8219794 | Wang et al. | Jul 2012 | B1 |
| 8346913 | Gao et al. | Jan 2013 | B2 |
| 8364843 | Hanselmann | Jan 2013 | B2 |
| 8402453 | Gupta et al. | Mar 2013 | B2 |
| 8499060 | Narayanan | Jul 2013 | B2 |
| 8570877 | Bayar | Oct 2013 | B1 |
| 8627137 | Vaidya | Jan 2014 | B1 |
| 8705349 | Bloch | Apr 2014 | B2 |
| 8745614 | Banerjee et al. | Jun 2014 | B2 |
| 8782632 | Chigurapati et al. | Jul 2014 | B1 |
| 8799422 | Qu | Aug 2014 | B1 |
| 8943489 | Qu | Jan 2015 | B1 |
| 9021459 | Qu | Apr 2015 | B1 |
| 9030947 | Xu | May 2015 | B2 |
| 9049148 | Singh | Jun 2015 | B1 |
| 9088584 | Feng et al. | Jul 2015 | B2 |
| 9131014 | Kasat | Sep 2015 | B2 |
| 9177122 | Trier | Nov 2015 | B1 |
| 9182972 | Hanselmann | Nov 2015 | B2 |
| 9246702 | Sharma | Jan 2016 | B1 |
| 9455894 | Neelam | Sep 2016 | B1 |
| 9491107 | Scudder | Nov 2016 | B1 |
| 9769017 | Jose | Sep 2017 | B1 |
| 9846658 | Tatar | Dec 2017 | B2 |
| 9870219 | Manthiramoorthy et al. | Jan 2018 | B1 |
| 9935834 | Baveja | Apr 2018 | B1 |
| 10003498 | Shevenell | Jun 2018 | B2 |
| 10033631 | Baveja | Jul 2018 | B1 |
| 10079725 | Baveja | Sep 2018 | B1 |
| 10083026 | Venkata | Sep 2018 | B1 |
| 10084895 | Kasat | Sep 2018 | B2 |
| 10103995 | Baveja | Oct 2018 | B1 |
| 10164829 | Watson | Dec 2018 | B1 |
| 10200274 | Suryanarayana | Feb 2019 | B1 |
| 10284383 | Bloch et al. | May 2019 | B2 |
| 10419329 | Levi et al. | Sep 2019 | B2 |
| 10452386 | Kulchytsky et al. | Oct 2019 | B1 |
| 10608893 | Di Martino | Mar 2020 | B2 |
| 10721333 | Spear | Jul 2020 | B2 |
| 10764140 | Ozawa | Sep 2020 | B2 |
| 10824501 | Itkin et al. | Nov 2020 | B2 |
| 10838711 | Haramaty et al. | Nov 2020 | B2 |
| 10884728 | A | Jan 2021 | B2 |
| 10911508 | Jones | Feb 2021 | B2 |
| 10972402 | Akash | Apr 2021 | B1 |
| 10984107 | Itkin | Apr 2021 | B2 |
| 11012731 | Jones | May 2021 | B2 |
| 11082317 | Zhang | Aug 2021 | B2 |
| 11153194 | Roberts | Oct 2021 | B2 |
| 11252027 | Ben-Moshe et al. | Feb 2022 | B2 |
| 11321077 | Sakthikumar | May 2022 | B1 |
| 11405272 | Khan | Aug 2022 | B1 |
| 11489724 | Deshmukh | Nov 2022 | B1 |
| 11570116 | Seth | Jan 2023 | B1 |
| 11640291 | A | May 2023 | B2 |
| 11741232 | Sfadia et al. | Aug 2023 | B2 |
| 11778012 | Jones | Oct 2023 | B2 |
| 11792069 | Deshmukh | Oct 2023 | B2 |
| 11900096 | Mahishi | Feb 2024 | B2 |
| 11922162 | A | Mar 2024 | B2 |
| 11962507 | Seth | Apr 2024 | B1 |
| 11973648 | Mahishi | Apr 2024 | B2 |
| 11979286 | Koundinya | May 2024 | B1 |
| 12001835 | Rojas Fonseca | Jun 2024 | B2 |
| 12020019 | Rojas Fonseca | Jun 2024 | B2 |
| 20020092008 | Kehne et al. | Jul 2002 | A1 |
| 20030028800 | Dayan et al. | Feb 2003 | A1 |
| 20030188176 | Abbondanzio et al. | Oct 2003 | A1 |
| 20040024860 | Sato et al. | Feb 2004 | A1 |
| 20040042547 | Coleman | Mar 2004 | A1 |
| 20040083476 | Zhou et al. | Apr 2004 | A1 |
| 20040131115 | Burgess et al. | Jul 2004 | A1 |
| 20050021968 | Zimmer et al. | Jan 2005 | A1 |
| 20050114846 | Banks et al. | May 2005 | A1 |
| 20050114894 | Hoerl | May 2005 | A1 |
| 20050125519 | Yang et al. | Jun 2005 | A1 |
| 20060233182 | Appanna et al. | Oct 2006 | A1 |
| 20070174685 | Banks et al. | Jul 2007 | A1 |
| 20070179957 | Gibson | Aug 2007 | A1 |
| 20070183493 | Kimpe | Aug 2007 | A1 |
| 20070192610 | Chun et al. | Aug 2007 | A1 |
| 20070300207 | Booth et al. | Dec 2007 | A1 |
| 20080126541 | Rosenberg et al. | May 2008 | A1 |
| 20080165952 | Smith et al. | Jul 2008 | A1 |
| 20080195693 | Gao et al. | Aug 2008 | A1 |
| 20090063108 | De Atley et al. | Mar 2009 | A1 |
| 20090089774 | Lynch et al. | Apr 2009 | A1 |
| 20090199049 | Yorimitsu | Aug 2009 | A1 |
| 20100058306 | Liles et al. | Mar 2010 | A1 |
| 20100199272 | Mahajan et al. | Aug 2010 | A1 |
| 20120072734 | Wishman et al. | Mar 2012 | A1 |
| 20120072893 | Gupta et al. | Mar 2012 | A1 |
| 20120166781 | De Cesare et al. | Jun 2012 | A1 |
| 20120210115 | Park et al. | Aug 2012 | A1 |
| 20120291021 | Banerjee et al. | Nov 2012 | A1 |
| 20130024677 | Smith et al. | Jan 2013 | A1 |
| 20130036298 | De Atley et al. | Feb 2013 | A1 |
| 20130047031 | Tabone et al. | Feb 2013 | A1 |
| 20130145359 | Hanselmann | Jun 2013 | A1 |
| 20130155902 | Feng et al. | Jun 2013 | A1 |
| 20130219156 | Sears | Aug 2013 | A1 |
| 20130254906 | Kessler et al. | Sep 2013 | A1 |
| 20130262612 | Langas et al. | Oct 2013 | A1 |
| 20140047174 | Sakthikumar et al. | Feb 2014 | A1 |
| 20140189673 | Stenfort et al. | Jul 2014 | A1 |
| 20140317350 | Langas et al. | Oct 2014 | A1 |
| 20150058979 | Peeters et al. | Feb 2015 | A1 |
| 20160266894 | Panicker et al. | Sep 2016 | A1 |
| 20170063539 | Balakrishnan et al. | Mar 2017 | A1 |
| 20170147356 | Kotary et al. | May 2017 | A1 |
| 20170161483 | Li et al. | Jun 2017 | A1 |
| 20170168803 | Katyar et al. | Jun 2017 | A1 |
| 20170346631 | De Atley et al. | Nov 2017 | A1 |
| 20180067800 | Gusev et al. | Mar 2018 | A1 |
| 20200026427 | Peleg et al. | Jan 2020 | A1 |
| 20200106828 | Elias et al. | Apr 2020 | A1 |
| 20200257521 | Jayakumar et al. | Aug 2020 | A1 |
| 20200310784 | Krishnan | Oct 2020 | A1 |
| 20200326925 | Nachimuthu et al. | Oct 2020 | A1 |
| 20210081365 | Conley | Mar 2021 | A1 |
| 20210211281 | Park et al. | Jul 2021 | A1 |
| 20210240489 | Xie et al. | Aug 2021 | A1 |
| 20220156377 | Xie et al. | May 2022 | A1 |
| 20220171649 | Green | Jun 2022 | A1 |
| 20220182433 | Jones | Jun 2022 | A1 |
| 20230075108 | Subramaniam | Mar 2023 | A1 |
| Number | Date | Country |
|---|---|---|
| 3176723 | Jun 2017 | EP |
| Entry |
|---|
| U.S. Appl. No. 18/349,147 Office Action dated Feb. 23, 2024. |
| PCI Express® Base Specification, Revision 4.0, Version 0.3 , pp. 1-1053, Feb. 19, 2014. |
| Unified Extensible Firmware Interface (UEFI) Specification , Version 2.7—Errata A , Chapter 31, pp. 1765-1798, Aug. 2017. |
| Implementation Guidance for FIPS 140-2 and the Cryptographic Module Validation Program, National Institute of Standards and Technology Communications Security Establishment, pp. 1-237, Mar. 28, 2003. |
| FIPS PUB 140-2—“Security Requirements for Cryptographic Modules”, pp. 1-69, May 25, 2001. |
| PKCS#1—Cryptography Standard, Version 2.2, published by RSA Laboratories , pp. 1-63, Oct. 27, 2012. |
| FIPS PUB 180-4—“Secure Hash Standard (SHS)”, pp. 1-36, Aug. 2015. |
| FIPS PUB 198-1—“The Keyed-Hash Message Authentication Code (HMAC)”, pp. 1-13, Jul. 2008. |
| Wikipedia, “Firmware” , pp. 1-6, Jul. 23, 2019. |
| Tremaine et al., “Pinnacle: IBM MXT in a memory controller chip,” IEEE Micro, vol. 21, No. 2, pp. 56-68, Mar.-Apr. 2001. |
| Brocade, “Network OS 7.0.1 for Brocade VDX”, Release Notes v4.0, pp. 1-199, Aug. 24, 2016. |
| Anonimous Authors, “Method of Verifying Dynamic Firmware Update Prior to Promotion,” IP.com Electronic Publication, pp. 1-5, Sep. 10, 2013. |
| Sfadia et al., U.S. Appl. No. 18/349,147, filed Jul. 9, 2023. |
| U.S. Appl. No. 18/349,147 Office Action dated May 24, 2024. |
| Number | Date | Country |
|---|---|---|
| 20250106211 A1 | Mar 2025 | US |