Some Information Technology departments in corporations have started building their computer infrastructure to be, as much as possible, defined by software. This software-defined infrastructure sometimes relies on a hyperconverged infrastructure (HCI) where different functional components are integrated into a single device. One aspect of an HCI is that components of hardware may be virtualized into software-defined and logically isolated representations of computing, storage, and networking for a computer hardware infrastructure. HCI and virtualization of hardware resources may allow the allocation of computing resources to be flexible. For example, configuration changes may be applied to the infrastructure and the underlying hardware simply adapts to a new software-implemented configuration. HCI may further be used by some corporations to implement a virtualized computer by completely defining the computer's capability specification in software. Each virtualized computer (e.g., defined by software) may then utilize a portion of one or more physical computers (e.g., the underlying hardware). One recognized result of virtualization is that physical computing, storage, and network capacity may be more efficiently utilized across an organization.
NVM Express (NVMe) is a data transfer protocol typically used to communicate with Solid-State Drives (SSDs) over a Peripheral Component Interconnect Express (PCIe) communication bus. There are many different types of data transport protocols that exist for different uses within computer systems. Each different transport protocol may exhibit different characteristics with respect to speed and performance and therefore each protocol may be applicable for different uses. NVMe is an example of a data protocol that may be used to enable high-speed data transfer between a host computer system and an SSD. NVMe is commonly used in computers that desire high-performance read and write operations to an SSD. Utilizing NVMe disks capable of supporting high-performance read and write within a software-defined infrastructure further utilizing HCI hardware may represent a useful and adaptable configuration for infrastructure networks.
A specification has been developed for running NVMe over fabrics (NVMe-oF). One goal of this specification was extending NVMe onto fabrics such as Ethernet, Fibre Channel, and InfiniBand or any other suitable storage fabric technology. Access to SSD disks over network fabrics via NVMe-oF may allow software-defined storage capacity (e.g., portions of a larger hardware storage capacity) to scale for access. This scaling for access may: a) allow access to a large number of NVMe devices; and b) extend a physical distance between devices (e.g., within a datacenter). Scaling may include increasing distances over which NVMe storage devices may be accessed by another computing device. Storage protocols are typically lossless protocols because of the nature of storage goals. If a protocol used for storage is lossy (lossy is the opposite of lossless), proper storage of data is likely going to exhibit unacceptable slowness (e.g., due to packet transmission retries) or even worse may present corruption (e.g., data inaccuracies) and therefore not be useable within a real-world computer environment. NVMe-oF traffic may be used to provide storage for other network devices and thus rely on configuration to establish tunnels or other communication paths between remote devices and network storage devices. This configuration may allocate portions of an HCI infrastructure device to be “assigned” to a remote device. Configurations may change over time as more remote devices come on-line and other remote devices release resources that are no longer needed (e.g., because of an application termination, failure of a remote device, or other reasons). In another example, some remote devices may simply desire to augment storage allocation to increase their overall storage capacity.
The present disclosure may be better understood from the following detailed description when read with the accompanying Figures. It is emphasized that, in accordance with standard practice in the industry, various features are not drawn to scale. In fact, the dimensions or locations of functional attributes may be relocated or combined based on design, security, performance, or other factors known in the art of computer systems. Further, order of processing may be altered for some functions, both internally and with respect to each other. That is, some functions may not require serial processing and therefore may be performed in an order different than shown or possibly in parallel with each other. For a detailed description of various examples, reference will now be made to the accompanying drawings, in which:
Illustrative examples of the subject matter claimed below will now be disclosed. In the interest of clarity, not all features of an actual implementation are described for every example implementation in this specification. It will be appreciated that in the development of any such actual example, numerous implementation-specific decisions may be made to achieve the developer's specific goals, such as compliance with system-related and business-related constraints, which will vary from one implementation to another. Moreover, it will be appreciated that such a development effort, even if complex and time-consuming, would be a routine undertaking for those of ordinary skill in the art having the benefit of this disclosure.
NVM Express (NVMe) over fabric (NVMe-oF) data access packets are used for lossless communication between a remote device and a storage device. Disclosed techniques identify, through automatic discovery, that a remote device has an access path to the storage device that is not yet configured for that remote device. Based on the discovery, software within a switch may automatically configure one or both of a virtual local area network connectivity or a tunnel over Internet protocol (IP) to provide access for the remote device to communicate to the storage device. In general, all necessary configuration to allow a “plug-n-play” allocation of resources may be automatically performed between a client application and the remote device requesting (or de-allocating) storage resources. Historically, all switch configurations to support NVMe-oF storage devices were at the direction of a system administrator and performed manually. Accordingly, disclosed techniques represent an improvement to the art of system administration of network storage, in part, by allowing automated configuration to replace operations that were previously performed manually.
As briefly mentioned above, this disclosure describes an improvement over the previously provided methods that may be dependent upon frequent configuration changes (sometimes manual) to the network infrastructure devices. According to disclosed implementations, NVMe-oF network communication paths may be automatically established to allow access to storage provided by a network infrastructure device. For example, storage (e.g., SSDs) may be included in a network switch and client applications executing on the network switch may facilitate (e.g., by automatically establishing proper communication paths and protocols) access to the storage based on configuration attributes of the remote device. There may be different underlying formats of data transfer for NVMe with the recognized abbreviation for NVMe over PCIe being “NVMe/PCIe.” NVMe over Fabrics, when used agnostically with respect to the transport, is abbreviated “NVMe-oF.” NVMe over remote direct memory access (RDMA) is abbreviated “NVMe/RDMA.” NVMe over Fibre Channel is abbreviated “NVMe/FC” and NVMe over Transmission Control Protocol (TCP) is abbreviated “NVMe/TCP.” As other protocols are associated with NVMe, it is expected that other abbreviations may be defined. As will be apparent to those of ordinary skill in the art, given the benefits of this disclosure, the techniques of this disclosure are applicable to existing and future implementations of transports that may be used in a like manner to the examples of this disclosure.
As used herein, a “client application” executing on a network infrastructure device, such as a switch that incorporates storage, may be implemented using software, firmware, hardware logic (e.g., silicon-based logic), or a combination of these techniques. In general, the combination of functional modules may perform collectively to support a plug-n-play type allocation of resources such that resources (e.g., storage) are automatically allocated or de-allocated for remote devices based on discovery or on-demand requests. Discovery may allow automatic detection of a remote device associating itself to a network (e.g., boot-up, restart, fail-over, etc.). Upon discovery, configuration information, that may have been stored previously, may be exchanged between the remote device and the network infrastructure device to establish allocation of resources. Discovery may also determine a proper protocol to use for communications between the remote device and the network infrastructure device. Different remote devices may use different protocols to concurrently communicate with a single network infrastructure device. Discovery may also detect that resources are no longer in-use by a remote device and reclaim previously allocated resources such that those resources may be returned to a resource pool and made available for other allocation requests.
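The plug-n-play allocation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the disclosed implementation: the names `StoragePool` and `on_discovery`, the gigabyte granularity, the configuration keys `protocol` and `storage_gb`, and the default of NVMe/TCP are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StoragePool:
    """Hypothetical pool of storage capacity managed by a switch-resident client application."""
    free_gb: int
    allocations: dict = field(default_factory=dict)  # remote device id -> GB held

    def allocate(self, device_id, requested_gb):
        """Allocate capacity for a remote device; refuse if the pool cannot cover it."""
        if requested_gb <= 0 or requested_gb > self.free_gb:
            return False
        self.free_gb -= requested_gb
        self.allocations[device_id] = self.allocations.get(device_id, 0) + requested_gb
        return True

    def reclaim(self, device_id):
        """Return everything held by a device to the pool; report how much was released."""
        released = self.allocations.pop(device_id, 0)
        self.free_gb += released
        return released


def on_discovery(pool, device_id, stored_config):
    """Apply previously stored configuration when a remote device joins the network."""
    protocol = stored_config.get("protocol", "NVMe/TCP")  # assumed default transport
    granted = pool.allocate(device_id, stored_config.get("storage_gb", 0))
    return {"device": device_id, "protocol": protocol, "granted": granted}
```

Because each call to `on_discovery` carries its own protocol choice, different remote devices can be served concurrently over different transports, and `reclaim` models the return of unused resources to the pool for later requests.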
Referring now to
Control plane 110, for example, in a router may be used to maintain routing tables (or a single comprehensive routing table) that list which route should be used to forward a data packet, and through which physical interface connection (e.g., output ports 160 through 169). Control plane 110 may perform this function by using internal preconfigured directives, called static routes, or by learning routes dynamically using a routing protocol. Static and dynamic routes may be stored in one or more of the routing tables. The control-plane logic may then strip non-essential directives from the table and build a forwarding information base (FIB) to be used by data plane 115.
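One way to picture the reduction from routing tables to a FIB is shown below. It is a sketch only: the `PREFERENCE` values, the route dictionary layout, and the rule that a static route beats a dynamically learned one for the same prefix are assumptions made for illustration and are not taken from this disclosure.

```python
import ipaddress

# Hypothetical preference values: the lower number wins when two sources
# offer a route for the same prefix.
PREFERENCE = {"static": 1, "dynamic": 110}


def build_fib(routes):
    """Reduce a routing table to one (next_hop, port) entry per prefix."""
    best = {}
    for route in routes:
        prefix = ipaddress.ip_network(route["prefix"])
        rank = PREFERENCE[route["source"]]
        if prefix not in best or rank < best[prefix][0]:
            best[prefix] = (rank, route["next_hop"], route["port"])
    # Strip what the data plane does not need (route source and rank).
    return {prefix: (hop, port) for prefix, (_rank, hop, port) in best.items()}
```

The returned dictionary keeps only the fields needed to forward a packet, which mirrors the idea of stripping non-essential directives before handing the table to the data plane.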
A router may also use a forwarding plane (e.g., part of the data plane 115) that contains different forwarding paths for information from different ports or different destination addresses (e.g., forwarding path A 116 or forwarding path Z 117). In general, the router forwards data packets between incoming (e.g., ports 150-159) and outgoing interface connections (e.g., ports 160-169). The router forwards data packets to the correct network type using information that the packet header contains matched to entries in the FIB supplied by control plane 110. Ports are typically bidirectional and are shown in this example as either “input” or “output” to illustrate flow of a message through a routing path. In some network implementations, a router (e.g., switch/router 100) may have interfaces for different types of physical layer connections, such as copper cables, fiber optic, or wireless transmission. A single router may also support different network layer transmission standards. Each network interface may be used to enable data packets to be forwarded from one transmission system to another. Routers may also be used to connect two or more logical groups of computer devices known as subnets, each with a different network prefix.
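Matching a packet header against FIB entries is commonly done by longest-prefix match, which the following sketch illustrates. The function name `lookup`, the sample prefixes, and the mapping of prefixes directly to output port numbers are hypothetical; a hardware data plane would use specialized lookup structures rather than a linear scan.

```python
import ipaddress


def lookup(fib, destination):
    """Return the output port of the most specific FIB entry covering destination."""
    address = ipaddress.ip_address(destination)
    matches = [network for network in fib if address in network]
    if not matches:
        return None  # no matching entry; a real device might drop or use a default route
    return fib[max(matches, key=lambda network: network.prefixlen)]


# Hypothetical FIB: destination prefix -> output port number.
FIB = {
    ipaddress.ip_network("10.0.0.0/8"): 160,
    ipaddress.ip_network("10.1.0.0/16"): 161,
    ipaddress.ip_network("10.1.2.0/24"): 162,
}
```

With this table, a destination inside `10.1.2.0/24` resolves to port 162 even though it is also covered by the two shorter prefixes, because the most specific subnet wins.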
Also illustrated in
Control plane 110, as illustrated in
Many different configuration settings for both the software and the device itself are possible and describing each is beyond the scope of this disclosure. However, the disclosed automatic detection and allocation of storage on behalf of remote devices (e.g., automatic system provisioning) may be implemented in one or more functional components of a network infrastructure device such as switch/router 100. Configuration settings may be stored or provided just-in-time to be used to allocate resources for remote devices, establish communication tunnels, provide security for data exchange, and provide other automatic provisioning functions in support of remote devices. Each of these automatic provisioning functions may be incorporated into the one or more functional components illustrated for network infrastructure device (e.g., switch/router 100). Further, in some implementations such as shown in
Continuing with
Referring now to
High-availability switch 200A also includes a plurality of communication cards (e.g., Card Slot 1 (221), Card Slot 2 (222), Card Slot 3 (223), and Card Slot N (225)) that may each have a plurality of communication ports configured to support network communication. A card slot, such as Card Slot 1 (221) may also be referred to as a “line card” and have a plurality of bi-directional communication ports (as well as a management port (not shown)). Card Slot 1 (221) is illustrated with port 1-1 (241) and port 1-2 (242) and may represent a “card” that is plugged into a slot (e.g., communication bus connection) of a backplane (e.g., communication bus) of high-availability switch 200A. Other connections and connection types are also possible (e.g., cable connection, NVMe device). Also, in
To support communications between a controller (e.g., an active and/or a standby controller) in a switch and client devices (e.g., remote devices) connected to that switch, a number of communication client applications may be executing on a given switch. Client applications executing on a switch may assist in both communications to connected clients and configuration of hardware on the switch (e.g., ports of a line card, storage devices integrated within the switch). In some cases, client applications are referred to as “listeners,” in part, because they “listen” for a communication or command and then process what they receive. For high-availability switch 200A, an example client application is client 1 (230-1) which is illustrated to support communication from either the active or the standby controller to devices connected through Card Slot 1 (221). In some example implementations, a listener may be configured to automatically identify and route NVMe-oF network packets to support storage for remote devices (and applications executing on those remote devices). Other implementations, where the automatic identification is performed by hardware components or other software components, are also possible. Client applications executing on a switch may be implemented using software, firmware, hardware logic, or a combination thereof.
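A listener that identifies NVMe-oF traffic could be sketched as a classifier plus a dispatch table. The sketch assumes packets arrive as dictionaries with hypothetical `l4_proto` and `dst_port` fields, and it identifies traffic only by well-known default ports (TCP 4420 for NVMe/TCP, TCP 8009 for NVMe/TCP discovery, UDP 4791 for RoCEv2, a common NVMe/RDMA transport). A port match is a heuristic and does not prove the payload is NVMe, and the disclosure does not state that identification works this way.

```python
# Default port assignments used here as a heuristic only.
NVME_OF_PORTS = {
    ("tcp", 4420): "NVMe/TCP",
    ("tcp", 8009): "NVMe/TCP discovery",
    ("udp", 4791): "NVMe/RDMA",
}


def classify(packet):
    """Return a transport label for likely NVMe-oF traffic, or None otherwise."""
    return NVME_OF_PORTS.get((packet["l4_proto"], packet["dst_port"]))


class Listener:
    """Hypothetical client application that waits for packets and dispatches them."""

    def __init__(self):
        self.handlers = {}

    def register(self, label, handler):
        """Associate a handler callable with a transport label."""
        self.handlers[label] = handler

    def on_packet(self, packet):
        """Run the handler registered for this packet's label, if there is one."""
        handler = self.handlers.get(classify(packet))
        return handler(packet) if handler is not None else None
```

Keeping classification separate from dispatch leaves room for the identification step to move into hardware logic while the handlers stay in software, which is one of the splits the text above contemplates.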
A second example client application in
Referring to
As also illustrated in example HA switch 200B, a line card may communicate with any number of integrated SSD components. Specifically, area 255 illustrates that SSD 3-1, SSD 3-2, and SSD 3-N (all referenced with element reference number 251) may be integrated with (or connected to) Card Slot 3 (253). In this example, client 2 (230-2) may adapt to communicate with line cards having integrated SSD components and other computing devices (e.g., outside of area 255) may not be aware of detailed implementations within area 255. That is, the disclosed implementation of SSD components integrated within HA switch 200B may be transparent to external devices and other components of HA switch 200B. Although client 2 (230-2) is illustrated in block diagram 200B as a potential software (or firmware) module, it is possible to implement functionality of client 2 (230-2) completely (or at least partially) within hardware logic (i.e., silicon based logic) of HA switch 200B. One of ordinary skill in the art, given the benefit of this disclosure, will recognize that many different implementations of software, firmware, and hardware logic may be used to achieve disclosed techniques of automatically provisioning communication flows for network attached storage devices (NVMe-oF devices in particular).
Referring now to
To facilitate lossless communication between each remote device (305-320) and an associated NVMe storage device (e.g., NVMe storage device 1 (335) and/or NVMe storage device 2 (340)), functional block 330 illustrates that network infrastructure device (switch/router) 301 may include functional modules (e.g., client applications as discussed above) to configure a virtual local area network (VLAN) or tunnel that allows each remote device to have access (perhaps dedicated access) to storage. Thus, upon determination that a remote device desires access to storage, network infrastructure device (switch/router) 301 may automatically provision a portion of available storage in support of the remote device.
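As a rough illustration of giving each remote device its own path to storage, the sketch below hands out one VLAN ID per remote device from a configured range. The class name `VlanProvisioner`, the range 100 to 199, and the device labels are hypothetical, and the code only records bindings; it does not program any switch hardware or build a tunnel.

```python
class VlanProvisioner:
    """Hypothetical allocator binding each remote device to a VLAN and a storage device."""

    def __init__(self, first_id=100, last_id=199):
        self.available = list(range(first_id, last_id + 1))
        self.bindings = {}  # remote device -> (VLAN ID, storage device)

    def provision(self, remote, storage):
        """Return the VLAN ID for this remote device, creating a binding if needed."""
        if remote in self.bindings:
            return self.bindings[remote][0]  # idempotent: reuse the existing path
        if not self.available:
            raise RuntimeError("no VLAN IDs left in the configured range")
        vlan_id = self.available.pop(0)
        self.bindings[remote] = (vlan_id, storage)
        return vlan_id

    def release(self, remote):
        """Tear down the binding and return its VLAN ID to the available list."""
        vlan_id, _storage = self.bindings.pop(remote)
        self.available.append(vlan_id)
        return vlan_id
```

Making `provision` idempotent means a remote device that reconnects after a restart is handed the same path rather than consuming a second VLAN ID.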
Referring to
As illustrated in example view 400, connect/disconnect arrows 405 indicate that remote devices (305-320) may establish a network connection with network infrastructure device 401. Further, after communication is established with a remote device (e.g., one of remote devices 305-320), a discovery process may execute to identify configuration options for a remote device collectively working with other network devices. Portions of a discovery process may be initiated from either the remote device or network infrastructure device and may work with other local functional modules and remote functional modules to determine configuration options. Once determined, a client application executing on network infrastructure device 401 may automatically provision resources for use by the remote device. That is, a collective exchange of information may take place between a newly connected remote device and network infrastructure device 401. This collective exchange of information may be referred to as “discovery” and may include components supplying configuration information from both the remote device and network infrastructure device 401. Once discovery is complete, a client application executing on network infrastructure device 401 may make resources available to satisfy the attributes provided by the configuration information. As illustrated in
Example view 400 also includes integrated components or external devices that may be connected to network infrastructure device 401 in accordance with one or more disclosed implementations. Specifically, one or more NVMe storage devices, as illustrated by NVMe storage device 1 (435) through NVMe storage device N (440) may be added to network infrastructure device 401. The bidirectional arrow between NVMe storage device 1 (435) and network infrastructure device 401 is intended to illustrate that the connection between network infrastructure device 401 and NVMe storage device 1 (435) is both bi-directional and that the connection may be transient in that NVMe storage device 1 (435) may be plugged into (or removed from) network infrastructure device 401 at run-time. Of course, it is more likely that resources will be added to network infrastructure device 401 at run-time and be made available through auto-provisioning rather than removing capabilities. Removal of capabilities may require additional actions to ensure no impact to remote devices that may be using said resources. Specifically, if removal of components is planned ahead of time, reliance on components to be removed may be reduced or eliminated such that their removal does not impact remote device functionality. In any case, as illustrated by configuration modules 415, management of resource pools and allocation to connected devices may be performed as part of the functionality of network infrastructure device 401. In addition to storage devices, example view 400 illustrates that line card 445 may be inserted or removed from network infrastructure device 401 at run-time. Thus, a line card (possibly including SSD capability) may be plugged into a running network infrastructure device 401 to increase its resource pools. This type of run-time augmentation may allow system administrators to add capabilities as part of an overall HCI solution.
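The idea of reducing reliance on a component before a planned removal can be sketched as a placement check. Everything here is an assumption made for illustration: the function name `plan_removal`, the first-fit placement rule, and the data layout. The sketch only decides where allocations would go; moving the stored data itself is not modeled.

```python
def plan_removal(devices, allocations, leaving):
    """Re-place allocations held on `leaving`; return the new placement, or None.

    devices:     storage device name -> total capacity in GB
    allocations: list of (remote device, storage device, GB) tuples
    leaving:     name of the storage device scheduled for removal
    """
    used = {name: 0 for name in devices}
    for _remote, device, gb in allocations:
        used[device] += gb
    free = {name: devices[name] - used[name] for name in devices if name != leaving}

    plan = []
    for remote, device, gb in allocations:
        if device != leaving:
            plan.append((remote, device, gb))
            continue
        # First fit: pick the first remaining device with enough free capacity.
        target = next((name for name, room in free.items() if room >= gb), None)
        if target is None:
            return None  # removal would impact a remote device, so do not proceed
        free[target] -= gb
        plan.append((remote, target, gb))
    return plan
```

A `None` result signals that the component should stay in place until more capacity is added, for example by inserting another line card.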
Referring to
Continuing from block 515 where a remote host connects, block 520 indicates that a discovery function may execute on the network switch (and possibly have components execute on the remote device) to determine allocation information (e.g., auto provisioning of resource information) on behalf of the remote device (e.g., remote device 1 (305) of
Continuing from block 540 of example method 500 a remote host may disconnect from the network switch (or may simply relinquish resources that are no longer in-use). Block 545 indicates that released resources (e.g., storage capacity) may be returned to an allocation pool of resources as managed by functional modules executing on the network switch. Block 550 indicates that relinquished resources may be made available to satisfy additional requests made on behalf of this same remote host (e.g., remote device of
Continuing from block 560 as the third branch of example method 500, new capability may be added to a network switch. As explained above, this new capability may be added while the network switch is operational such that existing resources that are allocated to other remote devices remain available without interruption of services. New capability may be in form of additional capacity (e.g., storage or network ports) made available by inserting a line card into a network switch. Other possibilities exist for augmenting capability of a network switch. In general, block 560 represents an augmentation of resources for the network switch so that resources are increased at run-time. Block 565 indicates that a client application executing on the network switch may recognize the new line card, for example, and any resources available on that line card. Block 570 indicates that new augmented resources may be added to resource pools as managed by the network switch and used to satisfy further requests on behalf of current or additional remote devices (also referred to in this example as remote hosts). Again, flow returns from block 570 to block 510 where the network switch continues in operational mode.
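The three branches of example method 500 can be summarized as an event loop in which each event is handled and control returns to the operational state. This is a simplified sketch: the event dictionaries, the `SwitchState` class, and the gigabyte accounting are hypothetical, and the discovery exchange is collapsed into a single `storage_gb` field on the connect event.

```python
class SwitchState:
    """Hypothetical run-time state of a network switch with a storage resource pool."""

    def __init__(self, pool_gb):
        self.pool_gb = pool_gb
        self.allocated = {}  # remote host -> GB currently allocated


def handle(state, event):
    """Process one event and leave the switch in operational mode."""
    kind = event["type"]
    if kind == "connect":
        # First branch (block 515 onward): a remote host connects and is provisioned.
        wanted = event["storage_gb"]
        if wanted <= state.pool_gb:
            state.pool_gb -= wanted
            host = event["host"]
            state.allocated[host] = state.allocated.get(host, 0) + wanted
    elif kind == "disconnect":
        # Second branch (block 540 onward): released resources return to the pool.
        state.pool_gb += state.allocated.pop(event["host"], 0)
    elif kind == "capability_added":
        # Third branch (block 560 onward): a new line card augments the pool.
        state.pool_gb += event["capacity_gb"]
    return state


def run(state, events):
    """Handle events in order; the loop itself stands in for the return to block 510."""
    for event in events:
        handle(state, event)
    return state
```

A request that exceeds the pool is simply skipped in this sketch; an implementation might instead queue it until a disconnect or a new line card frees enough capacity.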
Referring now to
A machine-readable storage medium, such as 602 of
Each of these networks can contain wired or wireless programmable devices and operate using any number of network protocols (e.g., TCP/IP) and connection technologies (e.g., WiFi® networks or Bluetooth®). In another embodiment, customer network 702 represents an enterprise network that could include or be communicatively coupled to one or more local area networks (LANs), virtual networks, data centers and/or other remote networks (e.g., 708, 710). In the context of the present disclosure, customer network 702 may include one or more high-availability switches or network devices using methods and techniques such as those described above to automatically provision storage resources based on the NVMe-oF protocols.
As shown in
Network infrastructure 700 may also include other types of devices generally referred to as Internet of Things (IoT) (e.g., edge IOT device 705) that may be configured to send and receive information via a network to access cloud computing services or interact with a remote web browser application (e.g., to receive configuration information).
Network infrastructure 700 also includes cellular network 703 for use with mobile communication devices. Mobile cellular networks support mobile phones and many other types of mobile devices such as laptops etc. Mobile devices in network infrastructure 700 are illustrated as mobile phone 704D, laptop computer 704E, and tablet computer 704C. A mobile device such as mobile phone 704D may interact with one or more mobile provider networks as the mobile device moves, typically interacting with a plurality of mobile network towers 720, 730, and 740 for connecting to the cellular network 703.
In
As also shown in
Computing device 800 may also include communications interfaces 825, such as a network communication unit that could include a wired communication component and/or a wireless communications component, which may be communicatively coupled to processor 805. The network communication unit may utilize any of a variety of proprietary or standardized network protocols, such as Ethernet, TCP/IP, to name a few of many protocols, to effect communications between devices. Network communication units may also comprise one or more transceiver(s) that utilize the Ethernet, power line communication (PLC), WiFi, cellular, and/or other communication methods.
As illustrated in
Persons of ordinary skill in the art are aware that software programs may be developed, encoded, and compiled in a variety of computing languages for a variety of software platforms and/or operating systems and subsequently loaded and executed by processor 805. In one embodiment, the compiling process of the software program may transform program code written in a programming language to another computer language such that the processor 805 is able to execute the programming code. For example, the compiling process of the software program may generate an executable program that provides encoded instructions (e.g., machine code instructions) for processor 805 to accomplish specific, non-generic, particular computing functions.
After the compiling process, the encoded instructions may then be loaded as computer executable instructions or process steps to processor 805 from storage device 820, from memory 810, and/or embedded within processor 805 (e.g., via a cache or on-board ROM). Processor 805 may be configured to execute the stored instructions or process steps in order to perform instructions or process steps to transform the computing device into a non-generic, particular, specially programmed machine or apparatus. Stored data, e.g., data stored by a storage device 820, may be accessed by processor 805 during the execution of computer executable instructions or process steps to instruct one or more components within the computing device 800.
A user interface (e.g., output devices 815 and input devices 830) can include a display, positional input device (such as a mouse, touchpad, touchscreen, or the like), keyboard, or other forms of user input and output devices. The user interface components may be communicatively coupled to processor 805. When the output device is or includes a display, the display can be implemented in various ways, including by a liquid crystal display (LCD) or a cathode-ray tube (CRT) or light emitting diode (LED) display, such as an organic light emitting diode (OLED) display. Persons of ordinary skill in the art are aware that the computing device 800 may comprise other components well known in the art, such as sensors, power sources, and/or analog-to-digital converters, not explicitly shown in
Certain terms have been used throughout this description and claims to refer to particular system components. As one skilled in the art will appreciate, different parties may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not function. In this disclosure and claims, the terms “including” and “comprising” are used in an open-ended fashion, and thus should be interpreted to mean “including, but not limited to . . . . ” Also, the term “couple” or “couples” is intended to mean either an indirect or direct wired or wireless connection. Thus, if a first device couples to a second device, that connection may be through a direct connection or through an indirect connection via other devices and connections. The recitation “based on” is intended to mean “based at least in part on.” Therefore, if X is based on Y, X may be a function of Y and any number of other factors.
The above discussion is meant to be illustrative of the principles and various implementations of the present disclosure. Numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.
Number | Date | Country | |
---|---|---|---|
20200396126 A1 | Dec 2020 | US |