Various embodiments of the present disclosure are generally directed to the efficient and secure processing of data in a distributed network environment using a selected protocol, such as but not limited to NVMe (Non-Volatile Memory Express).
In some embodiments, a secure connection is established between a client device and a bridge device across an interface. A controller of the bridge device presents a unitary namespace as an available memory space to the client device. The controller further communicates with a plurality of downstream target devices to allocate individual namespaces within main memory stores of each of the target devices to form a consolidated namespace to support the unitary namespace presented to the controller. In this way, the bridge device can operate as an NVMe controller with respect to the client device for the unitary namespace, and as a virtual client device to each of the target devices which operate as embedded NVMe controllers for the individual namespaces.
These and other features which may characterize various embodiments can be understood in view of the following detailed discussion and the accompanying drawings.
The present disclosure is generally directed to systems and methods for performing data transfers in a secure and efficient manner.
Data storage devices store and retrieve computerized data in a fast and efficient manner. A data storage device usually includes a top level controller and a main memory store, such as a non-volatile memory (NVM), to store data associated with a client device. The NVM can take any number of forms, including but not limited to rotatable media and solid-state semiconductor memory.
Computer networks are arranged to interconnect various devices to enable data exchange operations. It is common to describe such exchange operations as being carried out between a client device and a data storage device. Examples of computer networks of interest with regard to the present disclosure include public and private cloud storage systems, local area networks, wide area networks, object storage systems, the Internet, cellular networks, satellite constellations, storage clusters, etc. While not required, these and other types of networks can be arranged in accordance with various industry specifications in order to specify the interface and operation of the interconnected devices.
One commonly utilized industry specification is referred to as Non-Volatile Memory Express (NVMe), which generally establishes NVMe domains (namespaces) to expedite parallel processing and enhance the throughput of I/O accesses to the NVM in the network. NVMe provides enhanced command processing, enabling up to 64K command queues, each capable of accommodating up to 64K pending commands.
Another specification is referred to as Compute Express Link (CXL) which enhances high speed central processing unit (CPU) to device and CPU to memory data transfers. CXL enables efficiencies in I/O data transfers, caching and memory through the sharing of resources between the source and the target devices. Both NVMe and CXL are particularly suited to the use of Peripheral Component Interconnect Express (PCIe) interfaces, although other types of interfaces can be used.
While operable, these and other techniques can present challenges when operating in various environments, including high volume and low trust environments. To this end, various embodiments of the present disclosure are generally directed to the use of one or more bridge devices to interface and manage the storage and retrieval of client data.
As explained below, some embodiments include operational steps such as coupling and establishing a secure connection between a client device and a data processing device across an interface. The data processing device is sometimes referred to as a bridge device (also a transport bridge device, or a transport and protocol bridge device). The bridge device may take a variety of forms, including a data storage device. A controller of the bridge device establishes a transport bridge that presents the bridge device to the client device as a unitary target using a selected transport protocol. One particularly suitable protocol is the NVMe specification, although other protocols can be used.
Thereafter, the controller of the bridge device communicates with a plurality of downstream data storage devices to emulate the selected protocol and service access commands from the client device. In some embodiments, an emulated namespace, such as an NVMe namespace, is managed at the bridge device level, and the actual storage is carried out using individual target namespaces at the storage device level.
In this way, the system can be configured such that the bridge device presents the external network with an essentially conventional interface that behaves in accordance with the selected protocol. From this point downstream, the bridge device emulates a target which is made up of a number of downstream target devices (such as separate SSDs, etc.). While not necessarily required, RAID techniques may be used to generate parity and perform other operations, so that the data and parity values are distributed across the various storage devices as desired in accordance with the associated RAID level. Each of these separate downstream storage devices may be treated as a separate namespace. Further embodiments can use CMB (controller memory buffer) and VM (virtual machine) techniques on the downstream side.
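By way of illustration only, the following simplified Python sketch models one way the dual-role behavior described above could be arranged, with the bridge presenting a single upstream namespace while distributing data to several downstream target namespaces. The class names, the round-robin placement, and the 4 KiB chunk size are assumptions chosen for the example and are not requirements of the present disclosure.

```python
# Illustrative sketch only: BridgeDevice, TargetNamespace, the round-robin
# placement and the 4 KiB chunk size are assumptions, not part of the
# disclosure.

CHUNK = 4096  # bytes handled per downstream transfer (assumed)


class TargetNamespace:
    """Stand-in for an individual namespace on a downstream storage device."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.blocks = {}  # local LBA -> data

    def write(self, lba, data):
        self.blocks[lba] = data

    def read(self, lba):
        return self.blocks.get(lba, bytes(CHUNK))


class BridgeDevice:
    """Presents one unitary namespace upstream; distributes data downstream."""

    def __init__(self, targets):
        self.targets = targets  # list of TargetNamespace objects

    def client_write(self, lba, data):
        # The client only ever sees the unitary LBA; placement is hidden.
        target = self.targets[lba % len(self.targets)]
        target.write(lba // len(self.targets), data)

    def client_read(self, lba):
        target = self.targets[lba % len(self.targets)]
        return target.read(lba // len(self.targets))


targets = [TargetNamespace(f"NS-{c}", 512 * 2**30) for c in "ABCD"]
bridge = BridgeDevice(targets)
bridge.client_write(10, b"\x5a" * CHUNK)
assert bridge.client_read(10) == b"\x5a" * CHUNK
```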
These and other features and advantages of various embodiments can be understood beginning with a review of
The client device 101 can take any number of desired forms including but not limited to a host device, a server, a RAID controller, a router, a network accessible device such as a tablet, smart phone, laptop, desktop, workstation, gaming system, other forms of user devices, etc. While not limiting, the client device 101 is contemplated as having at least one controller, which may include one or more hardware or programmable processors, as well as memory, interface electronics, software, firmware, etc. As described herein, programmable processors operate responsive to program instructions that are stored in memory and provide input instructions in a selected sequence to carry out various intended operations. Hardware processors utilize hardwired gate logic to perform the required logic operations.
The data storage device 102 can take any number of desired forms including a hard disc drive (HDD), a solid-state drive (SSD), a hybrid drive, an optical drive, a thumb drive, a network appliance, a mass storage device (including a storage enclosure having an array of data storage devices), etc. Regardless of form, the data storage device 102 is configured to store user data provided by the client device 101 and retrieve such data as required to authorized devices across the network, including but not limited to the initiating client device 101 that supplied the stored data.
The interface 103 provides wired or wireless communication between the respective client and storage devices 101, 102, and may involve local or remote interconnection between such devices in substantially any desired computational environment including local interconnection, a local area network, a wide area network, a private or public cloud computing environment, a server interconnection, the Internet, a satellite constellation, a data cluster, a data center, etc. While PCIe is contemplated as a suitable interface protocol for some or all of the interconnections between the respective devices 101/102, such is not necessarily required.
The data storage device 102 includes a main device controller 104 and a memory 106. The main device controller 104 can be configured as one or more hardware based controllers and/or one or more programmable processors that execute program instructions stored in an associated memory. The memory 106 can include volatile or non-volatile memory storage including flash, RAM, other forms of semiconductor memory, rotatable storage discs, etc. The memory can be arranged as a main store for user data from the client device, as well as various buffers, caches and other memory locations used to store user data and other types of information in support of data transfer and processing operations.
The namespace may constitute all of the available capacity of the NVM memory of the device (see e.g., 106 in
At this point it will be noted that the respective elements in
Each namespace is in turn managed by the bridge circuit 126 via an array of data storage devices 128 formed from individual storage devices 130 (denoted S1 through S4). Any number and types of storage devices can be used.
The bridge device 126 manages both transport bridge and transport protocol functions for the storage of data among the respective storage devices 130. As used herein, the term transport bridge will be understood to generally describe the processing capabilities of the bridge device to manage input access commands from the external network. The term transport protocol will be understood to generally describe the mechanisms employed by the bridge device 126 to translate between the conventions utilized by the external devices (e.g., clients 122) as compared to the mechanisms utilized by the internal devices downstream of the transport bridge (e.g., storage devices 130).
NVMe is a particularly suitable protocol that can be used in at least some embodiments of the present disclosure. In this case, NVMe can be run at multiple levels, including at the network level (e.g., as presented and interfaced upstream from the bridge device) and at the storage level (e.g., as presented and interfaced downstream from the bridge device). However, any number of different industry standard and proprietary protocols can be utilized to manage namespaces (e.g., allocated units of storage accessible by an authorized owner/client).
In the embodiment of
The memory 202 may incorporate several different types and configurations of semiconductor and/or disc-based memory. Aspects of the memory may be volatile or non-volatile. In the embodiment of
The FW 206 can represent program instructions executed by the controller 202 during operation. The cache 208 can include read data buffers to temporarily cache data being transferred to the requesting client from the consolidated namespace, and write caches used to temporarily cache data being transferred from the requesting client to the consolidated namespace.
The table 210 provides translation data to enable the bridge device 200 to coordinate the input commands and direct the requisite data transfers among the downstream data storage devices. The consolidated namespace presented to the upstream device(s) is managed internally at this level. One or more translation layers may be provided to coordinate and direct addresses to the respective downstream devices. In the environment of NVMe, a consolidated namespace at the client level may be broken up into multiple target NVMe namespaces that are in turn managed by the bridge device 200. In this way, each device 130 (
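A minimal Python sketch of the kind of translation data that the table 210 might hold is provided below. The extent boundaries, device names and local namespace identifiers are hypothetical values used only to show how a client-visible LBA could be resolved to a downstream device and local address.

```python
# Hypothetical translation layer: extent boundaries, device names and the
# record layout are assumptions made for illustration.

from bisect import bisect_right
from collections import namedtuple

Extent = namedtuple("Extent", "first_lba last_lba device local_ns local_base")

# Consolidated namespace LBAs 0..3999 spread across two target namespaces.
TABLE = [
    Extent(0,    1999, "Drive A", "NS-A", 0),
    Extent(2000, 3999, "Drive B", "NS-B", 0),
]
_STARTS = [e.first_lba for e in TABLE]


def translate(lba):
    """Map a client-visible LBA to (device, local namespace, local LBA)."""
    extent = TABLE[bisect_right(_STARTS, lba) - 1]
    if not (extent.first_lba <= lba <= extent.last_lba):
        raise ValueError("LBA outside the consolidated namespace")
    return extent.device, extent.local_ns, extent.local_base + (lba - extent.first_lba)


assert translate(2500) == ("Drive B", "NS-B", 500)
```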
While not limiting, it is contemplated that RAID (redundant array of independent/inexpensive discs) techniques can be used to arrange and distribute the client data among the respective devices (
Accordingly, the RAID control block 212 in
For example, if a RAID-5 arrangement is used, then the input data from the client device can be broken into N total blocks of data made up of N-1 user data stripes plus 1 stripe of parity data. Each of the separate blocks can thereafter be written to the downstream storage devices (denoted at 214 in
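The following Python sketch illustrates the RAID-5 style stripe construction described above, using a deliberately small strip size and a four-drive layout. The strip size and parity rotation scheme are assumptions for demonstration; the essential point is that any single lost strip can be rebuilt by XOR-ing the survivors.

```python
# Illustrative RAID-5 style striping: N-1 data strips plus one XOR parity
# strip per stripe, with the parity position rotated from stripe to stripe.

STRIP = 8  # bytes per strip (tiny, for illustration only)


def xor_parity(strips):
    parity = bytearray(len(strips[0]))
    for strip in strips:
        for i, b in enumerate(strip):
            parity[i] ^= b
    return bytes(parity)


def make_stripe(data_strips, stripe_no, n_drives):
    """Return one strip per drive, rotating the parity position."""
    strips = list(data_strips)
    strips.insert(stripe_no % n_drives, xor_parity(data_strips))
    return strips


data = [bytes([i]) * STRIP for i in range(3)]        # 3 data strips
stripe = make_stripe(data, stripe_no=0, n_drives=4)  # 4 drives total

# Any single lost strip is recoverable from the remaining strips.
lost = stripe[2]
rebuilt = xor_parity([s for j, s in enumerate(stripe) if j != 2])
assert rebuilt == lost
```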
Block 222 represents the corresponding allocations that may be made by the bridge device 200 to accommodate the client level namespace NS-1 in a first embodiment. In this example, an equal, corresponding portion 224 of the storage capacity of each of four drives (denoted as Drives A-D) is selected to provide a hidden, consolidated namespace. For example, if the NS-1 namespace covers 2 TB (2×10^12 bytes) of data, then each of the subblocks 224 in block 222 will provide nominally 500 GB (5×10^11 bytes) of storage. To the extent that the client device forwards data for storage to the namespace NS-1, the bridge device 200 will divide the data so as to distribute it nominally equally among the drives A-D. Each drive will accordingly have and process a local namespace (e.g., Drive A will have a local namespace NS-A; Drive B will have a local namespace NS-B, and so on).
Adjustments may be made such that one device stores more data than another device at any given time, but overall workload will be distributed and level loaded by the bridge device 200 such that each of the drives will have nominally the same average workload.
The bridge device 200 may enact different protocols and security operations depending on the workload history of the client. Larger data transfers may be subjected to a first distribution profile, such as RAID-5, so that the input data are divided equally (along with parity) among the various drives. Smaller and/or more frequently updated data sets may simply be stored directly to one of the devices (or a lower level of RAID processing may be supplied, such as data mirroring via RAID-1, etc.). Accordingly, while the consolidated namespace 222 is shown to be nominally the same size as the allocated client-side namespace NS-1, it will be appreciated that the consolidated namespace may be some percentage larger to accommodate worst-case parity storage requirements. For example, if five (5) drives are employed with RAID-5 capabilities, the overall allocated space may be upwards of 20% larger than the client level namespace NS-1, etc.
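A simplified Python sketch of such a profile selection, together with the worst-case parity overhead estimate, is set forth below. The 1 MiB threshold and the simple capacity formula are assumptions made for illustration; the actual policy is left to the bridge device.

```python
# Assumed threshold and overhead formula, for illustration only.

LARGE_XFER = 1 << 20  # 1 MiB (assumed)


def choose_profile(transfer_bytes):
    """Pick a distribution profile based on transfer size."""
    if transfer_bytes >= LARGE_XFER:
        return "RAID-5"   # divide data plus parity among the drives
    return "RAID-1"       # mirror (or store directly) smaller updates


def raid5_allocation(client_capacity, n_drives):
    """Worst-case consolidated capacity including parity for RAID-5."""
    return client_capacity * n_drives / (n_drives - 1)


print(choose_profile(4 << 20))                   # RAID-5
print(raid5_allocation(2e12, 5) / 2e12 - 1.0)    # 0.25 -> about 25% larger
```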
Block 226 shows another, alternative arrangement of a consolidated namespace that can be provisioned to accommodate the client level namespace NS-1. In block 226, each of the four drives (A-D) is allocated a different respective amount of storage capacity 228 to make up the necessary storage space. This can be carried out for a number of reasons, including other workloads being managed by the bridge device 200. In this example, Drive A is provided with a first namespace NS-A that is significantly larger than a namespace NS-B in Drive B, etc. As before, the bridge device 200 operates to distribute the data from the client among these respective namespaces as appropriate.
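The equal allocation of block 222 and the weighted allocation of block 226 can both be expressed as a single provisioning routine, as in the following Python sketch; the specific weight values are hypothetical.

```python
# Sketch of provisioning the hidden consolidated namespace across drives.
# The weights are hypothetical; None yields the equal split of block 222.

def provision(client_capacity, drives, weights=None):
    """Return the per-drive allocation making up the consolidated namespace."""
    if weights is None:
        weights = [1] * len(drives)
    total = sum(weights)
    return {d: client_capacity * w // total for d, w in zip(drives, weights)}


drives = ["Drive A", "Drive B", "Drive C", "Drive D"]

equal = provision(2 * 10**12, drives)                  # 500 GB apiece (block 222)
skewed = provision(2 * 10**12, drives, [4, 1, 2, 1])   # NS-A much larger (block 226)
print(equal["Drive A"], skewed["Drive A"], skewed["Drive B"])
```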
Device A 252 includes a device controller (processor) 256, a device cache 258, a device NVM (non-volatile memory) 260 and a CMB (controller memory buffer) controller 262. Device B 254 similarly includes a device controller (processor) 266, cache 268, NVM 270 and CMB memory space 272. It is contemplated, although not necessarily required, that the respective devices 252, 254 are otherwise nominally identical data storage devices, such as solid-state drives (SSDs), albeit with different levels of functionality such as provisioned via different FW and command structures.
The processors 256, 266 are contemplated as programmable processors to provide various command and control functions. The caches 258, 268 provide temporary processing and storage of data during transfers. The NVMs 260, 270 are main memory storage locations, such as flash memory.
The CMB controller 262 operates in some embodiments in accordance with the CXL specification to implement direct access and control of the CMB 272. As noted above, this allows the bridge storage device 252 to access and control the CMB 272 directly, as if the CMB were a physical part of the bridge storage device. This can be particularly efficient if the CMB controller 262 consolidates the local CMB memory from each of the target devices to provide a larger combined CMB memory that can be directly accessed by the CMB controller and that spans each of the target devices.
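One way to model such a consolidated CMB is sketched below in Python. The window sizes and byte-level access helpers are illustrative stand-ins for what would, in practice, be memory-mapped accesses carried out over PCIe/CXL.

```python
# Illustrative consolidation of per-target CMB windows into one flat
# bridge-level address space; sizes and methods are assumptions.

class TargetCMB:
    def __init__(self, name, size):
        self.name = name
        self.mem = bytearray(size)


class ConsolidatedCMB:
    """Single bridge-level address space spanning every target CMB."""

    def __init__(self, cmbs):
        self.windows = []                 # (base address, target CMB)
        base = 0
        for cmb in cmbs:
            self.windows.append((base, cmb))
            base += len(cmb.mem)
        self.size = base

    def _locate(self, addr):
        for base, cmb in reversed(self.windows):
            if addr >= base:
                return cmb, addr - base
        raise ValueError("address out of range")

    def write(self, addr, data):
        cmb, off = self._locate(addr)
        cmb.mem[off:off + len(data)] = data

    def read(self, addr, length):
        cmb, off = self._locate(addr)
        return bytes(cmb.mem[off:off + length])


combined = ConsolidatedCMB([TargetCMB("Target 1", 4096), TargetCMB("Target 2", 4096)])
combined.write(5000, b"parity")                 # lands in the second target's CMB
assert combined.read(5000, 6) == b"parity"
```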
To explain this operation more fully,
The front end and back end controllers 282, 290 may include hardware and/or programmable processors, with the front end controller 282 handling commands and other communications with an upstream device (e.g., bridge device) and the back end controller 290 handling transfers to and from the flash memory 292.
The respective write cache 284, CMB 286 and read buffer 288 can be volatile or non-volatile memory including RAM, flash, FeRAM, STRAM, RRAM, phase change RAM, disc media cache, etc. An SOC (system on chip integrated circuit) approach can be used so that the respective caches are internal memory within a larger integrated circuit package that also incorporates the associated controllers. Alternatively, the caches may be separate memory devices accessible by the respective controllers.
The CMB 286 may be available memory that is specifically allocated as needed, and is otherwise used for another purpose (e.g., storage of map metadata, readback data, etc.). In one non-limiting example, the write cache 284 is non-volatile flash memory to provide non-volatile storage of pending write data, and the CMB 286 and read buffer 288 are formed from available capacity in one or more DRAM devices.
While not limiting, it is contemplated that the flash is NAND flash and stores user data from the client device in the form of pages. A total integer number N of data blocks may make up each page, with each data block storing some amount of user data (e.g., 4096 bytes, etc.) plus some number of additional bytes of error correction codes (ECC), such as LDPC (low density parity check). Additional processing supplied to the data stored to the flash may include the generation of parity sets, outer codes, run-length limited encoding, encryption, and RAID processing for data sets distributed by the bridge circuit across multiple namespaces. As such, it may be desirable in some embodiments to perform some of these functions at the bridge device level using the CMB controller 262 and CMB 272 in
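For context, the following back-of-the-envelope Python computation shows the sort of page-level bookkeeping involved; the ECC byte count and page size are assumed values rather than parameters of the present disclosure.

```python
# Assumed geometry, for illustration only.
USER_BYTES_PER_BLOCK = 4096
ECC_BYTES_PER_BLOCK = 208            # assumed LDPC overhead per block
PAGE_BYTES = 16 * 1024 + 2048        # assumed 16 KiB page plus spare area

block_bytes = USER_BYTES_PER_BLOCK + ECC_BYTES_PER_BLOCK
blocks_per_page = PAGE_BYTES // block_bytes
print(blocks_per_page, blocks_per_page * USER_BYTES_PER_BLOCK)   # 4 blocks, 16384 user bytes
```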
It is contemplated that the system will operate to provide secure transfer operations. As part of this, the bridge device will authenticate each of the downstream target devices to provide a trust boundary in which these devices operate. In addition, authentication steps may be carried out to authenticate each client device that establishes a namespace and accesses the bridge device. To this end,
In
A bridge device 304 may initiate the authentication process such as by requesting an encrypted challenge string from a selected target device 306. This may include an initial value which is then encrypted by the drive, or some other sequence may be employed. The challenge value may be forwarded to the TSI 302, which processes the challenge value in some way to provide an encrypted response, which may be processed by the bridge and the target. In this way, the bridge and the target are authenticated to each other as well as to the TSI authority (thereby establishing a trust boundary as in
Similar steps can be carried out for each of the other target devices in the array, as well as each client that is granted access to the system. It will be noted that other authentication schemes can be carried out, including schemes that rely on local information among the bridge and targets to provide local authentication.
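A heavily simplified Python sketch of a challenge/response exchange of this general type is shown below. The disclosure does not specify the cryptographic primitives, so an HMAC over a pre-shared key is used here purely as a stand-in, and the role of the TSI is reduced to out-of-band key provisioning.

```python
# Stand-in cryptography (HMAC-SHA256 over a pre-shared key); the actual
# scheme, and the full involvement of the TSI, are left open by the text.

import hashlib
import hmac
import os


def respond(shared_key, challenge):
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()


# Key assumed to be provisioned out-of-band by the trusted authority.
bridge_target_key = os.urandom(32)

# Bridge issues an initial value; the target returns its keyed response.
challenge = os.urandom(16)
target_response = respond(bridge_target_key, challenge)

# Bridge verifies the response; a match authenticates the target device.
assert hmac.compare_digest(target_response, respond(bridge_target_key, challenge))
```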
Block 322 shows an initial processing operation in which various elements of the system are coupled and authenticated. As described above, this can include physical and electrical interconnection of the various devices, either locally or over a network. The bridge and target devices may first be authenticated together as a unit, followed by an authentication of a client device to interface and perform data transfers with the unit.
Block 324 shows the selection of a suitable protocol for the operation of the system. In some embodiments, this can include designation of the NVMe specification or some other suitable protocol. Block 326 shows a resulting configuration of the bridge device and block 328 represents the formulation and presentation of a single namespace to the client. It will be appreciated that these respective steps may be initiated by a request from the client for a selected capacity of memory, which in turn is configured by the bridge device through the allocation of associated memory in the applicable target devices. Some exchange of parameters and data may take place at this point. It will be noted that, in at least some embodiments, from the perspective of the client device the processing appears normal, so as to be indistinguishable from a conventional allocation of a namespace to carry out the desired processing. Similarly, at the target device level, similar processing may take place to select and configure individual namespaces based on inputs supplied by the bridge device. At the bridge device level, however, the requested memory capacity and performance requirements from the client are translated and emulated so as to break the emulated namespace into individual namespaces that are consolidated into the overall space.
As a result, the bridge device can make intelligent decisions to select the number of devices, the amount of space to be allocated from each device, additional layers of processing (including RAID techniques, encryption, etc.) to be applied, and so on. The operation of the bridge device is thus hidden and not apparent to either the client or the target devices (e.g., the client is unaware that the namespace is emulated, and the target devices are unaware that the individual namespaces being generated and managed at the device level form part of a larger consolidated namespace). In some cases, multiple namespaces that make up the consolidated namespace may be partially or fully on the same target device.
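The following Python sketch illustrates the kind of hidden provisioning decision contemplated above. The thresholds, device counts and returned fields are assumptions chosen for the example; neither the client nor the targets would see this structure.

```python
# Hypothetical provisioning planner; thresholds and fields are assumed.

def plan_namespace(requested_bytes, available_devices):
    """Decide device count, RAID level and per-device allocation."""
    if requested_bytes >= 10**12 and len(available_devices) >= 4:
        raid, count = "RAID-5", min(5, len(available_devices))
        per_device = requested_bytes // (count - 1)   # includes parity overhead
    else:
        raid, count = "RAID-1", 2
        per_device = requested_bytes                  # full mirror on each device
    return {
        "client_namespace": "NS-1",                   # what the client sees
        "raid_level": raid,                           # hidden from client and targets
        "targets": available_devices[:count],
        "bytes_per_target": per_device,
    }


print(plan_namespace(2 * 10**12, ["A", "B", "C", "D", "E"]))
```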
Continuing with
For example, the bridge device may allocate additional devices/memory space for specific operations to maintain a selected level of performance/service. By atomizing the namespaces at the individual target device level, the bridge device can seamlessly make adjustments including releasing some namespaces and substituting others while presenting the same namespace to the client device. Other operational advantages will readily occur to the skilled artisan in view of the present disclosure.
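A minimal Python sketch of such a substitution follows; the in-memory mapping and copy step stand in for the metadata updates and data migration a real bridge device would carry out, and the namespace names are hypothetical.

```python
# Hypothetical mapping of the client namespace to target namespaces.
mapping = {"NS-1": ["NS-A", "NS-B", "NS-C", "NS-D"]}
stores = {ns: {} for ns in ("NS-A", "NS-B", "NS-C", "NS-D", "NS-E")}
stores["NS-B"][0] = b"client data"


def substitute(client_ns, old_ns, new_ns):
    """Migrate data, then swap target namespaces behind the unchanged client view."""
    stores[new_ns].update(stores[old_ns])    # copy existing data to the new namespace
    stores[old_ns].clear()                   # release the old namespace
    targets = mapping[client_ns]
    targets[targets.index(old_ns)] = new_ns


substitute("NS-1", "NS-B", "NS-E")
assert stores["NS-E"][0] == b"client data"
assert mapping["NS-1"] == ["NS-A", "NS-E", "NS-C", "NS-D"]
```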
The foregoing description of various embodiments can be carried out using any number of different types of processing device and storage device configurations. The bridge devices can take any number of suitable forms, including servers, control boards, storage devices, etc. In some embodiments, data storage devices configured as solid-state drives (SSDs) are particularly suited to carry out the functionality described herein at the respective target and/or bridge levels. To this end,
The SSD 410 includes a controller circuit 412 that generally corresponds to the controller 104 of
Each controller 414, 416, 418 includes a separate programmable processor with associated programming (e.g., firmware, FW) in a suitable memory location, as well as various hardware elements to execute data management and transfer functions. This is merely illustrative of one embodiment; in other embodiments, a single programmable processor (or fewer or more than three programmable processors) can be configured to carry out each of the front end, core and back end processes using associated FW in a suitable memory location. Multiple programmable processors can be used in each of these operative units. A purely hardware based controller configuration, or a hybrid hardware/programmable processor arrangement, can alternatively be used. The various controllers may be integrated into a single system on chip (SOC) integrated circuit device, or may be distributed among various discrete devices as required.
A controller memory 420 represents various forms of volatile and/or non-volatile memory (e.g., SRAM, DDR DRAM, flash, etc.) utilized as local memory by the controller 412. Various data structures and data sets may be stored by the memory including one or more metadata map structures 422, one or more sets of cached data 424, and one or more CMBs 426. Other types of data sets can be stored in the memory 420 as well.
A transport bridge and protocol manager circuit 430 can be provided in some embodiments, particularly for cases where the SSD 410 is configured as a bridge and protocol transport. The circuit 430 can be a standalone circuit or can be incorporated into one or more of the programmable processors of the various controllers 414, 416, 418.
A device management module (DMM) 432 supports back end processing operations. The DMM 432 includes an outer code engine circuit 434 to generate outer code, a device I/F logic circuit 436 to provide data communications, and a low density parity check (LDPC) circuit 438 configured to generate LDPC codes as part of an error detection and correction strategy used to protect the data stored by the SSD 410. One or more XOR buffers 440 are additionally incorporated to temporarily store and accumulate parity data during data transfer operations.
The memory module 114 of
It can be seen that the functionality described herein is particularly suitable for SSDs in an NVMe and/or CXL environment, although other operational applications can be used. In some cases, the diagram of
This arrangement can be understood with a review of
The bridge device 502 includes a controller 508 that operates as described above as a Client I/F NVMe Controller for a Unitary Namespace, namely, the namespace presented to the client device as illustrated including in
The first target device 504 (“Target 1”) has a controller 512 as described above that also operates as a Bridge I/F NVMe Controller for a Target Namespace at the device level. The namespace incorporates some or all of the associated NVM 514. In this way, the bridge device 502 operates, as far as the target device 504 is concerned, as the client. Similar, independent operation is carried out for the second target device 506 (“Target 2”), which has a respective controller 516 (operative as a Bridge I/F NVMe Controller for a Target Namespace) for the associated namespace formed from some or all of the capacity of NVM 518. Hence, the bridge device 502 operates, at least as far as the second target device 506 is concerned, as the client device. Separate communication pathways 520, 522 are supplied to enable parallel operation and data transfers.
Each of the target devices 504, 506 further has an optional CMB 524, 526 as local memory that can be allocated and controlled by the bridge device 502. These respective CMBs form a larger, consolidated memory that spans the various target devices, further enabling data transfers to take place between the bridge device and the target devices.
Finally,
It is contemplated that the unitary namespace 532 will have a first selected capacity, and the consolidated namespace 534 will have a second, larger capacity. The additional capacity enhances processing and transfer efficiencies and provides the bridge device 502 with the flexibility to store and distribute the data as required (e.g., RAID-5 with parity, RAID-1 with mirroring, etc.). The bridge device 502 can further make adjustments on-the-fly to the consolidated namespace by adding or dropping individual namespaces to meet ongoing needs of the system in a manner that is wholly transparent to the client.
In this way, the bridge device can operate as an NVMe controller with respect to the client device for the unitary namespace, and as a virtual client device to each of the target devices which operate as embedded NVMe controllers for the individual namespaces.
It is to be understood that even though numerous characteristics and advantages of various embodiments of the present disclosure have been set forth in the foregoing description, together with details of the structure and function of various embodiments of the disclosure, this detailed description is illustrative only, and changes may be made in detail, especially in matters of structure and arrangements of parts within the principles of the present disclosure to the full extent indicated by the broad general meaning of the terms in which the appended claims are expressed.