A solid state drive (SSD) is a high-performance storage device that contains no moving parts. SSDs are much faster than typical hard disk drives (HDDs) with conventional rotating magnetic media, and typically include a controller to manage data storage. The controller manages operations of the SSD, including data storage and access as well as communication between the SSD and a host device. Since SSDs are significantly faster than their HDD predecessors, computing tasks that were formerly I/O (input/output) bound (limited by the speed with which non-volatile storage could be accessed) may instead be limited by the speed with which a host can queue I/O requests. Accordingly, host protocols such as PCIe® (Peripheral Component Interconnect Express, or PCI Express®) are intended to better accommodate this new generation of non-volatile storage.
The foregoing and other objects, features and advantages of the invention will be apparent from the following description of particular embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.
An SSD controller operates as an interface device conversant in a host protocol and a storage protocol, supporting respective host and storage interfaces for providing a host with a view of a storage device. The host has visibility of the storage protocol, which presents the storage device as a logical device, and accesses the storage device through the host protocol, which is well adapted for accessing high-speed devices such as solid state drives (SSDs). Because the host is presented with a single storage device interface while the storage protocol supports a plurality of devices, the storage interface may encompass multiple devices, ranging up to an entire storage array. The storage protocol supports a variety of possibly dissimilar devices, allowing the host effective access to a combination of SSD and traditional storage as defined by the storage device. The individual storage devices are connected directly to the storage system, which is exposed as a single NVMe device to the host (current NVMe specifications are available at nvmexpress.org). In this manner, a host protocol such as NVMe (Non-Volatile Memory Express), well suited to SSDs, permits efficient access to a storage device such as a storage array or other arrangement of similar or dissimilar storage entities; the entire storage system (storage array, network, or other suitable configuration) is thus presented to an upstream host as an NVMe storage device.
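As a concrete illustration of presenting a plurality of backing devices as one logical device, the following C sketch concatenates the capacities of several storage elements into a single logical address space reported to the host. The element descriptor, field names, and the notion of a simple concatenation are hypothetical assumptions for illustration and are not prescribed by the description above.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical descriptor for one backing storage element. */
struct storage_element {
    const char *name;         /* e.g., "sata-hdd0", "nvme-ssd1" */
    uint64_t    capacity_lba; /* capacity in logical blocks */
};

/* Sum the element capacities so the host-facing interface can report
 * a single logical volume spanning all backing elements. */
static uint64_t logical_capacity(const struct storage_element *elems, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += elems[i].capacity_lba;
    return total;
}
```

A real implementation could instead stripe, mirror, or tier across the elements; the sketch only shows that the host-visible capacity is decoupled from any single physical device.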
In contrast to conventional NVMe devices, which present a single SSD to a host, the approach disclosed herein “reverses” an NVMe interface such that the interface “talks” into a group, set, or system of storage elements, making the system appear from the outside as an SSD. The resulting interface presents as a direct-attached PCIe storage device that has an NVMe interface to the host but has the entire storage system behind it, thus defining a type of NVMe Direct Attached Storage device (NDAS).
Configurations herein propose an NVMe direct attached storage (NDAS) system by exposing one or more interfaces that perform emulation of an NVMe target register interface to an upstream host or initiator, particularly over PCIe® (Peripheral Component Interconnect Express, or PCI Express®). The NDAS system allows flexibility in abstracting various and possibly dissimilar storage devices, which can include SATA (Serial Advanced Technology Attachment, current specifications available at sata-io.org) HDDs (hard disk drives), SATA SSDs, and PCIe/NVMe SSDs with NAND or other types of non-volatile memory. The storage devices within the NDAS system can then be used to implement various storage optimizations, such as aggregation, caching, and tiering.
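The tiering optimization mentioned above could, for example, take the form of a policy that steers hot or latency-sensitive data to faster media. The following C sketch is a minimal, hypothetical policy under assumed thresholds; the specification does not define any particular tiering algorithm.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical tier identifiers for the dissimilar devices behind the NDAS. */
enum tier { TIER_NVME_SSD, TIER_SATA_SSD, TIER_SATA_HDD };

/* Illustrative tiering policy: frequently accessed ("hot") or latency-sensitive
 * blocks go to faster media; cold blocks go to slower, higher-capacity media.
 * The thresholds are arbitrary placeholders. */
static enum tier choose_tier(uint32_t access_count, bool latency_sensitive)
{
    if (latency_sensitive || access_count > 1000)
        return TIER_NVME_SSD;
    if (access_count > 100)
        return TIER_SATA_SSD;
    return TIER_SATA_HDD;
}
```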
By way of background, NVMe is a scalable host controller interface designed to address the needs of enterprise, data center, and client systems that may employ solid state drives. NVMe is typically employed as an SSD device interface for presenting a storage entity interface to a host. Configurations herein define a storage subsystem interface for an entire storage solution (system) that nonetheless appears as an SSD by presenting an SSD storage interface upstream. NVMe is based on a paired submission and completion queue mechanism: commands are placed by host software into the submission queue, and completions are placed into an associated completion queue by the controller. Multiple submission queues may utilize the same completion queue. The submission and completion queues are allocated in host memory.
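The paired-queue mechanism described above can be pictured, in simplified form, with the following C sketch. The structures and field names are illustrative rather than the exact NVMe-defined layouts, and the queue depth is an arbitrary placeholder.

```c
#include <stdint.h>

#define QUEUE_DEPTH 64  /* illustrative depth */

/* Simplified submission queue entry: a command posted by host software. */
struct sub_entry {
    uint8_t  opcode;    /* e.g., read or write */
    uint16_t cid;       /* command identifier, echoed back on completion */
    uint64_t lba;       /* starting logical block address */
    uint32_t nblocks;   /* number of blocks */
    uint64_t data_ptr;  /* address of the data buffer in host memory */
};

/* Simplified completion queue entry posted by the controller. */
struct cpl_entry {
    uint16_t cid;       /* identifies the completed command */
    uint16_t status;    /* 0 = success in this sketch */
};

/* A paired submission/completion queue, allocated in host memory. */
struct queue_pair {
    struct sub_entry sq[QUEUE_DEPTH];
    struct cpl_entry cq[QUEUE_DEPTH];
    uint32_t sq_tail;   /* host advances when posting a command */
    uint32_t cq_head;   /* host advances when consuming a completion */
};

/* Host side: place a command at the submission queue tail. */
static void submit(struct queue_pair *qp, struct sub_entry cmd)
{
    qp->sq[qp->sq_tail % QUEUE_DEPTH] = cmd;
    qp->sq_tail++;  /* in real NVMe the new tail is then written to a doorbell register */
}
```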
PCIe is a high-speed serial computer expansion bus standard designed to replace the older PCI, PCI-X, and AGP bus standards. PCIe implements improvements over the aforementioned bus standards, including higher maximum system bus throughput, lower I/O pin count and smaller physical footprint, better performance-scaling for bus devices, and a more detailed error detection and reporting mechanism. NVM Express defines an optimized register interface, command set, and feature set for PCI Express-based solid-state drives (SSDs), and is positioned to exploit the potential of PCIe SSDs and to standardize the PCIe SSD interface.
A notable difference between PCIe bus and the older PCI is the bus topology. PCI uses a shared parallel bus architecture, where the PCI host and all connected devices share a common set of address/data/control lines. In contrast, PCIe is based on point-to-point topology, with separate serial links connecting every device to the root complex (host). Due to its shared bus topology, access to the older PCI bus is typically arbitrated (in the case of multiple masters), and limited to one master at a time, in a single direction. Also, the older PCI clocking scheme limits the bus clock to the slowest peripheral on the bus (regardless of the devices involved in the bus transaction). In contrast, a PCIe bus link supports full-duplex communication between any two endpoints, and therefore promotes concurrent access across multiple endpoints.
Configurations herein are based on the observation that current host protocols for interacting with mass storage or non-volatile storage, such as NVMe, tend to be focused on a particular storage device or type of device and may not be well suited to accessing a range of devices. Unfortunately, conventional approaches to host protocols do not lend sufficient flexibility to the arrangement of mass storage devices servicing the host. For example, most personal and/or portable computing devices employ a primary mass storage device, and usually this device is vendor-matched with the particular computing device: most off-the-shelf laptops, smartphones, and audio devices are shipped with a single storage device selected and packaged by the vendor. Conventional devices may not be focused on access to other devices because such access deviates from an expected usage pattern.
Accordingly, configurations herein substantially overcome the above-described shortcomings by providing an interface device, or bridge, that exposes a host-based protocol (host protocol), such as NVMe, to a user computing device, and employs a storage-based protocol (storage protocol) for implementing the storage and retrieval requests, thus broadening the sphere of available devices to those recognized under the storage protocol. For example, NDAS (Network Direct Attached Storage) allows a variety of different storage devices to be interconnected and accessed via a common bus by accommodating different storage mediums (SSD, HDD, optical) and device types (i.e., differing capacities) across the common bus. All users or systems on the network can directly control, use, and share the interconnected storage devices. In this manner, the host-based protocol presents an individual storage device to a user, and a mapper correlates requests made via the host protocol to a plurality of storage elements (i.e., individual drives or other devices) via the storage protocol, thus allowing the plurality of interconnected devices (sometimes referred to as a “storage array” or “disk farm”) to satisfy the requests even though the user device “sees” only a single device under the host protocol.
For example, NVMe facilitates access for SSDs by implementing a plurality of parallel queues for avoiding I/O bottlenecks and for efficient processing of requests stemming from multiple originators. Conventional HDDs are typically expected to yield an I/O-bound implementation, since computed results are likely to be generated faster than conventional HDDs can write them. NVMe is intended to lend itself well to SSDs (over conventional HDDs) by efficiently managing the increased rate at which I/O requests may be satisfied.
Depicted below is an example configuration of a computing and storage environment according to the system, methods, and apparatus disclosed herein. A host computing device (host) interfaces with multiple networked storage devices using the storage interface device (interface device). The disclosed arrangement is an example, and other interconnections and configurations may be employed with the interface device, some of which are depicted further in the accompanying figures.
In the example configuration, the host protocol 114 defines a plurality of host queues 117, including submission and completion queues, for storing commands and payload based on the requests 116 pending transmission to the interface device 150. The mapper 140 maintains a mapping 132 to transfer queues 130 defined in the local memory 164 on the NDAS side for transferring and buffering the data before writing the data to a storage element 144-3 according to the storage protocol 124 (shown as example arrow 134).
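A rough picture of how a pending host request might be staged in a local transfer queue before being written out to a storage element is given in the following C sketch. The structures, the fixed queue depth and block size, and the staging routine are hypothetical stand-ins for the queues 117 and 130 and local memory 164 described above, not a reproduction of the disclosed implementation.

```c
#include <stdint.h>
#include <string.h>

#define XFER_DEPTH 32    /* illustrative transfer queue depth */
#define BLOCK_SIZE 4096  /* illustrative block size in bytes */

/* One staged request: command metadata plus a local copy of the payload. */
struct xfer_entry {
    uint16_t cid;                  /* command identifier from the host queue */
    uint64_t lba;                  /* target logical block address */
    uint8_t  payload[BLOCK_SIZE];  /* data buffered in local memory */
};

struct xfer_queue {
    struct xfer_entry entries[XFER_DEPTH];
    uint32_t head, tail;
};

/* Stage one host request into the local transfer queue on the NDAS side. */
static int stage_request(struct xfer_queue *q, uint16_t cid, uint64_t lba,
                         const uint8_t *host_buf)
{
    if (q->tail - q->head == XFER_DEPTH)
        return -1;  /* queue full; caller retries later */
    struct xfer_entry *e = &q->entries[q->tail % XFER_DEPTH];
    e->cid = cid;
    e->lba = lba;
    memcpy(e->payload, host_buf, BLOCK_SIZE);
    q->tail++;
    return 0;
}
```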
The interface device 150, therefore, includes a host interface responsive to requests issued by a host 110, such that the host interface presents a storage device for access by the host 110. The storage protocol 124 defines all of the plurality of storage elements 142 as a single logical storage volume. In the device 150, a storage interface couples to a plurality of dissimilar storage devices, such that the plurality of storage devices are conversant in a storage protocol common to each of the plurality of storage devices. The storage protocol coalesces logical and physical differences between the individual storage elements so that the storage protocol can present a common, unified interface to the host 110. The mapper 140 connects between the host interface and the storage interface and is configured to map requests 116 received on the host interface to a specific storage element 144 connected to the storage interface, such that the mapped request 116 is indicative of the specific storage element based on the storage protocol, and the specific storage element 144 is independent of the presented storage device so that the host protocol need not specify any parameters concerning which storage element to employ.
The interface device 150 includes FIFO transfer logic in the mapper 140 for mapping requests received on the host interface to a specific storage element 144 connected to the storage interface, such that the mapped request is indicative of the specific storage element 144 based on the storage protocol 124. The host interface presents a single logical storage device corresponding to the plurality of storage elements, and each of the dissimilar storage elements is responsive to the storage protocol for fulfilling the issued requests.
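One simple way such a mapping could be realized, assuming the storage elements' address ranges are merely concatenated into the single presented volume (the description above does not fix a particular scheme), is sketched below in C; the element descriptor and function names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical descriptor for one backing storage element. */
struct storage_element {
    int      id;            /* index of the element (e.g., 144-1, 144-2, ...) */
    uint64_t capacity_lba;  /* capacity in logical blocks */
};

/* Result of mapping a host-visible LBA onto a specific element. */
struct mapping {
    int      element_id;    /* which storage element services the request */
    uint64_t element_lba;   /* LBA within that element */
};

/* Map an LBA of the single presented volume onto a specific storage element,
 * assuming the elements' address ranges are concatenated in order.
 * Returns 0 on success, -1 if the LBA is beyond the aggregate capacity. */
static int map_request(const struct storage_element *elems, size_t n,
                       uint64_t host_lba, struct mapping *out)
{
    for (size_t i = 0; i < n; i++) {
        if (host_lba < elems[i].capacity_lba) {
            out->element_id  = elems[i].id;
            out->element_lba = host_lba;
            return 0;
        }
        host_lba -= elems[i].capacity_lba;
    }
    return -1;
}
```

The point of the sketch is that the host-side request carries no notion of which element is selected; the choice is made entirely on the storage-protocol side of the mapper.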
In the example configuration, employing NVMe as the host protocol, NVMe provides an interface to a plurality of host queues 117. The host queues include submission queues and completion queues, in which the submission queues store pending requests and a corresponding payload, and the completion queues indicate completion of the requests. The submission queues further include command entries and payload entries. A plurality of queues is employed because the speed of SSDs would be compromised by a conventional, single-dimensional (FIFO) queue structure, since each request would be held up waiting for a predecessor request to complete. Submission and completion queues allow concurrent queuing and handling of multiple requests, so that larger and/or slower requests do not impede other requests 116.
In the case of NVMe as the host protocol, usage of the queues further comprises an interface to the shadow memory defined in the accompanying figures.
On the storage protocol side, each of the storage elements 144 may be any suitable physical storage device, such as an SSD, HDD, optical drive (DVD/CD), or flash/NAND device, and may be a hub or gateway to other devices, thus forming a hierarchy (discussed further below).
In the example arrangement, in addition to the submission and completion queues defined by the NVMe protocol, the simplified NVMe protocol in the backend logic 124′ includes direct-mapped locations for data buffers for each command in a particular submission queue 117. The interface device may take any suitable physical configuration, such as within an SSD, as a card in a host or storage array device, or as a standalone device, and may include a microcontroller/processor. Alternatively, the interface device 150 may not require an on-board processor; rather, its functions are either hardware automated or controlled by the NDAS driver/software. The upstream host 110 system uses an NVMe driver for communicating with the NVMe NDAS system. The NDAS system loads a custom driver for the simplified NVMe protocol and runs a custom software application that controls the functionality of the interface card 150, responds to the NVMe commands issued by the host/initiator 110, and manages all the downstream storage devices 144.
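The direct mapping of data buffers to command slots can be pictured as a fixed arithmetic relationship between a submission queue slot and a buffer address, as in the following C sketch. The base address, buffer size, and queue depth are hypothetical placeholders rather than values taken from the specification.

```c
#include <stdint.h>

#define SQ_DEPTH   64                /* illustrative submission queue depth */
#define BUF_SIZE   4096              /* illustrative per-command buffer size in bytes */
#define BUF_REGION 0x10000000ULL     /* hypothetical base of the buffer region */

/* Each command slot in a given submission queue owns a fixed, directly mapped
 * data buffer, so no per-command buffer allocation or lookup is needed. */
static uint64_t command_buffer_addr(uint16_t sq_id, uint16_t slot)
{
    return BUF_REGION + ((uint64_t)sq_id * SQ_DEPTH + slot) * BUF_SIZE;
}
```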
The host protocol 114 is a point-to-point protocol for mapping the requests 116 from the plurality of host queues 117 to a storage element 144, and the storage protocol is responsive to the host protocol 114 for identifying a storage element 144 to satisfy each request, the host protocol referring only to the request and remaining unaware of the storage element handling it. Accordingly, each of the host queues corresponds to a point-to-point link between the host and the common storage entity. The completion queues are responsive to the host protocol for identifying completed requests, the host protocol mapping each request to a corresponding completion entry in the completion queues.
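One way the correspondence between a request and its completion entry might look in practice, assuming the completion carries back the identifier of the command it finishes (as NVMe-style completions do), is sketched below in C. The pending-request table and its names are hypothetical illustration, not part of the disclosed design.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 256  /* illustrative bound on outstanding requests */

/* Host-side record of an outstanding request, indexed by command identifier. */
struct pending_request {
    bool     in_use;
    uint64_t lba;        /* what the request asked for */
    void    *user_ctx;   /* caller context to resume on completion */
};

/* Simplified completion entry: the command identifier ties it back to the
 * submission queue entry that originated the request. */
struct cpl_entry {
    uint16_t cid;
    uint16_t status;
};

static struct pending_request pending[MAX_PENDING];

/* Match a completion to its originating request; the host never needs to know
 * which downstream storage element actually serviced it. */
static struct pending_request *match_completion(const struct cpl_entry *cpl)
{
    struct pending_request *req = &pending[cpl->cid % MAX_PENDING];
    if (!req->in_use)
        return NULL;  /* stale or unknown completion in this sketch */
    req->in_use = false;
    return req;
}
```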
Those skilled in the art should readily appreciate that the programs and methods defined herein are deliverable to a user processing and rendering device in many forms, including but not limited to a) information permanently stored on non-writeable storage media such as ROM devices, b) information alterably stored on writeable non-transitory storage media such as floppy disks, magnetic tapes, CDs, RAM devices, and other magnetic and optical media, or c) information conveyed to a computer through communication media, as in an electronic network such as the Internet or telephone modem lines. The operations and methods may be implemented in a software executable object or as a set of encoded instructions for execution by a processor responsive to the instructions. Alternatively, the operations and methods disclosed herein may be embodied in whole or in part using hardware components, such as Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), state machines, controllers or other hardware components or devices, or a combination of hardware, software, and firmware components.
While the system and methods defined herein have been particularly shown and described with references to embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.