The invention generally relates to the field of storage controllers.
Non-Volatile Dynamic Random Access Memory (NVDRAM) is a combination of volatile memory (e.g., Dynamic Random Access Memory (DRAM)) and non-volatile memory (e.g., FLASH) manufactured on a single device. The non-volatile memory acts as a shadow memory such that data stored in the volatile memory is also stored in the non-volatile memory. When power is removed from the device, the data stored in the non-volatile portion of the NVDRAM remains even though the data stored in the DRAM is lost. When NVDRAM is used in a Host Bus Adapter (HBA), the volatile portion of the NVDRAM is mapped into the address space of the host system, and the host system can directly access this volatile memory without going through a storage protocol stack. This provides a low latency interface to the HBA. However, the capacity of the DRAM is often many times smaller than the capacity of a Solid State Drive (SSD) that may also be included in the HBA, due to power consumption limitations, the usable address space that the HBA can memory-map into the host system, the available die area of the NVDRAM device, etc. Thus, problems arise as to how to efficiently allow a host system of the HBA to have low latency access to the larger SSD while bypassing the storage protocol stack in the host system.
Processing of block Input/Output (I/O) requests for an HBA that includes NVDRAM is performed, in part, by dynamically loading parts of a larger sized SSD of the HBA into a smaller sized DRAM of the HBA. The DRAM operates as a low latency, high speed cache for the SSD and is mapped to host system address space. One embodiment is an apparatus that includes a host system and an HBA. The host system includes a host processor and a host memory. The HBA includes an SSD and DRAM. The DRAM is operable to cache regions of the SSD. The host processor is operable to identify a block I/O read request for the SSD, to identify a region of the SSD that corresponds to the block I/O read request, and to determine if the region of the SSD is cached in the DRAM of the HBA. The host processor, responsive to determining that the region of the SSD is cached in the DRAM of the HBA, is further operable to direct the HBA to perform a memory copy of a block of data for the block I/O read request from the cached region of the SSD to the host memory, and to respond to the block I/O read request for the SSD utilizing the block of data in the host memory. The host processor, responsive to determining that the region of the SSD is not cached by the DRAM of the HBA, is further operable to direct the HBA to cache the region of the SSD in the DRAM of the HBA, to direct the HBA to perform a memory copy of the block of data for the block I/O read request from the cached region of the SSD to the host memory, and to respond to the block I/O read request for the SSD utilizing the block of data in the host memory.
The various embodiments disclosed herein may be implemented in a variety of ways as a matter of design choice. For example, some embodiments herein are implemented in hardware whereas other embodiments may include processes that are operable to construct and/or operate the hardware. Other exemplary embodiments are described below.
Some embodiments of the present invention are now described, by way of example only, and with reference to the accompanying drawings. The same reference number represents the same element or the same type of element on all drawings.
The figures and the following description illustrate specific exemplary embodiments of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within the scope of the invention. Furthermore, any examples described herein are intended to aid in understanding the principles of the invention and are to be construed as being without limitation to such specifically recited examples and conditions. As a result, the invention is not limited to the specific embodiments or examples described below.
CPU 104 is communicatively coupled to HBA 112 and maps DRAM 114 into an address space of OS 110 to allow applications of OS 110 to directly access cached data of SSD 116. For example, OS 110 comprises applications that are used by host system 102 to perform a variety of operations. Some of those applications may be used to access data stored in SSD 116, which might be cached within DRAM 114 of HBA 112. Because DRAM 114 is mapped into an address space of host system 102, any cached data can be directly copied from DRAM 114 to an application buffer in host RAM 106.
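By way of a non-limiting illustration only, the following C fragment sketches how an application might access such a memory-mapped window and copy cached data directly into its own buffer. The device node name and the helper function are hypothetical and are not part of any particular driver interface; the sketch merely assumes that the DRAM window of the HBA has been exposed to user space as a mappable device.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Sketch only: map the HBA's exported DRAM window into user space and
 * copy cached data straight into an application buffer, bypassing the
 * storage protocol stack. "/dev/nvramdisk0" is a hypothetical device node. */
static int copy_from_hba_cache(void *app_buf, size_t len, off_t cache_off)
{
    int fd = open("/dev/nvramdisk0", O_RDWR);
    if (fd < 0)
        return -1;

    size_t map_len = (size_t)cache_off + len;
    void *dram = mmap(NULL, map_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (dram == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(app_buf, (char *)dram + cache_off, len);  /* direct, low latency copy */

    munmap(dram, map_len);
    close(fd);
    return 0;
}
```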
Typically, there are limitations as to the size of DRAM 114 as compared to the size of SSD 116. For instance, the size of DRAM 114 may be limited by power considerations, by cost considerations, or by architecture limitations of host system 102 that may not allow HBA 112 to map large address spaces into OS 110. However, applications can often utilize the large capacity storage that is available through the ever-increasing storage capacities of SSDs, such as SSD 116. Thus, problems exist as to how to allow applications to access the large storage capacity of an SSD while maintaining the benefit of low latency access to DRAM 114 of HBA 112.
In this embodiment, HBA 112 is able to allow the applications to use the larger sized SSD 116 while caching SSD 116 within the smaller sized DRAM 114 by dynamically loading regions of SSD 116 into and out of DRAM 114 as needed. In some cases, it is not possible to map the entirety of SSD 116 into the address space of OS 110, due to limitations in host system 102. For instance, if HBA 112 utilizes a Peripheral Component Interconnect (PCI) or PCI express (PCIe) interface, then the usable address space of HBA 112 may be limited to 4 GigaBytes (GB). However, it is typical for the size of SSD 116 to be much larger than 4 GB.
In the embodiments described, SSD 116 is segmented into a plurality of regions, with each region being able to fit within DRAM 114. For instance, if DRAM 114 is 1 GB and SSD 116 is 100 GB, then SSD 116 may be segmented into a plurality of 4 MegaByte (MB) regions, with controller 118 able to copy regions of SSD 116 into and out of DRAM 114 as needed to respond to I/O requests generated by host system 102 (e.g., by I/O requests generated by applications executing within the environment of OS 110). In the embodiments described, the I/O requests are block I/O requests for SSD 116, which is registered as a block device for host system 102 and/or OS 110. The block I/O requests may bypass the typical protocol stack found in OS 110 to allow for a high speed, low latency interface to SSD 116. This improves the performance of HBA 112.
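Purely as a hypothetical sketch of the arithmetic involved, and using the example sizes above (4 KB blocks and 4 MB regions), the mapping from a block number to its containing region might look like the following C fragment; the constants and function names are illustrative only and are not part of any particular embodiment.

```c
#include <stdint.h>

/* Illustrative sizes taken from the example above. */
#define BLOCK_SIZE          (4u * 1024u)                /* 4 KB block device blocks   */
#define REGION_SIZE         (4u * 1024u * 1024u)        /* 4 MB cacheable SSD regions */
#define BLOCKS_PER_REGION   (REGION_SIZE / BLOCK_SIZE)  /* 1024 blocks per region     */

/* Region of SSD 116 that contains a given block number. */
static inline uint64_t region_of_block(uint64_t block_num)
{
    return block_num / BLOCKS_PER_REGION;
}

/* Byte offset of a block within its containing region. */
static inline uint64_t block_offset_in_region(uint64_t block_num)
{
    return (block_num % BLOCKS_PER_REGION) * (uint64_t)BLOCK_SIZE;
}
```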
During operation of host system 102, OS 110 may register SSD 116 as a block device for I/O operations. As a block device, SSD 116 is read/write accessible by OS 110 in block-sized chunks. For instance, if the smallest unit of data that can be read from, or written to, SSD 116 is 4 KiloBytes (KB), then OS 110 may register SSD 116 as a 4 KB block device to allow applications and/or services executing within OS 110 to read data from, or write data to, SSD 116 in 4 KB sized chunks of data. A simple example of a block device driver for a registered block device may accept parameters that include a starting block number of the device to read from and the number of blocks to read from the device. As applications and/or services executing within OS 110 generate block I/O read requests for SSD 116 (e.g., the applications and/or services are requesting that data be read from SSD 116), CPU 104 of host system 102 monitors this activity and identifies the read requests (see step 202 of FIG. 2).
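A block device interface of the kind just described might, as a non-limiting sketch, accept a request such as the following; the structure and function names are hypothetical and are not tied to any particular operating system's block layer.

```c
#include <stdint.h>

/* Hypothetical block I/O request: a starting block, a block count, and a
 * destination/source buffer in host RAM 106. */
struct blk_io_req {
    uint64_t start_block;   /* first block number on SSD 116            */
    uint32_t num_blocks;    /* number of consecutive blocks to transfer */
    void    *host_buf;      /* application buffer in host RAM 106       */
};

/* Sketch of the driver entry points a registered block device might expose. */
int nvramdisk_read(struct blk_io_req *req);
int nvramdisk_write(struct blk_io_req *req);
```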
In response to identifying the block I/O read request, CPU 104 identifies a region of SSD 116 that corresponds to the block I/O read request (see step 204 of FIG. 2).
In response to identifying a region of SSD 116 that corresponds to the block I/O read request, CPU 104 determines if the region is cached in DRAM 114 (see step 206 of FIG. 2).
One advantage of responding to the block I/O read request using data cached in DRAM 114 is reduced latency. Typically, it is much faster to perform a memory copy or DMA transfer from DRAM 114 to host RAM 106 than it is to wait for SSD 116 to return the data requested by the block I/O read request. Although SSDs in general have lower latencies than rotational media such as Hard Disk Drives, SSDs are still considerably slower than DRAM. Thus, it is desirable to cache SSD 116 in DRAM 114, if possible. However, the difference in size between DRAM 114 and SSD 116 typically precludes the possibility of caching SSD 116 entirely in DRAM 114.
If the region that corresponds to the block I/O read request is not cached in DRAM 114, then CPU 104 directs controller 118 to cache the region of SSD 116 in DRAM 114 (see step 212 of FIG. 2).
In response to caching the region from SSD 116 to DRAM 114, CPU 104 directs controller 118 to copy the block of data for the block I/O read request from the cached region(s) in DRAM 114 to host RAM 106 (see step 208 of FIG. 2).
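Continuing the hypothetical sketches above, the read path of steps 202 through 212 might be expressed as follows. The helpers cache_lookup(), cache_load_region(), and dram_copy_to_host() are illustrative stand-ins for the cache tracking and copy/DMA operations performed under the direction of CPU 104 and controller 118, and the sketch ignores read requests that span more than one region.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers (declarations only). */
bool  cache_lookup(uint64_t region, void **dram_region);   /* is the region in DRAM 114?          */
void *cache_load_region(uint64_t region);                   /* copy the region SSD 116 -> DRAM 114 */
void  dram_copy_to_host(void *dram_region, uint64_t offset,
                        void *host_buf, uint64_t len);      /* memory copy or DMA to host RAM 106  */

int nvramdisk_read(struct blk_io_req *req)
{
    uint64_t region = region_of_block(req->start_block);    /* step 204: identify the region */
    void *dram_region;

    if (!cache_lookup(region, &dram_region))                 /* step 206: cached? */
        dram_region = cache_load_region(region);             /* step 212: cache the region */

    dram_copy_to_host(dram_region,                           /* step 208: copy the block to host RAM */
                      block_offset_in_region(req->start_block),
                      req->host_buf,
                      (uint64_t)req->num_blocks * BLOCK_SIZE);
    return 0;
}
```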
In some cases, applications and/or services executing within OS 110 may generate block I/O write requests for SSD 116 (e.g., the applications and/or services are attempting to modify data stored on SSD 116).
During operation of host system 102, applications and/or services executing within OS 110 generate block I/O write requests for SSD 116. CPU 104 of host system 102 monitors this activity and identifies the write requests (see step 302 of FIG. 3).
In response to identifying the block I/O write request, CPU 104 identifies a region of SSD 116 that corresponds to the block I/O write request (see step 304 of FIG. 3).
In response to identifying a region of SSD 116 that corresponds to the block I/O write request, CPU 104 determines if the region is cached in DRAM 114 (see step 306 of FIG. 3).
One advantage of responding to the block I/O write request by writing data to DRAM 114 rather than SSD 116 is reduced latency. Typically, it is much faster to perform a memory copy or DMA transfer from host RAM 106 to DRAM 114 than it is to wait for SSD 116 to finish a write operation for the new data. Although SSDs in general have lower latencies than rotational media such as Hard Disk Drives, writing to SSDs is still considerably slower than writing to DRAM. Thus, it is desirable to cache write requests for SSD 116 in DRAM 114, if possible. However, the difference in size between DRAM 114 and SSD 116 typically precludes the possibility of caching SSD 116 entirely in DRAM 114.
If the region that corresponds to the block I/O write request is not cached in DRAM 114, then CPU 104 directs controller 118 to cache the region of SSD 116 in DRAM 114 (see step 310 of FIG. 3).
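Continuing the same hypothetical sketch, the write path of steps 302 through 310 mirrors the read path, except that the data moves from host RAM 106 into DRAM 114 and the affected region is marked dirty so that it can later be flushed back to SSD 116; the helper names are again illustrative only.

```c
/* Additional hypothetical helpers for the write path. */
void host_copy_to_dram(void *host_buf, void *dram_region,
                       uint64_t offset, uint64_t len);      /* host RAM 106 -> DRAM 114 */
void mark_region_dirty(uint64_t region);                    /* remember to flush DRAM -> SSD later */

int nvramdisk_write(struct blk_io_req *req)
{
    uint64_t region = region_of_block(req->start_block);    /* step 304: identify the region */
    void *dram_region;

    if (!cache_lookup(region, &dram_region))                 /* step 306: cached? */
        dram_region = cache_load_region(region);             /* step 310: cache the region */

    host_copy_to_dram(req->host_buf, dram_region,            /* write into the cached region */
                      block_offset_in_region(req->start_block),
                      (uint64_t)req->num_blocks * BLOCK_SIZE);
    mark_region_dirty(region);                               /* flushed back to SSD 116 later */
    return 0;
}
```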
As discussed previously, in some cases CPU 104 may execute a service and/or a device I/O layer within OS 110, which is illustrated in FIG. 4.
If the region is cached and the block I/O request is a write request (e.g., application 408 is writing to a file stored on SSD 416), then NVRAMDISK layer 412 copies the data from host RAM 404 to DRAM 414. NVRAMDISK layer 412 may then mark the region as dirty and, at some point, flush the dirty data stored by DRAM 414 back to SSD 416. This ensures that SSD 416 stores the most up-to-date copy of data written to cached regions in DRAM 414.
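A background flush of dirty regions, as described above, might be sketched as follows; the per-slot cache state and the write_region_to_ssd() helper are hypothetical stand-ins for whatever tracking mechanism NVRAMDISK layer 412 (or controller 118) actually employs.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_CACHE_SLOTS 256   /* e.g., 1 GB of DRAM divided into 4 MB regions (illustrative) */

/* Hypothetical per-slot cache state. */
struct cache_slot {
    uint64_t region;   /* which SSD region currently occupies this slot      */
    bool     valid;    /* slot holds a cached region                         */
    bool     dirty;    /* region was written by the host since it was loaded */
    void    *dram;     /* address of the slot within DRAM 414                */
};

/* Hypothetical helper: write one cached region from DRAM back to the SSD. */
void write_region_to_ssd(uint64_t region, void *dram);

/* Flush every dirty region so that SSD 416 holds the most up-to-date copy. */
void flush_dirty_regions(struct cache_slot *slots)
{
    for (int i = 0; i < NUM_CACHE_SLOTS; i++) {
        if (slots[i].valid && slots[i].dirty) {
            write_region_to_ssd(slots[i].region, slots[i].dram);
            slots[i].dirty = false;   /* region is now clean */
        }
    }
}
```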
If the region is not cached, whether for a file write or a file read, then NVRAMDISK layer 412 copies the region from SSD 416 into DRAM 414. NVRAMDISK layer 412 may then operate as discussed above to perform the file read/write operations generated by application 408.
The invention can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements. In a preferred embodiment, the invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
Furthermore, the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium 606 providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer-readable medium 606 can be any apparatus that can contain, store, communicate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The medium 606 can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device). Examples of a computer-readable medium 606 include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk-read only memory (CD-ROM), compact disk-read/write (CD-R/W) and DVD.
A data processing system suitable for storing and/or executing program code will include at least one processor 602 coupled directly or indirectly to memory elements 608 through a system bus 610. The memory elements 608 can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code is retrieved from bulk storage during execution.
Input/output or I/O devices 604 (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers.
Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems (e.g., through host systems interfaces 612) or to remote printers or storage devices through intervening private or public networks. Modems, cable modems, and Ethernet cards are just a few of the currently available types of network adapters.