Embodiments described herein generally relate to an apparatus and method for persisting blocks of data and metadata in a non-volatile memory (NVM) cache.
In current host side caching solutions, caching devices, such as Dynamic Random Access Memory (DRAM) devices and Solid State Disks (SSDs), can cache data for a storage array, such as an array of hard disk drives. Caching systems maintain modified data in the cache by mapping logical block addresses in the storage device to cache lines, writing data in the cache lines to a second non-volatile cache, such as an SSD, writing the metadata to the second non-volatile cache for blocks written to the second non-volatile cache, and then returning complete after both the block of data and metadata are written to the second non-volatile cache. Writing user blocks and metadata in cache to the second non-volatile cache is required for a write-back cache when dirty data is inserted into the cache to provide for recovery from a power failure. In the event of a power failure, a recovery/rebuild procedure can recover dirty user data from the second non-volatile cache and write it to the primary storage device, since cache metadata is available on the second non-volatile cache. The metadata provides the critical mapping information on the status of the data in the cache line, such as whether the cache line is clean or dirty. Typically, the metadata is located at the beginning or the end of the second non-volatile cache, such as an SSD.
There is a need in the art for improved techniques for managing modified data and metadata in caching devices.
Embodiments are described by way of example, with reference to the accompanying drawings, which are not drawn to scale, in which like reference numerals refer to similar elements.
Described embodiments provide a non-volatile memory (NVM) cache to cache data for a storage, where the NVM cache supports extended size blocks that are larger than the standard size blocks stored in the storage for which data is being cached. The extended size blocks in the NVM cache may store both blocks of data from the storage, having the standard (first) block size, and metadata for the blocks of data. Further, when user blocks of data of the first or standard sector size, e.g., 512 bytes or 4 KB, are stored together with metadata in a larger second (extended) size block configured for logical addresses in the NVM cache, read and write commands may be provided that access the blocks of data, the metadata, or both from the second size blocks.
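The layout of an extended size block can be illustrated with a short sketch. This is not the described device format: the 512-byte data size comes from the text above, while the 8-byte metadata size, the field order (a flag byte, padding, then a storage address), and the names pack_extended_block and unpack_extended_block are assumptions made only for illustration.

```python
import struct

DATA_SIZE = 512                 # standard (first) block size, from the text
META_FORMAT = "<BxxxI"          # assumed: 1 flag byte, 3 pad bytes, 4-byte storage address
META_SIZE = struct.calcsize(META_FORMAT)    # 8 bytes in this sketch
EXTENDED_SIZE = DATA_SIZE + META_SIZE       # extended (second) block size

FLAG_VALID = 0x01               # hypothetical bit assignments
FLAG_DIRTY = 0x02


def pack_extended_block(data, valid, dirty, storage_address):
    """Place one standard-size block of data and its metadata in one extended block."""
    if len(data) != DATA_SIZE:
        raise ValueError("data must be exactly one standard-size block")
    flags = (FLAG_VALID if valid else 0) | (FLAG_DIRTY if dirty else 0)
    return data + struct.pack(META_FORMAT, flags, storage_address)


def unpack_extended_block(block):
    """Split an extended block back into data and decoded metadata fields."""
    if len(block) != EXTENDED_SIZE:
        raise ValueError("block must be exactly one extended-size block")
    data, meta = block[:DATA_SIZE], block[DATA_SIZE:]
    flags, storage_address = struct.unpack(META_FORMAT, meta)
    return data, bool(flags & FLAG_VALID), bool(flags & FLAG_DIRTY), storage_address
```

Because the data and the metadata occupy one logical address in this sketch, a single write of the extended block carries both, which is the property the described embodiments rely on for power fail safety.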
By allowing for metadata-only reads and metadata-only writes, data transfer time and resource consumption for synchronization operations are substantially less than in systems that read the blocks of data together with their metadata. Yet further, since metadata and blocks of data may be written into the NVM cache, such as an SSD, in a single atomic operation, performance is improved while maintaining power fail safety because both the blocks of data and their metadata are preserved in the NVM cache together.
In the following description, numerous specific details such as logic implementations, opcodes, means to specify operands, resource partitioning/sharing/duplication implementations, types and interrelationships of system components, and logic partitioning/integration choices are set forth in order to provide a more thorough understanding of the present invention. It will be appreciated, however, by one skilled in the art that the invention may be practiced without such specific details. In other instances, control structures, gate level circuits and full software instruction sequences have not been shown in detail in order not to obscure the invention. Those of ordinary skill in the art, with the included descriptions, will be able to implement appropriate functionality without undue experimentation.
References in the specification to “one embodiment,” “an embodiment,” “an example embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Certain embodiments relate to storage device electronic assemblies. Embodiments include both devices and methods for forming electronic assemblies.
The NVM cache 114 provides a cache memory for data in the storage 116. A cache mapping may be used to map the cached data in the NVM cache 114 to the block addresses in the storage 116.
The NVM cache 114 may be comprised of a solid state drive (SSD) or other type of non-volatile memory device allowing for an extended sector size to store at logical addresses both user data and metadata. For instance, the NVM cache 114 may comprise non-volatile memory types, such as a Flash Memory (NAND dies of flash memory cells), a non-volatile dual in-line memory module (NVDIMM), DIMM, Static Random Access Memory (SRAM), ferroelectric random-access memory (FeTRAM), nanowire-based non-volatile memory, three-dimensional (3D) cross-point memory, phase change memory (PCM), memory that incorporates memristor technology, Magnetoresistive random-access memory (MRAM), Spin Transfer Torque (STT)-MRAM and other electrically erasable programmable read only memory (EEPROM) type devices.
The cache manager 104 determines whether data requested by the processor 102 is in the NVM cache 114, and if not, the cache manager 104 fetches the requested data from the storage 116 to stage into the NVM cache 114. If the requested data is in the NVM cache 114, then it is returned from the faster access NVM cache 114.
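The hit and miss handling described above can be sketched as follows. This is a simplified model under stated assumptions: the storage and the NVM cache are represented by plain dictionaries keyed by storage block address, and the class name CacheManagerSketch and its counters are hypothetical.

```python
class CacheManagerSketch:
    """Toy model of read handling: serve hits from the cache, stage misses from storage."""

    def __init__(self, storage):
        self.storage = storage    # storage block address -> bytes (slower tier)
        self.nvm_cache = {}       # storage block address -> bytes (faster tier)
        self.hits = 0
        self.misses = 0

    def read(self, address):
        # Hit: the requested data is returned from the faster access cache.
        if address in self.nvm_cache:
            self.hits += 1
            return self.nvm_cache[address]
        # Miss: fetch from the storage and stage the block into the cache.
        self.misses += 1
        data = self.storage[address]
        self.nvm_cache[address] = data
        return data
```

A first read of an address is a miss that stages the block; a repeated read of the same address is then served from the cache.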
The system 100 may also communicate with Input/Output (I/O) devices, which may comprise input devices (e.g., keyboard, touchscreen, mouse, etc.), display devices, graphics cards, ports, network interfaces, etc.
In certain embodiments, the block of data in the cache lines 200 and for the logical addresses in the storage 116 may comprise a standard sector size of data, such as 512 bytes or 4 KB (4096 bytes). The NVM cache 114 may comprise an SSD implementing an extended sector or block size greater than the standard block size. One example of an SSD implementing such an extended sector size is the variable sector size feature of DC P3700 SSDs produced by Intel Corporation.
The main memory 110 stores an operating system 120 executed by the processor and a cache driver 122 used to communicate with the NVM cache 114 via commands sent over the bus 118, and includes a data buffer 124 and a metadata buffer 126 used to transfer blocks of data and metadata for the cache lines 200 in the NVM cache 114. The cache driver 122 may be part of or separate from the operating system 120.
The NVM cache 114 includes an NVM controller 130 to execute Input/Output (I/O) commands to transfer blocks of data between the cache manager 104 and a plurality of storage dies 132, implementing storage cells that may be organized into pages of storage cells, where the pages are organized into blocks. In an embodiment, the NVM cache 114 may comprise an SSD of NAND storage dies 132. The NVM controller 130 performs logical-to-physical mapping and provides a mapping of logical addresses to which I/O requests are directed and physical addresses in the storage dies 132. The cache 112 may be stored in one or more of the storage dies 132.
With the described embodiments, to transfer blocks of data and metadata, the cache manager 104 notifies the cache driver 122 of the logical address range to transfer, and the cache driver 122 handles the transfer to the NVM cache 114.
The commands of the described embodiments utilize the extended sector size of data in the NVM cache 114, such that the block of data may comprise a standard first sector size and the block of data and metadata may be stored together in a second size block of the extended (second) sector size in the NVM cache 114. The described commands allow the metadata and the block of data to be requested separately or together, which makes access more efficient by minimizing data transfer operations, processor utilization and bus bandwidth.
Writing data in interleaved mode requires the cache manager 104 to interleave the blocks of data with the metadata, which requires allocating a buffer in the memory 110 large enough to store a second size block of data and metadata. This interleaving requires additional operations and processor utilization. The non-interleaved mode transfers the metadata and blocks of data separately and does not require the interleaving of the data.
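The difference between the two transfer modes can be sketched as follows. In interleaved mode the host copies each block of data followed by its metadata into one buffer; in non-interleaved mode the host supplies a data buffer and a metadata buffer and the device pairs the entries. The function names and the device_combine model of the device side are assumptions made for illustration, as is the 8-byte metadata size used in the usage note.

```python
def build_interleaved_buffer(data_blocks, metadata_entries):
    """Interleaved mode: the host lays out data, metadata, data, metadata, ... in one buffer."""
    if len(data_blocks) != len(metadata_entries):
        raise ValueError("one metadata entry is needed per block of data")
    buf = bytearray()
    for data, meta in zip(data_blocks, metadata_entries):
        buf += data    # extra host-side copy of the block of data
        buf += meta    # followed by its metadata
    return bytes(buf)


def build_separate_buffers(data_blocks, metadata_entries):
    """Non-interleaved mode: data and metadata stay in two separate buffers."""
    if len(data_blocks) != len(metadata_entries):
        raise ValueError("one metadata entry is needed per block of data")
    return b"".join(data_blocks), b"".join(metadata_entries)


def device_combine(data_buffer, metadata_buffer, data_size, meta_size):
    """Hypothetical device-side step: pair entry i of each buffer into one extended block."""
    count = len(data_buffer) // data_size
    return [
        data_buffer[i * data_size:(i + 1) * data_size]
        + metadata_buffer[i * meta_size:(i + 1) * meta_size]
        for i in range(count)
    ]
```

With two 512-byte blocks of data and two 8-byte metadata entries, both paths yield the same two 520-byte extended blocks; the difference in this sketch is only where the pairing work is done.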
Upon receiving (at block 1310) one or more instances of blocks of data for one of the read data only commands 300, the cache manager 104 calls the storage driver 128 to write (at block 1312) the read logical blocks of data, having the first block (or sector) size, to the storage 116. For each block of data written to the storage 116, the cache manager 104 generates (at block 1314) modified metadata indicating, such as in the dirty flag 204, that the read block of data for the logical address is clean (e.g., consistent with the data for the block in the storage 116). Each of the generated modified metadata instances is stored in the metadata buffer 126. The cache manager 104 calls the cache driver 122 to generate (at block 1318) a write metadata only command 500 indicating the start logical address 604 of the block of data associated with the first generated metadata, the number of consecutive logical addresses 606 for which modified metadata is generated, and the location of the metadata buffer 608. The generated write metadata command 500 is sent to the NVM cache 114 to update the metadata for the cached logical addresses to indicate that the blocks of data for the logical addresses are unmodified, as the data has been synchronized to the storage 116.
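The synchronization flow can be sketched with a toy in-memory model. This is a minimal illustration, not the described command set: the class NvmCacheModel, its method names, and the dictionary used for the storage are hypothetical, and the sketch handles one logical address at a time rather than batching consecutive addresses into a single command.

```python
class NvmCacheModel:
    """Toy stand-in for the NVM cache: each logical address holds data plus metadata."""

    def __init__(self):
        self.slots = {}   # cache logical address -> {"data", "dirty", "storage_addr"}

    def write_block(self, lba, data, dirty, storage_addr):
        # Data and metadata are stored together at one logical address.
        self.slots[lba] = {"data": data, "dirty": dirty, "storage_addr": storage_addr}

    def read_metadata_only(self, start, count):
        return [(self.slots[l]["dirty"], self.slots[l]["storage_addr"])
                for l in range(start, start + count)]

    def read_data_only(self, start, count):
        return [self.slots[l]["data"] for l in range(start, start + count)]

    def write_metadata_only(self, start, dirty_flags):
        for offset, dirty in enumerate(dirty_flags):
            self.slots[start + offset]["dirty"] = dirty


def synchronize(cache, storage, start, count):
    """Flush dirty blocks to storage, then mark them clean with metadata-only writes."""
    metadata = cache.read_metadata_only(start, count)    # no user data is transferred here
    flushed = 0
    for offset, (dirty, storage_addr) in enumerate(metadata):
        if not dirty:
            continue                                     # clean blocks are never read
        lba = start + offset
        (data,) = cache.read_data_only(lba, 1)
        storage[storage_addr] = data                     # write back to the storage
        cache.write_metadata_only(lba, [False])          # now consistent with the storage
        flushed += 1
    return flushed
```

Only the dirty blocks have their data read from the cache, and only their metadata is rewritten afterwards.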
With the described read and write metadata only and data only commands, the cache manager 104 may conserve bandwidth and processor cycles by accessing only the metadata or only the data as needed, which reduces the amount of data that must be retrieved to perform recovery, synchronization, and cache restore operations.
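The potential saving can be estimated with simple arithmetic. The sketch below compares reading every extended block against reading all metadata and then only the dirty data. The function name and all numbers in the usage note (1000 cache lines, 100 dirty, 4096-byte data, 64-byte metadata) are illustrative assumptions, not figures from the described embodiments.

```python
def bytes_transferred(num_lines, num_dirty, data_size, meta_size, metadata_only):
    """Bytes read from the cache to locate and fetch the dirty blocks of data."""
    if num_dirty > num_lines:
        raise ValueError("dirty lines cannot exceed total lines")
    if metadata_only:
        # Read every metadata entry, then only the data of the dirty lines.
        return num_lines * meta_size + num_dirty * data_size
    # Without metadata-only reads, every extended block is read in full.
    return num_lines * (data_size + meta_size)
```

Under these assumed numbers, the full read moves 1000 * (4096 + 64) = 4,160,000 bytes, while the metadata-first approach moves 1000 * 64 + 100 * 4096 = 473,600 bytes. The benefit shrinks as the fraction of dirty lines grows.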
It should be appreciated that reference throughout this specification to “one embodiment” or “an embodiment” means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Therefore, it is emphasized and should be appreciated that two or more references to “an embodiment” or “one embodiment” or “an alternative embodiment” in various portions of this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined as suitable in one or more embodiments of the invention.
Similarly, it should be appreciated that in the foregoing description of embodiments of the invention, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed subject matter requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description.
The reference characters used herein, such as i, denote a variable number of instances of an element, which may represent the same or different values, and may represent the same or different value when used with different or the same elements in different described instances.
The following examples pertain to further embodiments.
Example 1 is an apparatus to cache blocks of data and metadata for a storage, wherein the storage stores blocks having a first block size, comprising: a non-volatile memory (NVM) cache to cache blocks of data from the storage of the first block size and metadata for each of the cached blocks of data indicating a status of the cached block of data, including whether the block of data is modified or unmodified, and a location in the storage where the block of data is stored, wherein the non-volatile memory has blocks of a second block size greater than the first block size, wherein one of the blocks in the non-volatile memory stores the block of data from the storage and the metadata for the block of data; a cache manager to: write both the block of data and the metadata for the block of data to one of the blocks in the non-volatile memory cache; and write the block of data in one of the blocks in the non-volatile memory cache to the storage.
In Example 2, the subject matter of examples 1 and 3-10 can optionally include that the metadata for each of the blocks of data indicates whether the block of data in the NVM cache is valid or invalid and modified or unmodified.
In Example 3, the subject matter of examples 1, 2 and 4-10 can optionally include that the block of data and the metadata in the second size block are written to the NVM cache in a single write operation.
In Example 4, the subject matter of examples 1-3 and 5-10 can optionally include that the cache manager is further to: issue a read only metadata command to read only metadata from the NVM cache; determine from the read metadata the blocks of data that are indicated as modified; read only the blocks of data from the NVM cache determined from the read metadata to have modified data; and write only the read block of data to the location in the storage indicated in the read metadata for the read block of data.
In Example 5, the subject matter of examples 1-4 and 6-10 can optionally include that the cache manager is further to: generate modified metadata for the read block of data written to the storage indicating that the read block of data is unmodified; and write only the modified metadata for the read block of data written to the storage to the NVM cache.
In Example 6, the subject matter of examples 1-5 and 7-10 can optionally include that the issue the read only metadata command, the determine from the read metadata, the read only the blocks of data, and the write only the read blocks of data are performed in response to a recovery operation to recover from a power failure at the apparatus.
In Example 7, the subject matter of examples 1-6 and 8-10 can optionally include that the issue the read only metadata command, the determine from the read metadata, the read only the blocks of data, and the write only the read blocks of data are performed as part of a synchronization operation to write modified blocks of data in the NVM cache to the storage.
In Example 8, the subject matter of examples 1-7 and 9-10 can optionally include that the write the second size block having the block of data and the metadata comprises: write to a write buffer blocks of data for multiple cache lines in the NVM cache and the metadata for the blocks of data, wherein the blocks of data are interleaved with the metadata, wherein each block of data interleaved with the metadata combined has the second block size; generate a write command indicating a number of the second size blocks to write and the write buffer with the blocks of data interleaved with the metadata for multiple second size blocks; and transmit the write command to the NVM cache to cause the NVM cache to access the blocks of data interleaved with the metadata in the indicated write buffer to write the second size blocks to the NVM cache.
In Example 9, the subject matter of examples 1-8 and 10 can optionally include that the write the second size block having the block of data and the metadata comprises: write to a data buffer blocks of data for multiple cache lines in the NVM cache; write to a metadata buffer the metadata for the blocks of data written to the data buffer; generate a write command indicating a number of the second size blocks to write, the data buffer, and the metadata buffer; and transmit the write command to the NVM cache to cause the NVM cache to, for each of the number of the second size blocks to write, access the block of data from the indicated write buffer and the metadata from the indicated metadata buffer to write together to a second size block in the NVM cache.
In Example 10, the subject matter of examples 1-9 can optionally include that the NVM cache comprises at least one solid state drive (SSD) configured with blocks having the second block size which comprises an extended block size to configure for the at least one solid state drive, and wherein the NVM cache is a faster access device than the storage.
Example 11 is a system to cache blocks of data and metadata, comprising: a processor; a storage; a non-volatile memory (NVM) cache to cache blocks of data from the storage of the first block size and metadata for each of the cached blocks of data indicating a status of the cached block of data, including whether the block of data is modified or unmodified, and a location in the storage where the block of data is stored, wherein the non-volatile memory has blocks of a second block size greater than the first block size, wherein one of the blocks in the non-volatile memory stores the block of data from the storage and the metadata for the block of data; a cache manager executed by the processor to: write both the block of data and the metadata for the block of data to one of the blocks in the non-volatile memory cache; and write the block of data in one of the blocks in the non-volatile memory cache to the storage.
In Example 12, the subject matter of claims 11 and 13-18 can optionally include that the metadata for each of the blocks of data indicates whether the block of data in the NVM cache is valid or invalid and modified or unmodified.
In Example 13, the subject matter of claims 11, 12 and 14-18 can optionally include that the block of data and the metadata in the second size block are written to the NVM cache in a single write operation.
In Example 14, the subject matter of claims 11-13 and 15-18 can optionally include that the cache manager is further to: issue a read only metadata command to read only metadata from the NVM cache; determine from the read metadata the blocks of data that are indicated as modified; read only the blocks of data from the NVM cache determined from the read metadata to have modified data; and write only the read block of data to the location in the storage indicated in the read metadata for the read block of data.
In Example 15, the subject matter of claims 11-14 and 16-18 can optionally include that the cache manager is further to: generate modified metadata for the read block of data written to the storage indicating that the read block of data is unmodified; and write only the modified metadata for the read block of data written to the storage to the NVM cache.
In Example 16, the subject matter of claims 11-15 and 17-18 can optionally include that the write the second size block having the block of data and the metadata comprises: write to a write buffer blocks of data for multiple cache lines in the NVM cache and the metadata for the blocks of data, wherein the blocks of data are interleaved with the metadata, wherein each block of data interleaved with the metadata combined has the second block size; generate a write command indicating a number of the second size blocks to write and the write buffer with the blocks of data interleaved with the metadata for multiple second size blocks; and transmit the write command to the NVM cache to cause the NVM cache to access the blocks of data interleaved with the metadata in the indicated write buffer to write the second size blocks to the NVM cache.
In Example 17, the subject matter of claims 11-16 and 18 can optionally include that the write the second size block having the block of data and the metadata comprises: write to a data buffer blocks of data for multiple cache lines in the NVM cache; write to a metadata buffer the metadata for the blocks of data written to the data buffer; generate a write command indicating a number of the second size blocks to write, the data buffer, and the metadata buffer; and transmit the write command to the NVM cache to cause the NVM cache to, for each of the number of the second size blocks to write, access the block of data from the indicated write buffer and the metadata from the indicated metadata buffer to write together to a second size block in the NVM cache.
In Example 18, the subject matter of claims 11-17 can optionally include that the NVM cache comprises at least one solid state drive (SSD) configured with blocks having the second block size which comprises an extended block size to configure for the at least one solid state drive, and wherein the NVM cache is a faster access device than the storage.
Example 19 is a method for caching blocks of data and metadata for a storage in a computer system, comprising: caching, in a non-volatile memory (NVM) cache, blocks of data from the storage of the first block size and metadata for each of the cached blocks of data indicating a status of the cached block of data, including whether the block of data is modified or unmodified, and a location in the storage where the block of data is stored, wherein the non-volatile memory has blocks of a second block size greater than the first block size, wherein one of the blocks in the non-volatile memory stores the block of data from the storage and the metadata for the block of data; writing both the block of data and the metadata for the block of data to one of the blocks in the non-volatile memory cache; and writing the block of data in one of the blocks in the non-volatile memory cache to the storage.
In Example 20, the subject matter of claims 19 and 21-25 can optionally include that the block of data and the metadata in the second size block are written to the NVM cache in a single write operation.
In Example 21, the subject matter of claims 19, 20 and 22-25 can optionally include issuing a read only metadata command to read only metadata from the NVM cache; determining from the read metadata the blocks of data that are indicated as modified; reading only the blocks of data from the NVM cache determined from the read metadata to have modified data; and writing only the read block of data to the location in the storage indicated in the read metadata for the read block of data.
In Example 22, the subject matter of claims 19-21 and 23-25 can optionally include generating modified metadata for the read block of data written to the storage indicating that the read block of data is unmodified; and writing only the modified metadata for the read block of data written to the storage to the NVM cache.
In Example 23, the subject matter of claims 19-22 and 24-25 can optionally include that the writing the second size block having the block of data and the metadata comprises: writing to a write buffer blocks of data for multiple cache lines in the NVM cache and the metadata for the blocks of data, wherein the blocks of data are interleaved with the metadata, wherein each block of data interleaved with the metadata combined has the second block size; generating a write command indicating a number of the second size blocks to write and the write buffer with the blocks of data interleaved with the metadata for multiple second size blocks; and transmitting the write command to the NVM cache to cause the NVM cache to access the blocks of data interleaved with the metadata in the indicated write buffer to write the second size blocks to the NVM cache.
In Example 24, the subject matter of claims 19-23 and 25 can optionally include that the writing the second size block having the block of data and the metadata comprises: writing to a data buffer blocks of data for multiple cache lines in the NVM cache; writing to a metadata buffer the metadata for the blocks of data written to the data buffer; generating a write command indicating a number of the second size blocks to write, the data buffer, and the metadata buffer; and transmitting the write command to the NVM cache to cause the NVM cache to, for each of the number of the second size blocks to write, access the block of data from the indicated write buffer and the metadata from the indicated metadata buffer to write together to a second size block in the NVM cache.
In Example 25, the subject matter of claims 19-24 can optionally include that the NVM cache comprises at least one solid state drive (SSD) configured with blocks having the second block size which comprises an extended block size to configure for the at least one solid state drive, and wherein the NVM cache is a faster access device than the storage.
In Example 26, the subject matter as claimed in claim 19, further comprising at least any one of:
(1) wherein the metadata for each of the blocks of data indicates whether the block of data in the NVM cache is valid or invalid and modified or unmodified; and/or
(2) wherein the block of data and the metadata in the second size block are written to the NVM cache in a single write operation; and/or
(3) issuing a read only metadata command to read only metadata from the NVM cache; determining from the read metadata the blocks of data that are indicated as modified; reading only the blocks of data from the NVM cache determined from the read metadata to have modified data; and writing only the read block of data to the location in the storage indicated in the read metadata for the read block of data; and/or
(4) generating modified metadata for the read block of data written to the storage indicating that the read block of data is unmodified; and writing only the modified metadata for the read block of data written to the storage to the NVM cache; and/or
(5) wherein the issuing of the read only metadata command, the determining from the read metadata, the reading of only the blocks of data, and the writing of only the read blocks of data are performed in response to a recovery operation to recover from a power failure at the apparatus; and/or
(6) wherein the issuing of the read only metadata command, the determining from the read metadata, the reading of only the blocks of data, and the writing of only the read blocks of data are performed as part of a synchronization operation to write modified blocks of data in the NVM cache to the storage; and/or
(7) wherein the writing the second size block having the block of data and the metadata comprises: writing to a write buffer blocks of data for multiple cache lines in the NVM cache and the metadata for the blocks of data, wherein the blocks of data are interleaved with the metadata, wherein each block of data interleaved with the metadata combined has the second block size; generating a write command indicating a number of the second size blocks to write and the write buffer with the blocks of data interleaved with the metadata for multiple second size blocks; and transmitting the write command to the NVM cache to cause the NVM cache to access the blocks of data interleaved with the metadata in the indicated write buffer to write the second size blocks to the NVM cache; and/or
(8) wherein the writing the second size block having the block of data and the metadata comprises: writing to a data buffer blocks of data for multiple cache lines in the NVM cache; writing to a metadata buffer the metadata for the blocks of data written to the data buffer; generating a write command indicating a number of the second size blocks to write, the data buffer, and the metadata buffer; and transmitting the write command to the NVM cache to cause the NVM cache to, for each of the number of the second size blocks to write, access the block of data from the indicated write buffer and the metadata from the indicated metadata buffer to write together to a second size block in the NVM cache; and/or
(9) wherein the NVM cache comprises at least one solid state drive (SSD) configured with blocks having the second block size which comprises an extended block size to configure for the at least one solid state drive, and wherein the NVM cache is a faster access device than the storage.
Example 27 is an apparatus for caching blocks of data and metadata for a storage in a computer system, comprising: means for caching, in a non-volatile memory (NVM) cache, blocks of data from the storage of the first block size and metadata for each of the cached blocks of data indicating a status of the cached block of data, including whether the block of data is modified or unmodified, and a location in the storage where the block of data is stored, wherein the non-volatile memory has blocks of a second block size greater than the first block size, wherein one of the blocks in the non-volatile memory stores the block of data from the storage and the metadata for the block of data; means for writing both the block of data and the metadata for the block of data to one of the blocks in the non-volatile memory cache; and means for writing the block of data in one of the blocks in the non-volatile memory cache to the storage.
Example 28 is a machine-readable storage including machine-readable instructions, when executed, to implement a method or realize an apparatus as claimed in any preceding claim.