The invention relates generally to data storage, and more specifically to caching.
In a storage system, a host transmits requests to a storage controller in order to store or retrieve data. The host requests can indicate that data should be written to, or read from, one or more Logical Block Addresses (LBAs) of a logical volume. The storage controller processes incoming host requests to correlate the requested LBAs with physical addresses on one or more storage devices that store data for the volume. The storage controller can translate a host request into individual Input/Output (I/O) operations that are each directed to a storage device for the logical volume, in order to retrieve or store data at the correlated physical addresses. Storage controllers are just one example of the many electronic devices that utilize caches in order to enhance their overall speed of processing.
Systems and methods herein provide for enhanced cache flushing techniques that use linked lists to determine which lines of dirty (unsynchronized) cache data should be flushed from a write cache to a storage device, in order to synchronize the storage device with the cache. In one embodiment, a linked list can be ordered in a manner that ensures lines of the cache are flushed to a storage device in ascending or descending order of block address. This provides a substantial decrease in latency when a large number of cache lines are flushed to a storage device comprising a spinning hard disk.
One exemplary embodiment is a system that includes a memory, an interface, and an Input/Output (I/O) processor. The memory implements a cache divided into multiple cache lines, and the interface is able to receive I/O directed to a block address of a storage device. The I/O processor is able to determine a remainder by dividing the block address by the number of cache lines, and to select a cache line for storing the I/O based on the remainder. The I/O processor is further able to determine a quotient by dividing the block address by the number of cache lines, and to associate the quotient with the selected cache line. Additionally, the I/O processor is able to populate a linked list by inserting entries into the linked list that each point to a different cache line associated with the same quotient, and to flush the cache lines to the storage device in block address order by traversing the entries of the linked list.
Other exemplary embodiments (e.g., methods and computer readable media relating to the foregoing embodiments) are also described below.
Some embodiments of the present invention are now described, by way of example only, and with reference to the accompanying figures. The same reference number represents the same element or the same type of element on all figures.
The figures and the following description illustrate specific exemplary embodiments of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within the scope of the invention. Furthermore, any examples described herein are intended to aid in understanding the principles of the invention, and are to be construed as being without limitation to such specifically recited examples and conditions. As a result, the invention is not limited to the specific embodiments or examples described below, but by the claims and their equivalents.
Caching system 100 provides a benefit over prior systems because it utilizes linked lists to direct the order of flushing operations at a cache. This provides two substantial advantages. First, a linked list can be used to flush cache lines of data to a storage device in either ascending or descending block address order, which ensures that the storage device can quickly write I/O from the cache, particularly when the storage device utilizes a spinning disk recording medium. Second, a linked list can use substantially less memory overhead (e.g., Double Data Rate (DDR) Random Access Memory (RAM) overhead) than an Adelson-Velsky and Landis (AVL) tree, a Red-Black (RB) tree, or similar binary tree structures. For example, a tree structure may require three four-byte pointers per entry, while the linked lists described herein may use one pointer per entry. In embodiments where a cache is divided into millions of cache lines, this reduced overhead can provide substantial space savings for the memory implementing the cache (e.g., DDR RAM).
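The overhead comparison above can be illustrated with a line of arithmetic (an illustrative sketch only; the four-byte pointer size is taken from the example above, and the sixteen-million-line cache size is an assumption borrowed from a later example):

```python
# Illustrative sizing only: per-entry pointer overhead of a binary tree
# (e.g., AVL or RB, with roughly three pointers per entry) versus a
# singly linked list (one next pointer per entry).
POINTER_BYTES = 4
NUM_CACHE_LINES = 16 * 1024 * 1024   # hypothetical sixteen-million-line cache

tree_overhead = 3 * POINTER_BYTES * NUM_CACHE_LINES   # three pointers per entry
list_overhead = 1 * POINTER_BYTES * NUM_CACHE_LINES   # one pointer per entry

print(tree_overhead // 2**20, "MB vs", list_overhead // 2**20, "MB")  # 192 MB vs 64 MB
```

At this scale the linked lists save on the order of a hundred megabytes of DDR RAM relative to a tree structure.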
According to
I/O processor 116 can be implemented as custom circuitry, a processor executing programmed instructions stored in program memory, or some combination thereof. Memory 118 comprises a storage medium for retaining data to be flushed to storage device 120. Memory 118 can benefit from properties such as increased bandwidth and reduced latency. For example, in one embodiment memory 118 comprises a solid-state flash memory, while in another embodiment memory 118 comprises a Non-Volatile Random Access Memory (NVRAM) that is backed up by an internal battery. Implementing memory 118 as a non-volatile storage medium provides enhanced data integrity.
In this embodiment, storage device 120 implements the persistent storage capacity of storage system 100 and is capable of storing data in a computer readable format. For example, storage device 120 can comprise a magnetic hard disk, a solid state drive, an optical medium, etc. The various components of
Storage devices 252, 254, and 256 implement storage space for the logical RAID volume 250. As discussed herein, a logical volume comprises an allocation of storage space and data that is available to operating environment 200. A logical volume can be implemented on any number of storage devices as a matter of design choice. Furthermore, the storage devices need not be dedicated to only one logical volume, but can also store data for a number of other logical volumes. Implementing a logical volume as a RAID volume enhances the performance and/or reliability of stored data.
The particular arrangement, number, and configuration of components described herein is exemplary and non-limiting. Additional caching systems and techniques are described in detail at U.S. patent application Ser. No. 14/337,409, titled “SELECTIVE MIRRORING IN CACHES FOR LOGICAL VOLUMES,” filed on Jul. 22, 2014, which is herein incorporated by reference.
Further details of the operation of caching device 110 are described in detail with regard to
In step 304, I/O processor 116 determines a remainder number, by dividing the block address by the number of cache lines in the cache at memory 118. For example, a modulo operation can be performed to determine the remainder. This remainder is used to determine which cache line will store the data for the block address.
In step 306, I/O processor 116 selects a cache line in memory 118 for storing the I/O, based on the remainder determined in step 304. In this embodiment, the cache lines are numbered in the cache in sequence, and step 306 comprises selecting a corresponding cache line with the number that equals the remainder. This means that each of the cache lines is reserved for storing a set of block addresses that have a common remainder when divided by the number of cache lines. In one embodiment, if the corresponding cache line is dirty and already occupied with data waiting to be flushed to storage device 120, then I/O processor 116 reviews a threshold number of cache lines that follow the cache line, and inserts the I/O into the first empty cache line that it finds. For example, I/O processor 116 can review the fifteen cache lines that follow the corresponding cache line, and select the first empty cache line that is found.
After a cache line has been selected, I/O processor 116 stores the I/O at the selected cache line. When the I/O is large enough to occupy multiple cache lines, this can further comprise storing the I/O at the selected cache line as well as cache lines that immediately follow the selected cache line.
In step 308, I/O processor 116 determines a quotient by dividing the block address by the number of cache lines. The quotient is the integer result of the division. Step 308 does not necessarily require dividing the block address by the number of cache lines again, and may be determined when division is first performed in step 304. In step 310, I/O processor 116 associates the quotient with the selected cache line. In one embodiment, this comprises storing the quotient in a table/array that tracks the status of each cache line.
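Steps 304-310 can be sketched as follows (a simplified illustration with a toy cache size; the helper name select_and_store is hypothetical, and wrap-around probing is an assumption, as the text only states that the lines following the corresponding line are reviewed):

```python
# Minimal sketch of steps 304-310: compute the remainder and quotient of
# the block address, select a cache line by remainder (probing past dirty
# lines up to a threshold), and record the quotient for the chosen line.
NUM_CACHE_LINES = 16        # toy value; the embodiments describe millions
PROBE_THRESHOLD = 15        # number of following lines reviewed when occupied

dirty = [False] * NUM_CACHE_LINES        # per-line dirty flags
quotient_of = [None] * NUM_CACHE_LINES   # table tracking each line's quotient

def select_and_store(block_address):
    remainder = block_address % NUM_CACHE_LINES    # step 304
    quotient = block_address // NUM_CACHE_LINES    # step 308
    # Step 306: take the corresponding line, or probe the lines that
    # follow it for the first empty (non-dirty) line within the threshold.
    for offset in range(PROBE_THRESHOLD + 1):
        candidate = (remainder + offset) % NUM_CACHE_LINES
        if not dirty[candidate]:
            dirty[candidate] = True
            quotient_of[candidate] = quotient      # step 310
            return candidate
    return None    # no empty line found; a flush would be needed first

print(select_and_store(37))   # 5  (37 % 16 == 5; quotient 37 // 16 == 2 is recorded)
```

A second I/O with the same remainder (e.g., block address 53) would find line 5 dirty and fall through to the next empty line.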
Steps 302-310 repeat each time new I/O is received for caching in memory 118. In this manner, the cache lines of memory 118 fill with data for flushing to storage device 120. Steps 312-314 illustrate how a linked list can be used to flush data from the cache lines to storage device 120, and can be performed substantially simultaneously and asynchronously with steps 302-310. Steps 312-314 utilize one or more linked lists that each correspond with a different quotient. That is, each linked list includes a set of entries that each correspond with a single cache line, and all of the entries of a given linked list point to cache lines associated with the same quotient. The entries of each linked list are sorted in remainder order, meaning that they are also sorted in block address order. When the linked lists are constructed in this manner, I/O processor 116 can quickly flush I/O in block address order by traversing the linked lists in quotient order.
In step 312, I/O processor 116 populates a linked list by inserting entries into the linked list that each point to a different cache line associated with the same quotient. In this embodiment, as described above, there are multiple linked lists (e.g., stored in memory 118) that each correspond with a different quotient. The linked lists can be populated by reviewing each cache line in the cache. For example, in one embodiment, I/O processor 116 reviews the dirty cache lines in sequence. For each cache line, I/O processor 116 determines the quotient for the cache line, and adds an entry for the cache line to the tail of the linked list corresponding to that quotient. I/O processor 116 can further link the tail entry of a linked list for a quotient to the head entry of a linked list for a next quotient. In this manner the linked lists form a continuous chain of entries in block address order for flushing data to storage device 120. This results from the cache lines storing I/O in remainder order, while being distributed across the linked lists in quotient order. In short, for a given linked list, the entries each point to a different cache line but are associated with the same quotient.
In step 314, I/O processor 116 flushes the cache lines to storage device 120 in block address order, by traversing the entries of the linked list. In embodiments where there is a linked list for each quotient, this result can be achieved by traversing the multiple linked lists in quotient order (e.g., ascending or descending). Flushing the cache lines in the order defined by the linked lists ensures that writes are applied to storage device 120 in block address order, which provides a performance benefit for storage devices that utilize spinning disks (such as magnetic hard disks).
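Steps 312 and 314 can be sketched in simplified form (an illustration only: Python dictionaries stand in for the embodiment's array of list pointers, Entry is a hypothetical name, and each line is assumed to hold its "home" remainder so the block address can be reconstructed):

```python
# Sketch of steps 312-314: dirty cache lines are reviewed in sequence and
# appended to the tail of the linked list for their quotient; the lists
# are then traversed in ascending quotient order to flush in block
# address order.
class Entry:
    def __init__(self, line):
        self.line = line    # points to a cache line
        self.next = None    # next entry in the linked list

def populate(dirty_lines, quotient_of):
    heads, tails = {}, {}
    for line in sorted(dirty_lines):      # review cache lines in sequence
        entry = Entry(line)
        q = quotient_of[line]
        if q in tails:
            tails[q].next = entry         # add the entry at the tail
        else:
            heads[q] = entry              # first entry becomes the head
        tails[q] = entry
    return heads

def flush_addresses(heads, num_cache_lines):
    order = []
    for q in sorted(heads):               # traverse in quotient order
        entry = heads[q]
        while entry is not None:
            # Reconstructed address: quotient * number of lines + remainder.
            order.append(q * num_cache_lines + entry.line)
            entry = entry.next
    return order

quotient_of = {1: 0, 5: 0, 2: 1, 6: 1}    # dirty lines and their quotients
print(flush_addresses(populate([5, 1, 6, 2], quotient_of), 8))  # [1, 5, 10, 14]
```

Because the lines within each list arrive in remainder order and the lists are visited in quotient order, the resulting addresses are strictly ascending.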
Even though the steps of method 300 are described with reference to caching system 100 of
In the following examples, additional processes, systems, and methods are described in the context of a cache for a SAS storage controller. In this example, the storage controller receives host write requests that are directed to LBAs of a logical volume, and translates the write requests into SAS I/O operations directed to specific block addresses of individual storage devices. The storage controller utilizes a cache to store I/O for the write requests, and operates the cache in a write back mode to report successful completion of write requests to the host before those write requests are flushed to persistent storage. The cache itself comprises sixteen million cache lines, and each cache line is capable of storing a 64 Kilobyte (KB) block of data. The logical volume that the cache stores data for is a one terabyte logical volume. In this example, multiple caches are kept on a cache memory device, one for each logical volume. However, further discussion of this example is limited to the single cache for the single volume described above. Similar operations to those described in this example can be performed for each of the caches on the storage controller.
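The sizing in this example can be checked with simple arithmetic (illustrative only, reading "sixteen million" and the byte quantities in binary units):

```python
# Illustrative check: sixteen million 64 KB cache lines together cover
# the one-terabyte logical volume described in the example.
cache_lines = 16 * 1024 * 1024     # "sixteen million" as 2**24
line_size = 64 * 1024              # 64 Kilobytes per cache line
print(cache_lines * line_size == 2**40)   # True: one terabyte (binary)
```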
In this embodiment, each entry in a linked list includes the quotient that the entry is associated with, a pointer to a cache line, and a next pointer directed to a next entry in the linked list. When flushing cache lines to storage device 120, an I/O processor starts with the list pointer for Q0. If the list pointer is null, the I/O processor reviews the next list pointer (for Q1). Alternatively, if the list pointer for Q0 is not null, the I/O processor follows the list pointer to an entry in a linked list. The I/O processor flushes the cache line that the linked list entry points to, marks the cache line as “clean” (instead of dirty) and follows the next pointer of the linked list entry to visit the next entry of the linked list. The linked list entry for the flushed cache line is also removed. The I/O processor continues in this manner flushing cache lines and following next pointers. Since the next pointer for a tail entry of a linked list points to the head entry of the linked list for a next quotient, the I/O processor continues flushing cache entries (for potentially multiple quotients) until it finds a linked list entry with a null next pointer. At that point in time, the I/O processor determines the quotient of the current entry, and follows the list pointer for the next quotient in order to find the next linked list (or set of linked lists). Once the linked lists have been traversed and the cache lines flushed, their entries have been removed by I/O processor 116. The linked lists can therefore now be repopulated based on the current composition of the cache lines.
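The traversal described above can be sketched as follows (a hedged illustration: dictionaries stand in for linked-list entries, each carrying its quotient, a cache-line pointer, and a next pointer, and list_pointers[q] addresses an entry for quotient q or is null):

```python
# Sketch of the flush traversal: follow each non-null list pointer, flush
# along next pointers (chains may span several quotients), and on a null
# next pointer resume at the list pointer for the next quotient.
def flush_all(list_pointers, flush_line):
    q = 0
    while q < len(list_pointers):
        entry = list_pointers[q]
        if entry is None:
            q += 1                        # null list pointer: review the next one
            continue
        while True:
            flush_line(entry["line"])     # flush, then mark the line clean
            if entry["next"] is None:
                # Null next pointer ends the chain; continue from the list
                # pointer for the quotient after the current entry's.
                q = entry["quotient"] + 1
                break
            entry = entry["next"]

# Toy chain: the tail entry for quotient 0 links to the head for quotient 1,
# so the list pointer for quotient 1 is skipped during the walk.
e_b = {"line": 2, "quotient": 1, "next": None}
e_a = {"line": 1, "quotient": 0, "next": e_b}
e_c = {"line": 3, "quotient": 3, "next": None}
flushed = []
flush_all([e_a, e_b, None, e_c, None], flushed.append)
print(flushed)   # [1, 2, 3]
```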
Next, in step 706 the I/O processor creates a new entry for the cache line, and sets the next pointer of the new entry. If the linked list for Q+1 is empty (e.g., as indicated by the list pointer in the array entry for Q+2), then the next pointer for the new entry is set to null. Otherwise, the next pointer for the new entry is set to the head entry of the linked list for Q+1.
Next, if the list pointer in the array entry for Q+1 is not null in step 708, then a linked list already exists for Q. Thus, in step 716, the I/O processor follows the list pointer in the array entry for Q+1, which points to the tail entry of the linked list for Q, and changes the next pointer of that tail entry to point to the new entry. This makes the new entry the tail entry of the linked list for Q. In step 718, the I/O processor updates the list pointer in the array entry for Q+1 to point to the newly created entry.
If the list pointer in the array entry for Q+1 is currently null in step 708, then there is no linked list for Q, meaning that the cache line is the first detected cache line that is associated with Q. Thus, the I/O processor follows the list pointer in the array entry for Q in step 710, and determines whether that list pointer is null in step 712. If the list pointer for Q is null, then the previous linked list (the linked list for Q−1) is also empty. Thus, in step 714, the I/O processor updates the list pointer in the array entry for Q+1 to point to the new entry.
If in step 712 the list pointer in the array entry for Q is not null, then the previous linked list (the linked list for Q−1) already exists. Thus, in step 720, the I/O processor follows the list pointer in the array entry for Q to the tail entry of the previous linked list, and adjusts the next pointer of that tail entry to point to the new entry. This effectively links the tail entry of the linked list for Q−1 to the new entry, which operates as the head of the linked list for Q. At step 722, the I/O processor further updates the list pointer in the array entry for Q+1 to point to the new entry.
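The insertion logic of steps 706-722 can be sketched in simplified form (a hedged illustration only: Python dictionaries heads and tails stand in for the embodiment's offset array of list pointers, and insert is a hypothetical name; the general behavior of linking each list's tail to the next list's head is preserved):

```python
# Simplified sketch of inserting a new linked-list entry for a cache line
# with quotient q, keeping the lists chained in quotient order.
def insert(heads, tails, q, line):
    # Step 706: create the entry and set its next pointer to the head of
    # the next non-empty linked list (null if none exists).
    entry = {"line": line, "quotient": q, "next": None}
    nq = min((i for i in heads if i > q), default=None)
    entry["next"] = heads[nq] if nq is not None else None
    if q in tails:
        # Steps 716-718: a list already exists for Q; append at its tail.
        tails[q]["next"] = entry
    else:
        # Steps 710-722: first entry for Q; it becomes the head, and the
        # tail of the previous non-empty list (if any) is linked to it.
        heads[q] = entry
        pq = max((i for i in tails if i < q), default=None)
        if pq is not None:
            tails[pq]["next"] = entry
    tails[q] = entry    # the new entry is now the tail for Q

# Reviewing dirty lines in sequence as (line, quotient) pairs keeps each
# list in remainder order.
heads, tails = {}, {}
for line, q in [(0, 1), (1, 0), (2, 1), (3, 2)]:
    insert(heads, tails, q, line)

# Walking the chain from the lowest quotient's head visits every entry.
entry, chain = heads[min(heads)], []
while entry is not None:
    chain.append(entry["line"])
    entry = entry["next"]
print(chain)   # [1, 0, 2, 3]
```

With four cache lines, the visited lines correspond to block addresses 1, 4, 6, and 11, i.e., ascending block address order.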
Embodiments disclosed herein can take the form of software, hardware, firmware, or various combinations thereof. In one particular embodiment, software is used to direct a processing system of a caching device to perform the various operations disclosed herein.
Computer readable storage medium 812 can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor device. Examples of computer readable storage medium 812 include a solid state memory, a magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk, and an optical disk. Current examples of optical disks include compact disk-read only memory (CD-ROM), compact disk-read/write (CD-R/W), and DVD.
Processing system 800, being used for storing and/or executing the program code, includes at least one processor 802 coupled to program and data memory 804 through a system bus 850. Program and data memory 804 can include local memory employed during actual execution of the program code, bulk storage, and cache memories that provide temporary storage of at least some program code and/or data in order to reduce the number of times the code and/or data are retrieved from bulk storage during execution.
Input/output or I/O devices 806 (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers. Network adapter interfaces 808 can also be integrated with the system to enable processing system 800 to become coupled to other data processing systems or storage devices through intervening private or public networks. Modems, cable modems, IBM Channel attachments, SCSI, Fibre Channel, and Ethernet cards are just a few of the currently available types of network or host interface adapters. Display device interface 810 can be integrated with the system to interface to one or more display devices, such as printing systems and screens, for presentation of data generated by processor 802.