Solid state drives (SSDs) are non-volatile data storage devices that are used for persistent data storage but, unlike hard disk drives, contain no moving parts. Some SSDs use flash memory, which can retain data without being powered. One drawback of flash memory is that each memory cell of a flash-based SSD can be written only a limited number of times before the memory cell fails. To extend the life of flash-based SSDs, various techniques are employed, such as wear-leveling, which spreads write operations more evenly across the memory cells of the drive.
Certain exemplary embodiments are described in the following detailed description and in reference to the drawings, in which:
One characteristic of flash memory is that flash memory cells cannot be directly overwritten. Thus, when data is written to an SSD memory cell, the cell must first be erased and then written. In some cases, this may result in two writes for each actual bit of data to be stored to the device. In most flash memory, data is written in units called pages, but data is erased in larger units called blocks. If enough of the data within a block is unneeded (i.e., the block contains enough stale pages), any good data in the block is re-written to a new block and the entire original block is then erased. The remainder of the new block that is left over can be written with new data. This process of erasing blocks and moving good data to new blocks is referred to as “garbage collection.” Most SSDs include some amount of storage space that is reserved for garbage collection, wear-leveling, and remapping bad blocks, among other things. The difference between the physical amount of storage capacity and the logical capacity presented to the user is referred to as over-provisioning.
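The page and block behavior described above can be illustrated with a minimal sketch. The block size, the `STALE` marker, and the `garbage_collect` helper are hypothetical names chosen for illustration and are not taken from any drive firmware; real blocks hold far more pages.

```python
PAGES_PER_BLOCK = 4   # hypothetical; chosen small for readability
STALE = None          # marker for a page whose data is no longer needed


def garbage_collect(victim, new_block):
    """Copy the good pages of `victim` into `new_block`, then erase `victim`.

    Blocks are modeled as lists of pages. Returns the number of pages that
    had to be physically re-written, which is extra write traffic that the
    host never requested.
    """
    good_pages = [page for page in victim if page is not STALE]
    if len(new_block) + len(good_pages) > PAGES_PER_BLOCK:
        raise ValueError("new block has too few free pages")
    new_block.extend(good_pages)   # re-write good data to the new block
    victim.clear()                 # erase happens at block granularity
    return len(good_pages)


victim = ["a", STALE, STALE, "d"]   # two stale pages, two good pages
fresh = []                          # an erased block
moved = garbage_collect(victim, fresh)
free_pages = PAGES_PER_BLOCK - len(fresh)
print(moved, fresh, free_pages)
```

In this sketch the two good pages are moved, the victim block is erased as a whole, and the two pages left over in the new block remain available for new data.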
Techniques such as wear-leveling and garbage collection affect the drive's write amplification, which is a phenomenon in which the actual amount of physical information written within the drive is a multiple of the logical amount of data intended to be written. The higher the write amplification of a drive, the more writes the cells of the drive will experience for a given amount of data storage usage.
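Write amplification is commonly expressed as the ratio of data physically written to the flash to data logically written by the host. The following sketch computes that ratio; the function name and the example figures are illustrative assumptions, not measurements.

```python
def write_amplification(physical_written, logical_written):
    """Ratio of physical writes inside the drive to logical host writes."""
    if logical_written <= 0:
        raise ValueError("logical_written must be positive")
    return physical_written / logical_written


# Hypothetical figures: the host writes 100 units of data, and the drive
# ends up writing 250 units internally once relocated pages are counted.
wa = write_amplification(physical_written=250, logical_written=100)
print(wa)
```

A value of 1.0 would mean that the drive writes exactly what the host asked for; larger values mean the cells wear faster for the same host workload.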
The present disclosure provides techniques for reducing the write-amplification of a flash drive by increasing the amount of storage space on the drive available for over-provisioning. Providing more storage space for over-provisioning improves the efficiencies of the wear-leveling and garbage collection algorithms, which have a direct effect on write amplification.
Many storage systems utilize an array of drives and provide fault tolerance by storing data with redundancy. The failure of a drive can cause a controller to identify the drive as failed and initiate a spare rebuild process that regenerates the data of the failed drive from the other drives. Meanwhile, the bad drive can be replaced by the customer. In such a system, a certain amount of the system's storage space is reserved for the spare rebuild process. For example, some systems may include one or more whole drives that are reserved as spare drives to be used in the event of a drive failure. In some systems, the storage resources used for spare rebuild are distributed across several drives. In such a storage system, multiple drives of the storage system can include a certain amount of storage space that is reserved as spare storage, while most of the remaining storage space is free space, which is used for storing host data.
As mentioned above, the write amplification of a storage drive can be reduced by providing more storage space for over-provisioning. The techniques disclosed herein provide more storage space for over-provisioning by enabling a drive to use the storage space designated as “spare” storage space for over-provisioning when not in use for a spare rebuild operation. Providing more storage for over-provisioning reduces the write amplification of the drive, which reduces the overall number of writes the memory cells of the drive experience and thereby extends the useful life of the drive.
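The effect of lending the spare region to over-provisioning can be sketched with simple arithmetic. The capacities below are made-up numbers, and the ratio convention (over-provisioning space divided by the capacity left for data) is an assumption about how one might quantify it, not a definition given in this disclosure.

```python
def over_provisioning(physical, hidden, spare, spare_lent):
    """Return (op_space, op_ratio) for a drive.

    physical   -- total physical capacity of the drive
    hidden     -- space reserved internally by the drive for over-provisioning
    spare      -- visible space designated as spare storage
    spare_lent -- True while the spare space is used for over-provisioning
    """
    if hidden + spare >= physical:
        raise ValueError("reserved space must be smaller than the drive")
    op_space = hidden + (spare if spare_lent else 0)
    data_capacity = physical - op_space
    return op_space, op_space / data_capacity


# Hypothetical drive: 1024 units physical, 64 hidden, 96 designated as spare.
normal = over_provisioning(1024, 64, 96, spare_lent=True)
rebuild = over_provisioning(1024, 64, 96, spare_lent=False)
print(normal, rebuild)
```

Under these assumed numbers the space available for over-provisioning grows from 64 to 160 units whenever no spare rebuild is in progress.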
The storage system 100 provides data storage resources to any number of client computers 102, which may be general purpose computers, workstations, mobile computing devices, and the like. The storage system 100 includes storage controllers, referred to herein as nodes 104. The storage system 100 also includes storage arrays 106, which are controlled by the nodes 104. The client computers 102 can be coupled to the storage system 100 directly or through a network 108, which may be a local area network (LAN), wide area network (WAN), a storage area network (SAN), or other suitable type of network.
The client computers 102 can access the storage space of the storage arrays 106 by sending Input/Output (I/O) requests, including write requests and read requests, to the nodes 104. The nodes 104 process the I/O requests so that user data is written to or read from the appropriate storage locations in the storage arrays 106. As used herein, the term “user data” refers to data that a person might use in the course of business, performing a job function, or for personal use, such as business data and reports, Web pages, user files, image files, video files, audio files, software applications, or any other similar type of data that a user may wish to save to long term storage. Each of the nodes 104 can be communicatively coupled to each of the storage arrays 106. Each node 104 can also be communicatively coupled to each other node by an inter-node communication network 110.
The storage arrays 106 may include various types of persistent storage, including solid state drives 112, which may be referred to herein simply as drives 112. In some examples, the drives 112 are flash drives. However, the drives 112 may also use other types of persistent memory, including resistive memory, for example. Each storage array 106 includes multiple drives 112, each of which is configured so that a certain amount of storage space on each drive is designated as spare storage to be used for spare rebuild operations. The amount of storage space designated as spare storage may be a parameter set by the nodes 104. The storage network system 100 may also include additional storage devices in addition to what is shown in
Each node 104 can include a spare rebuild engine 114 that performs spare rebuild operations. The spare rebuild engine 114 can be implemented in hardware or a combination of hardware and programming code. For example, the spare rebuild engine 114 can include a non-transitory, computer-readable medium for storing instructions, one or more processors for executing the instructions, or a combination thereof. In some examples, the spare rebuild engine 114 is implemented as computer-readable instructions stored on an integrated circuit such as an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), or other type of processor.
The spare rebuild engine 114 is configured to rebuild drive data. If the node 104 detects a failure condition, the node 104 can trigger the spare rebuild engine 114 to conduct a spare rebuild operation. During the spare rebuild operation, the data of the failed drive is re-created on the storage space designated as spare storage space. The spare rebuild engine 114 can use any suitable technique for rebuilding the data of the failed drive on the spare storage space.
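The description leaves the rebuild technique open. As one illustration only, the sketch below assumes single-parity XOR redundancy, a common scheme that is not stated to be the one used by the spare rebuild engine 114. The helper name and the sample data are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data drives and one parity drive (assumed layout).
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# The drive holding data[1] fails; its contents are regenerated from the
# surviving drives and would then be written to the spare storage space.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

With XOR parity, any single missing member of the set can be regenerated from the remaining members, which is why the rebuilt bytes match the lost ones.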
Each node 104 can also include a memory allocation controller 116 that controls the allocation of storage space in the storage arrays 106 between spare storage and over-provisioning storage. The memory allocation controller 116 can be implemented as part of the spare rebuild engine 114 or as a separate component. Each node 104 controls the memory allocation for a certain sub-set of the drives 112. In some examples, the storage system 100 can be configured so that each node 104 may control all the drives of a specific storage array 106. For example, node A can be configured to control the drives 112 in storage array A, node B can be configured to control the drives 112 in storage array B, and node C can be configured to control the drives 112 in storage array C. Other arrangements are also possible depending on the design considerations of a particular implementation. Additionally, certain details of the storage system configuration can be specified by an administrator, including the amount of storage space used for spare storage and which nodes 104 control which drives 112, for example.
As mentioned above, several or all of the drives 112 have a certain amount of storage space that is designated as spare storage space. During normal operation of a storage array 106 the memory allocation controller 116 of the corresponding node 104 can instruct each drive 112 under its control to use the spare storage for over-provisioning. If a spare rebuild operation is initiated, the memory allocation controller 116 instructs those drives 112 involved in the spare rebuild operation to stop using the spare storage space for over-provisioning, and the space becomes available for the spare rebuild operation. When the spare rebuild operation is complete and the spare storage space is no longer being used, the memory allocation controller 116 instructs the drives 112 to use the spare storage space for over-provisioning again.
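The controller behavior described above can be sketched as follows. The class, method, and drive names are hypothetical, and the boolean flag stands in for whatever instruction a real drive would accept.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass
class Drive:
    name: str
    spare_used_for_op: bool = False   # set by the controller, not the drive


@dataclass
class MemoryAllocationController:
    drives: Dict[str, Drive] = field(default_factory=dict)

    def add_drive(self, drive: Drive) -> None:
        """Normal operation: instruct the drive to over-provision with spare."""
        self.drives[drive.name] = drive
        drive.spare_used_for_op = True

    def begin_rebuild(self, involved: Iterable[str]) -> None:
        """Only the drives taking part in the rebuild give the space back."""
        for name in involved:
            self.drives[name].spare_used_for_op = False

    def end_rebuild(self, involved: Iterable[str]) -> None:
        """Spare space is no longer needed; lend it to over-provisioning again."""
        for name in involved:
            self.drives[name].spare_used_for_op = True


ctrl = MemoryAllocationController()
for n in ("d0", "d1", "d2"):
    ctrl.add_drive(Drive(n))

ctrl.begin_rebuild(["d0", "d1"])
during = {n: d.spare_used_for_op for n, d in ctrl.drives.items()}
ctrl.end_rebuild(["d0", "d1"])
after = {n: d.spare_used_for_op for n, d in ctrl.drives.items()}
print(during, after)
```

Drives that are not involved in the rebuild (here `d2`) keep using their spare space for over-provisioning throughout.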
Additionally, some portion of the storage space is mapped for internal use by the storage system 100. Examples of such use include, but are not limited to, drive identification labels, storage system table of content identifiers, and diagnostic and test areas.
Some portion of the storage space is mapped as an internal over-provisioning region 210. The internal over-provisioning region 210 is reserved by the drive itself for over-provisioning processes such as garbage collection, wear-leveling, bad block remapping, and the like. The over-provisioning processes are performed by the drive itself. For example, a processor of the drive can run firmware programmed for the over-provisioning processes. In some examples, the over-provisioning region 210 is not visible to or accessible by an external device storage controller, such as the nodes 104 of
As shown in the memory map 202, some portion of the drive's visible storage space is designated as a spare storage space region 214. The spare storage space region 214 is accessible to the nodes 104 and is reserved by and controlled by the nodes 104. The configuration of the spare storage space region 214 can be determined by an administrator of the storage system 100. For example, the administrator can specify how much spare storage space is set aside by each drive. This same region is shown in memory map 200 as a free space 212. At the time that the drive is brought on-line, the node in control of the drive can instruct the drive to use the spare storage space region 214 as free space 212 to be used for over-provisioning. The drive firmware is configured to be able to receive an instruction to use normally visible storage space for over-provisioning. Thus, the storage space used for over-provisioning is not fixed by the manufacturer of the drive. If a spare rebuild operation is initiated, the node in control of the drive can reclaim the free space 212 by instructing the drive to stop using the spare storage space region 214 for over-provisioning.
At block 302, a visible region of storage space on the drive is designated as spare storage space. This operation can be performed in accordance with input received from a system administrator.
At block 304, the drive is instructed to use the spare storage space for over-provisioning operations. As explained above, over-provisioning operations are operations that reduce write amplification in a drive and extend the useful life of the drive. Such over-provisioning operations include wear-leveling and garbage collection.
At block 306, data storage requests are sent to the drive. The data storage requests include write requests and read requests that are received during the regular operation of the storage system. For example, the data storage requests may be I/O requests received from a client computer requesting user data to be stored to a region of the drive reserved for user data. During the regular operation of the storage system, the drive will perform over-provisioning operations using both the spare storage space and a fixed internal storage space that is reserved by the drive for over-provisioning operations. In some examples, the storage space that is reserved by the drive for over-provisioning operations is a hidden storage space that is not accessible to the external storage controller.
At block 308, some event or action occurs that triggers the use of the spare storage space. For example, a failure of a drive may be detected or an administrator may perform an action that triggers the need for spare storage space.
At block 310, the solid state drive is instructed to stop using all or a portion of the spare storage space for over-provisioning operations. The drive will then stop using the spare storage space for over-provisioning operations and move any needed data from the spare storage space to another region of the drive, such as the hidden storage space. In some examples, after the drive has released the spare storage space and moved any needed data, the drive may send an acknowledgment to the controller indicating that the spare storage space is available.
At block 312, the spare storage space is used to perform a spare rebuild. In some examples, the controller may wait for the drive to acknowledge that the spare storage space is available.
At block 314, the spare rebuild is finished, and any data that was stored to the spare storage space has been erased or is no longer needed. At this time, the drive is instructed to resume using the spare storage space for over-provisioning operations.
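The sequence of blocks 302 through 314 can be summarized in a minimal sketch from the controller's point of view. The `SimulatedDrive` class, its method names, and the boolean acknowledgment are hypothetical stand-ins; the description does not specify a command set, and data relocation inside the drive is not modeled.

```python
from enum import Enum


class SpareUse(Enum):
    OVER_PROVISIONING = "over-provisioning"
    SPARE_REBUILD = "spare rebuild"


class SimulatedDrive:
    """Stand-in for a drive whose firmware accepts the two instructions."""

    def __init__(self):
        self.spare_use = None

    def use_spare_for_op(self):
        self.spare_use = SpareUse.OVER_PROVISIONING

    def release_spare(self):
        # A real drive would first move any needed data out of the region.
        self.spare_use = SpareUse.SPARE_REBUILD
        return True   # acknowledgment that the spare space is available


def run_method(drive, rebuild):
    steps = ["302 designate visible region as spare storage space"]
    drive.use_spare_for_op()
    steps.append("304 instruct drive to use spare space for over-provisioning")
    steps.append("306 send data storage requests to the drive")
    steps.append("308 event triggers need for spare storage space")
    acknowledged = drive.release_spare()
    steps.append("310 instruct drive to stop over-provisioning with spare space")
    if not acknowledged:
        raise RuntimeError("drive did not release the spare storage space")
    rebuild()
    steps.append("312 perform spare rebuild on the spare storage space")
    drive.use_spare_for_op()
    steps.append("314 instruct drive to resume over-provisioning")
    return steps


rebuild_calls = []
drive = SimulatedDrive()
steps = run_method(drive, rebuild=lambda: rebuild_calls.append("rebuilt"))
print("\n".join(steps))
```

Waiting for the acknowledgment before the rebuild mirrors the optional behavior described for blocks 310 and 312.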
The various software components discussed herein may be stored on the computer-readable medium 400. A region 406 on the computer-readable medium 400 can include an I/O processing engine that processes I/O requests received from a client computer. For example, processing I/O requests can include storing data to a storage drive or retrieving data from a storage drive and sending it to a client computer that requested it. A region 408 can include a spare rebuild engine to rebuild the data of a failed disk on spare storage space of one or more drives. A region 410 can include a memory allocation controller configured to designate a visible region of storage space on a drive as spare storage space. The memory allocation controller can also instruct the drive to use the spare storage space for the over-provisioning operations when not being used for a spare rebuild operation. Although shown as contiguous blocks, the software components can be stored in any order or configuration. For example, if the tangible, non-transitory, computer-readable medium is a hard drive, the software components can be stored in non-contiguous, or even overlapping, sectors.
While the present techniques may be susceptible to various modifications and alternative forms, the examples discussed above have been shown only by way of example. It is to be understood that the techniques are not intended to be limited to the particular examples disclosed herein. Indeed, the present techniques include all alternatives, modifications, and equivalents falling within the true spirit and scope of the appended claims.
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/US2014/031327 | 3/20/2014 | WO | 00 |