Embodiments described herein generally relate to data redundancy, and more specifically, relate to data redundancy in a solid-state drive.
Various techniques may be used to provide data redundancy. For example, a redundant array of independent disks (RAID) implementation may combine multiple disk drives (e.g., physical storage devices such as Hard Disk Drives (HDDs) and Solid State Drives (SSDs)) into a single logical unit to provide data redundancy and improved performance. The HDDs or SSDs may be considered to be part of a single RAID array. The data of the logical unit may be stored or distributed across each of the disks of the RAID array. To provide redundancy for this data stored in the RAID array, parity data stored in the disks may be used by the RAID array. For example, if one of the disks in the RAID array fails, then the remaining data on the other disks may be combined with the parity data to reconstruct the missing data from the failed disk. A Boolean Exclusive OR (XOR) operation may be performed between the data and parity data on the remaining disks to reconstruct the missing data.
The disclosure will be understood more fully from the detailed description given below and from the accompanying drawings of various embodiments of the disclosure. The drawings, however, should not be taken to limit the disclosure to the specific embodiments, but are for explanation and understanding only.
Aspects of the present disclosure are directed to a die redundancy implementation based on logical units in a solid-state drive. In general, a die redundancy may be implemented within a single solid-state drive based on logical units. The solid-state drive may include a controller and non-volatile memory.
Data redundancy may be provided by the use of parity data and a Boolean exclusive-or (XOR) operation. For example, in a RAID array that is based on multiple distinct disk drives, data may be divided into various blocks that are distributed across the disk drives in the RAID array. The data blocks divided between the disk drives may be referred to as a stripe and each data block of the stripe may be referred to as a strip. As an example, a body of user data (i.e., logically sequential data such as a file corresponding to a stripe) may be distributed as a first user data (i.e., a first strip) that may be stored at a first disk drive of the RAID array, second user data (i.e., a second strip) may be stored at a second disk drive, and third user data (i.e., a third strip) may be stored at a third disk drive. The combination of the first user data, second user data, third user data, and parity data may correspond to a stripe. The parity data for the first, second, and third user data may be generated by performing the XOR operation on the first, second, and third user data. This parity data may be stored at another disk drive of the RAID array. If any of the disk drives that are storing the first, second, or third user data fails and the respective user data is lost as a result of the disk drive failing, then the lost user data may be reconstructed by using the parity data. For example, if the third disk drive fails and the third user data is not recoverable from the third disk drive, then the third user data may be reconstructed by performing an XOR operation between the first user data stored at the first disk drive, the second user data stored at the second disk drive, and the parity data stored at another disk drive of the RAID array. Thus, the user data in the RAID array may be recoverable even if one of the disk drives of the RAID array has failed or is non-functional.
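The parity relationship described above can be illustrated with a minimal Python sketch. The helper name `xor_blocks` and the four-byte sample strips are assumptions chosen for illustration and are not part of the disclosure.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Return the byte-wise XOR of equally sized blocks (hypothetical helper)."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three strips of user data (illustrative values only).
first = b"\x01\x02\x03\x04"
second = b"\x10\x20\x30\x40"
third = b"\xaa\xbb\xcc\xdd"

# The parity strip is the XOR of the three user data strips.
parity = xor_blocks(first, second, third)

# If the third strip is lost, XOR-ing the surviving strips with the parity
# strip yields the lost strip again.
rebuilt_third = xor_blocks(first, second, parity)
```

Because XOR is its own inverse, the same function serves both to generate the parity data and to reconstruct any single missing strip.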
In an embodiment, die redundancy is provided within a SSD using distributed parity data and data striping to store parity data and user data across separate logical units of the non-volatile memory in the SSD. A logical unit may be one or more die of the non-volatile memory. For example, the non-volatile memory to store data in the SSD may be divided into four logical units where data in each logical unit is kept separate from other logical units of the non-volatile memory in the SSD so that neither the user data nor parity data may be moved from one logical unit to another logical unit. Each logical unit of non-volatile memory may store user data as well as parity data. The parity data at one of the logical units of non-volatile memory may be used to reconstruct the user data at another logical unit that has failed. For example, a first logical unit may store parity data for other user data that is stored at the second, third, and fourth logical units of the same SSD. If the user data at one of the second, third, or fourth logical units is irretrievable or corrupted (e.g., through a failure or malfunction of a semiconductor die corresponding to a portion of the failed logical unit), then the user data may be reconstructed by using parity data and the other user data that is stored at the still functional logical units of the SSD. Furthermore, the reconstructed user data may be stored at the locations of each functional logical unit that stores parity data. For example, the reconstructed user data may overwrite the parity data at the other functional logical units. After the overwriting of the parity data, the die redundancy implementation to provide the die redundancy for the SSD may be disabled, but the SSD may remain functional despite a significant die failure corresponding to the failed logical unit as the user data has been reconstructed.
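One possible way to distribute parity data across the logical units is to rotate the parity location from stripe to stripe. The following Python sketch assumes a simple modulo rotation and a hypothetical function name `stripe_layout`; the disclosure does not mandate this particular placement.

```python
def stripe_layout(stripe_index: int, num_units: int = 4) -> dict:
    """Pick which logical unit holds parity for a stripe (assumed rotation)."""
    if num_units < 2:
        raise ValueError("at least two logical units are required")
    # Assumption: parity rotates round-robin so that every logical unit
    # stores both user data and parity data.
    parity_unit = stripe_index % num_units
    data_units = [unit for unit in range(num_units) if unit != parity_unit]
    return {"parity_unit": parity_unit, "data_units": data_units}


# With four logical units, stripe 0 keeps its parity on unit 0 and its user
# data on units 1, 2, and 3; stripe 1 keeps its parity on unit 1, and so on.
layouts = [stripe_layout(index) for index in range(4)]
```

With such a rotation, the failure of any single logical unit loses at most one strip per stripe, which is the condition under which a single XOR parity strip suffices for reconstruction.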
As such, die redundancy may be provided for a single SSD in part by parity data stored across logical units of the SSD. The die redundancy may protect against the failure of one of the logical units of the SSD. Thus, the reliability of the user data stored in the non-volatile memory in the SSD may be improved even if the SSD has been manufactured with a higher die failure rate (e.g., the SSD may remain functional when one or more die of the non-volatile memory fails). Furthermore, the controller of the SSD may not include the functionality to perform an XOR operation to provide some redundancy, and the die redundancy described herein may be used in an SSD with a controller that does not include such functionality. Instead, the die redundancy functionality may be provided by software or hardware that is external to the SSD. For example, the SSD with the controller may transmit an interrupt over an I/O interface (e.g., a Peripheral Component Interconnect Express (PCIe) interface) coupling the SSD with a host computer. The interrupt may be based on a Non-Volatile Memory Express (NVMe) specification. The NVM Express specification defines an optimized register interface, command set and feature set for PCI Express (PCIe®)-based Solid-State Drives (SSDs). For example, in an NVMe embodiment, an asynchronous notification interrupt may be sent from the SSD over the PCIe interface to interrupt a driver associated with the SSD in the host computer using a PCIe interrupt or Message Signaled Interrupts (MSI). In some embodiments, in response to receiving the interrupt, the host computer may access data stored in a log page stored in the non-volatile memory of the SSD. For example, the log page may be stored in dies of the non-volatile memory corresponding to each of the logical units.
When a logical unit fails, the log page associated with the remaining functional logical units may be updated to identify that a logical unit has failed and the asynchronous notification interrupt may be issued to the host computer. In response to receiving the asynchronous notification interrupt, the host computer may access a log page stored in one of the logical units to identify whether a logical unit has failed.
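The notification flow described above can be modeled with a small simulation. In the following Python sketch, the classes `SimulatedDrive` and `HostDriver`, the dictionary-based log page, and the callback standing in for the interrupt are all hypothetical; the sketch does not use or represent an actual NVMe driver interface.

```python
class SimulatedDrive:
    """Toy stand-in for the SSD: one log page per logical unit (assumption)."""

    def __init__(self, num_units=4):
        self.log_pages = {unit: {"failed_units": []} for unit in range(num_units)}
        self.interrupt_handler = None  # callback standing in for the interrupt

    def fail_unit(self, unit):
        # The log page on the failed unit is treated as no longer readable.
        self.log_pages.pop(unit)
        # The log pages of the remaining functional units record the failure.
        for page in self.log_pages.values():
            page["failed_units"].append(unit)
        # The asynchronous notification is then issued to the host.
        if self.interrupt_handler is not None:
            self.interrupt_handler(self)


class HostDriver:
    """Toy stand-in for the host-side driver."""

    def __init__(self):
        self.known_failures = []

    def on_async_notification(self, drive):
        # Read the log page of any one functional logical unit.
        any_unit = next(iter(drive.log_pages))
        self.known_failures = list(drive.log_pages[any_unit]["failed_units"])


drive = SimulatedDrive()
host = HostDriver()
drive.interrupt_handler = host.on_async_notification
drive.fail_unit(2)
```

Keeping a copy of the log page in each logical unit means that the host can learn of the failure from whichever logical units remain readable.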
Furthermore, the die redundancy based on logical units as described herein may provide die redundancy even if the controller of the SSD does not include functionality to perform an XOR operation to provide some redundancy in cases where a die of the non-volatile memory of the SSD has failed. Implementing the XOR operation in the controller of the SSD may increase the complexity of the design of the SSD controller. In order to provide redundancy within a single SSD whose controller does not include functionality to perform an XOR operation, the SSD may be divided into logical units and the redundancy may be provided by software or hardware that is external to the SSD and is thus not implemented in the controller of the SSD.
As shown in
In some embodiments, the solid-state drive 120 may be a solid-state drive (SSD) or any other such storage device. The non-volatile memory 122.1 to 122.n may include one or more chips or dies that may individually include one or more types of non-volatile memory devices. In some embodiments, the non-volatile memory devices of the non-volatile memory may be embodied as planar or three-dimensional NAND (“3D NAND”) non-volatile memory devices or NOR. However, in other embodiments, the non-volatile memory may be embodied as any combination of memory devices that use chalcogenide phase change material (e.g., chalcogenide glass), three-dimensional (3D) crosspoint memory, or other types of byte-addressable, write-in-place non-volatile memory, ferroelectric transistor random-access memory (FeTRAM), nanowire-based non-volatile memory, phase change memory (PCM), memory that incorporates memristor technology, Magnetoresistive random-access memory (MRAM), Spin Transfer Torque (STT)-MRAM, silicon-oxide-nitride-oxide-silicon (SONOS) memory, polymer memory such as ferroelectric polymer memory, ovonic memory, nanowire or electrically erasable programmable read-only memory (EEPROM), etc. As previously described, the solid-state drive 120 may be arranged or configured as a solid-state drive. However, examples described in the present disclosure are not limited to storage devices arranged or configured as SSDs.
Furthermore, the host computer 110 may include a redundancy controller 124 that provides redundancy for the solid-state drive 120 based on logical units. The redundancy controller 124 may be software, hardware (e.g., a separate integrated circuit), or a combination of software and hardware that is located externally to the solid-state drive 120. The redundancy controller 124 may provide functionality to provide redundancy associated with media failure of the solid-state drive 120 (e.g., failure of the dies of the non-volatile memory 122.1 to 122.n), but may not provide such protection for the failure of the controller 121. Further details with regard to the redundancy controller 124 are described in conjunction with
As shown in
The processing logic may further determine whether one of the logical units of the solid-state drive has failed (block 220). For example, a logical unit may be identified as having failed when one or more die of the non-volatile memory of an SSD corresponding to the logical unit has failed. The failure of a logical unit may be identified based on one or more factors that include, but are not limited to, a number or rate of error-correcting code based operations needed to recover user data of the failed logical unit, a lack of available space corresponding to the failed logical unit for tolerating any additional failures (e.g., based on portions of the die of the failed logical unit no longer being used to reliably store data), self-test mechanisms that may indicate that the one or more die of the logical unit are no longer reliable, etc. Thus, a logical unit may be considered to be a failed logical unit when the user data at the failed logical unit is no longer reliable or reliably retrieved. If the processing logic determines that one of the logical units of the SSD has not failed, then the processing logic may not overwrite any parity data stored at the other logical units of the solid-state drive (block 240). For example, the parity data stored at the remaining three logical units may still be used to provide die redundancy for an SSD. However, if the processing logic determines that one of the logical units of the solid-state drive has failed, then the processing logic may identify the locations at the other three functioning logical units of the solid-state drive that are storing parity data (block 250). For example, for each block of user data that is stored but no longer retrievable at the failed logical unit, corresponding parity data at another logical unit may be identified. 
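The failure determination of block 220 may be sketched as a simple predicate over health indicators. In the following Python sketch, the record `UnitHealth`, its field names, and the threshold values are assumptions made for illustration; the disclosure lists the factors but does not specify particular thresholds.

```python
from dataclasses import dataclass


@dataclass
class UnitHealth:
    """Hypothetical health record for one logical unit."""
    ecc_corrections_per_1k_reads: float  # rate of ECC-based recovery operations
    spare_blocks_remaining: int          # space left to tolerate further failures
    self_test_passed: bool               # result of a self-test mechanism


def unit_has_failed(health: UnitHealth,
                    max_ecc_rate: float = 50.0,
                    min_spare_blocks: int = 1) -> bool:
    """Return True if any factor indicates the unit is no longer reliable."""
    return (
        health.ecc_corrections_per_1k_reads > max_ecc_rate
        or health.spare_blocks_remaining < min_spare_blocks
        or not health.self_test_passed
    )


healthy = UnitHealth(ecc_corrections_per_1k_reads=2.0,
                     spare_blocks_remaining=12,
                     self_test_passed=True)
worn_out = UnitHealth(ecc_corrections_per_1k_reads=75.0,
                      spare_blocks_remaining=12,
                      self_test_passed=True)
```

In this sketch any single factor is sufficient to classify the logical unit as failed; an implementation could equally weigh the factors in combination.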
Furthermore, the processing logic may reconstruct (i.e., generate) or receive reconstructed user data of the failed logical unit based on the parity data that is stored at the remaining three functioning logical units (block 260). For example, a combination of the parity data and user data at the remaining functional logical units may generate the reconstructed user data. In some embodiments, the combination may be based on an XOR operation with the parity data and the corresponding user data at the three remaining functional logical units. The XOR operation may correspond to a type of RAID 5 type algorithm or another such RAID type algorithm to reconstruct user data. Subsequently, the reconstructed user data, or the generated user data, of the failed logical unit may replace or overwrite the parity data that was stored at the three remaining functional logical units (block 270). As such, after the failure of a logical unit, the parity data can be overwritten with the reconstructed user data at the other logical units.
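The reconstruction and overwriting of blocks 260 and 270 may be sketched as follows. The stripe record (a dictionary of blocks keyed by logical unit index plus a `parity_unit` field) and the function names are hypothetical and serve only to make the sequence of operations concrete.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(user_data_by_unit, parity_unit):
    """Build a hypothetical stripe record: unit index -> block, plus parity location."""
    blocks = dict(user_data_by_unit)
    blocks[parity_unit] = xor_blocks(list(user_data_by_unit.values()))
    return {"blocks": blocks, "parity_unit": parity_unit}


def recover_failed_unit(stripes, failed_unit):
    """Rebuild user data of the failed unit and overwrite parity with it.

    Returns a mapping of stripe index -> unit now holding the rebuilt data.
    """
    relocations = {}
    for index, stripe in enumerate(stripes):
        if failed_unit not in stripe["blocks"]:
            continue
        parity_unit = stripe["parity_unit"]
        del stripe["blocks"][failed_unit]   # contents are treated as unreadable
        stripe["parity_unit"] = None        # the stripe is no longer protected
        if parity_unit == failed_unit:
            continue                        # only parity was lost; nothing to rebuild
        # XOR of the surviving user data and the parity yields the lost block.
        rebuilt = xor_blocks(list(stripe["blocks"].values()))
        stripe["blocks"][parity_unit] = rebuilt  # overwrite the parity data
        relocations[index] = parity_unit
    return relocations


stripes = [
    make_stripe({0: b"AAAA", 1: b"BBBB", 2: b"CCCC"}, parity_unit=3),
    make_stripe({0: b"DDDD", 1: b"EEEE", 3: b"FFFF"}, parity_unit=2),
    make_stripe({1: b"GGGG", 2: b"HHHH", 3: b"IIII"}, parity_unit=0),
]
relocations = recover_failed_unit(stripes, failed_unit=0)
```

After the call, every stripe holds only user data on the three functional logical units, which corresponds to the state in which die redundancy is disabled but all user data remains accessible.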
In some embodiments, after the overwriting of the parity data with the reconstructed user data, a logical address associated with the user data is updated from the previous physical address corresponding to locations at the failed logical unit of the SSD to the physical address associated with the overwritten parity data of the functional logical unit.
As shown in
As shown in
As shown in
As such, at a first time, four functional logical units may store user data and corresponding parity data. At a subsequent time, one of the functional logical units may no longer be identified as functioning. In response to such an identification, the user data at the failed logical unit may be reconstructed and replace the parity data stored at the remaining logical units. For example, the reconstructed user data may be generated based on parity data stored at another logical unit and additional user data that is stored at an additional block address or additional location of an additional logical unit. Thus, the reconstructed user data may be based on a combination of parity data from one of the logical units and user data from the other logical units. No parity data may be stored at the remaining logical units once the reconstructed data has been stored in response to the failure of one of the logical units.
As shown in
Referring to
As such, a first logical unit failure may not be reported to a user of a computer system that uses an SSD. The SSD may be associated with a lower reliability specification so that the failure of one logical unit is expected during the lifetime operation of the SSD and is mitigated by the die redundancy associated with overwriting parity data with reconstructed user data as previously described. However, in an embodiment with 4 logical units, if a second logical unit fails so that half of the logical units of the SSD are no longer functional, then the user of the computer system may receive a notification or a status message to inform the user of the unreliability or failure of the SSD. In some embodiments, the non-volatile memory of the SSD may be associated with another number of logical units. For example, the non-volatile memory of the SSD may correspond to fewer logical units (e.g., three) or more than four logical units (e.g., sixteen logical units).
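The reporting behavior described above may be sketched as a small state tracker. The class `FailureTracker` and the status strings it returns are hypothetical names used only for illustration.

```python
class FailureTracker:
    """Track failed logical units and decide what to report (hypothetical)."""

    def __init__(self, num_units=4):
        self.num_units = num_units
        self.failed_units = set()

    def record_failure(self, unit):
        if not 0 <= unit < self.num_units:
            raise ValueError("unknown logical unit")
        self.failed_units.add(unit)
        if len(self.failed_units) == 1:
            # First failure: user data is rebuilt over the parity data and
            # the user is not notified of a drive failure.
            return "redundancy_disabled"
        # A further failure can no longer be absorbed: report the drive.
        return "drive_failed"


tracker = FailureTracker(num_units=4)
first_status = tracker.record_failure(0)
second_status = tracker.record_failure(2)
```

The same policy applies unchanged to drives with another number of logical units, since only the count of failed logical units drives the decision in this sketch.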
As shown in
In some embodiments, the performance of a solid-state drive management process, such as wear leveling, defragmentation, or other such processes may involve the moving of user data within a single logical unit. For example, a wear leveling process performed on the SSD may involve the moving of user data of the first logical unit of non-volatile memory from a location within the first logical unit to another location within the same first logical unit based on a number of write operations or erase cycles that have been performed on each of the blocks of the logical unit. For example, if the block with a block address of 0 of the first logical unit 510 has exceeded a threshold amount of erase cycles, then the user data at the block address of 0 may be moved or copied to the block address of 8 that is also included within the same logical unit as opposed to being moved or copied to another block address that is in a different logical unit. Similarly, a defragmentation operation may move user data to different block addresses within the same logical unit as opposed to moving data from one logical unit to another logical unit during the defragmentation operation. Thus, when the defragmentation operation physically organizes the user data stored at the block addresses of a logical unit based on organizing the user data into a smaller number of regions, the user data may only change to another block address within the same logical unit as opposed to another block address of another logical unit.
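The constraint that management operations stay within one logical unit may be sketched as follows. The address-to-unit mapping in `unit_of` (two consecutive block addresses per logical unit, repeating across four logical units) is an assumption inferred from the block addresses quoted in this description, and the function names and erase counts are illustrative only. The sketch also ignores whether a candidate block is currently free.

```python
def unit_of(block_address, blocks_per_strip=2, num_units=4):
    """Map a block address to a logical unit index (assumed interleaving)."""
    return (block_address // blocks_per_strip) % num_units


def pick_wear_level_target(source, erase_counts, threshold):
    """Choose a destination block in the SAME logical unit as the source.

    Returns None when the source is not worn or no suitable block exists.
    """
    if erase_counts[source] <= threshold:
        return None
    candidates = [
        address for address, count in erase_counts.items()
        if address != source
        and unit_of(address) == unit_of(source)  # never cross logical units
        and count <= threshold
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda address: erase_counts[address])


# Block 2 has the lowest erase count overall, but it belongs to another
# logical unit, so it is not eligible as a destination for block 0.
erase_counts = {0: 1200, 1: 900, 2: 10, 8: 15, 9: 40}
target = pick_wear_level_target(0, erase_counts, threshold=1000)
```

Restricting the candidates in this way keeps each stripe's strips on distinct logical units, so that the parity relationship remains valid after the move.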
After the first logical unit 510 is determined to be a failed logical unit, then the user data blocks at the solid-state drive addresses 0-1 and 8-9 may be reconstructed using parity data at the other logical units 520, 530, and 540 and may be written to the location currently storing the parity data at the other logical units. Furthermore, a re-mapping may be performed so that a pointer to the user data is changed from a solid-state drive block address for the first logical unit 510 to a solid-state drive block address of a new logical unit. For example, before the failure of the first logical unit 510, user data may be stored at the solid-state drive block address 0 of the first logical unit 510 and a pointer for the user data may point to the solid-state drive block address 0. For example, the pointer may identify the block address. Thus, an application of a host computer may access a logical address corresponding to the pointer to the block address 0 to access the user data. After the failure of the first logical unit 510, the user data at the solid-state drive block address 0 may be reconstructed and written to the parity data block 0 that is stored at the solid-state drive block address 4. In response to such a write operation, the pointer for the user data may be updated or changed from pointing to the solid-state drive block address 0 in the first logical unit to the solid-state drive block address 4 in the third logical unit 530. For example, the pointer for the logical address may be changed to point to the block address 4. Thus, when the application subsequently requests the user data corresponding to the same logical address, the changed or updated pointer may be used to retrieve the user data at the block address of 4 instead of the block address of 0.
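The re-mapping described above may be sketched with a simple logical-to-physical table. The class `BlockMap`, its method names, and the logical address value 4096 are hypothetical; only the move from block address 0 to block address 4 is taken from the example above.

```python
class BlockMap:
    """Toy logical-to-physical pointer table (hypothetical)."""

    def __init__(self):
        self._table = {}

    def bind(self, logical_address, block_address):
        self._table[logical_address] = block_address

    def resolve(self, logical_address):
        return self._table[logical_address]

    def remap(self, old_block_address, new_block_address):
        """Repoint every logical address that referenced the old block."""
        for logical_address, block_address in self._table.items():
            if block_address == old_block_address:
                self._table[logical_address] = new_block_address


block_map = BlockMap()
block_map.bind(4096, 0)   # before the failure: user data at block address 0
before = block_map.resolve(4096)

# After the first logical unit fails, the user data is rebuilt and written
# over the parity data at block address 4, and the pointer is updated.
block_map.remap(0, 4)
after = block_map.resolve(4096)
```

Because the application continues to use the same logical address, the relocation of the user data is transparent to it.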
Although a particular layout of user data and parity data is illustrated in the table 500, any combination or layout of user data and parity data with respect to the solid-state drive and logical units may be used.
As shown in
The memory buffer 612 may be implemented using a volatile static random access memory (SRAM), or any other volatile memory, for at least temporarily storing digital information (e.g., the data, computer-executable instructions, applications, etc.) as well as context information for the solid-state drive 602. Further, the processing device 614 may be configured to execute at least one program out of at least one memory to allow the memory arbiter 620 to direct the information from the memory buffer 612 to the solid-state memory within the non-volatile memory packages 608.1-608.n via the channels 622.1-622.n. Furthermore, via the I/O interface 605, the controller 610 may receive commands issued by the host computer 604 for writing or reading the data to and from the solid-state memory within the non-volatile memory packages 608.1-608.n.
The non-volatile memory packages 608.1-608.n may each include one or more non-volatile memory dies, in which each non-volatile memory die may include non-volatile memory (e.g., NAND flash memory) configured to store digital information or data in one or more arrays of memory cells organized into one or more pages. For example, the non-volatile memory package 608.1 may include one or more non-volatile memory dies. Each of the one or more non-volatile memory dies may be used or assigned to one logical unit so that block addresses of one logical unit are not distributed between two or more logical units.
The machine may be a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a server, a network router, a switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.
The example computer system 700 includes a processing device 702, a main memory 704 (e.g., read-only memory (ROM), flash memory, dynamic random access memory (DRAM) such as synchronous DRAM (SDRAM), etc.), a static memory 706 (e.g., flash memory, static random access memory (SRAM), etc.), and a data storage device 718, which communicate with each other via a bus 730. The data storage device 718 may correspond to the solid-state drive 120 of
Processing device 702 represents one or more general-purpose processing devices such as a microprocessor, a central processing unit, or the like. More particularly, the processing device may be a complex instruction set computing (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a processor implementing other instruction sets, or processors implementing a combination of instruction sets. Processing device 702 may also be one or more special-purpose processing devices such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), a network processor, or the like. The processing device 702 may be configured to execute instructions 726 for performing operations and steps discussed herein.
The computer system 700 may further include a network interface device 708 to communicate over the network 720. The computer system 700 also may include a video display unit 710 (e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT)), an alphanumeric input device 712 (e.g., a keyboard), a cursor control device 714 (e.g., a mouse), a signal generation device 716 (e.g., a speaker), a graphics processing unit 722, a video processing unit 728, and an audio processing unit 732.
The data storage device 718 may include a machine-readable storage medium 724 (also known as a computer-readable medium) on which is stored one or more sets of instructions 726 or software embodying any one or more of the methodologies or functions described herein. The instructions 726 may also reside, completely or at least partially, within the main memory 704 and/or within the processing device 702 during execution thereof by the computer system 700, the main memory 704 and the processing device 702 also constituting machine-readable storage media.
In one implementation, the instructions 726 include instructions to implement functionality corresponding to redundancy controller (e.g., redundancy controller 124 of
Some portions of the preceding detailed descriptions have been presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the ways used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of operations leading to a desired result. The operations are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the above discussion, it is appreciated that throughout the description, discussions utilizing terms such as “identifying” or “determining” or “executing” or “performing” or “collecting” or “creating” or “sending” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage devices.
The present disclosure also relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the intended purposes, or it may comprise a general purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.
The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct a more specialized apparatus to perform the method. The structure for a variety of these systems will appear as set forth in the description below. In addition, the present disclosure is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the disclosure as described herein.
The present disclosure may be provided as a computer program product, or software, that may include a machine-readable medium having stored thereon instructions, which may be used to program a computer system (or other electronic devices) to perform a process according to the present disclosure. A machine-readable medium includes any mechanism for storing information in a form readable by a machine (e.g., a computer). For example, a machine-readable (e.g., computer-readable) medium includes a machine (e.g., a computer) readable storage medium such as a read only memory (“ROM”), random access memory (“RAM”), magnetic disk storage media, optical storage media, flash memory devices, etc.
In the foregoing disclosure, implementations of the disclosure have been described with reference to specific example implementations thereof. It will be evident that various modifications may be made thereto without departing from the broader spirit and scope of implementations of the disclosure as set forth in the following claims. The disclosure and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense.
The following examples pertain to further embodiments.
Example 1 is a system comprising an interface operatively coupled to a solid-state drive and a processing device, coupled to the solid-state drive via the interface, to receive an indication of a failure of a logical unit of a non-volatile memory of the solid-state drive, identify, in response to the indication of the failure, parity data at locations of other logical units of the non-volatile memory of the solid-state drive, reconstruct user data from the logical unit based on the parity data from the locations of the other logical units of the non-volatile memory of the solid-state drive, and store the reconstructed user data from the logical unit at the locations of the other logical units that store the parity data.
In Example 2, in the system of Example 1, to reconstruct the user data from the logical unit, the processing device is further to generate at least a portion of the reconstructed user data based on a combination of one of the parity data from one of the locations of the other logical units and user data from an additional location of an additional logical unit.
In Example 3, in the system of any of Examples 1-2, to store the reconstructed data at the locations of the other logical units that store the parity data, the processing device is further to overwrite the parity data at the locations of the other logical units with the reconstructed user data, wherein the parity data is no longer stored after the overwriting of the parity data with the reconstructed user data.
In Example 4, in the system of any of Examples 1-3, the processing device is further to provide, in response to the indication of the failure of the logical unit, a notification of a disabling of a die redundancy operation of the solid-state drive.
In Example 5, in the system of any of Examples 1-4, the processing device is further to receive, after receiving the indication of the failure of the logical unit, another indication of a failure of a second logical unit of the non-volatile memory of the solid-state drive and to provide, in response to receiving the another indication of the failure of the second logical unit, an indication of a failure of the solid-state drive.
In Example 6, in the system of any of Examples 1-5, the processing device is further to receive a request to perform a wear leveling operation or a defragmentation operation for the non-volatile memory and, in response to the request, move data between block addresses of the non-volatile memory that are included in a single logical unit.
In Example 7, in the system of any of Examples 1-6, the logical unit stores a first user data corresponding to a first stripe associated with the non-volatile memory and a second user data corresponding to a second stripe associated with the non-volatile memory, wherein to reconstruct the user data, the processing device is further to identify a location of a first parity data for the first stripe stored at a first functional logical unit of the other logical units and identify a location of a second parity data for the second stripe stored at a second functional logical unit of the other logical units, reconstruct the first user data based on a combination of the first parity data stored at the first functional logical unit and user data of the first stripe stored at the second functional logical unit, and reconstruct the second user data based on a combination of the second parity data stored at the second functional logical unit and user data of the second stripe stored at the first functional logical unit.
In Example 8, in the system of any of Examples 1-7, to store the reconstructed data at the locations of the other logical units that store the parity data, the processing device is further to overwrite the first parity data of the first stripe that is stored at the first functional logical unit with the reconstructed first user data and overwrite the second parity data of the second stripe that is stored at the second functional logical unit with the reconstructed second user data.
In Example 9, in the system of any of Examples 1-8, the failure of the logical unit is based on a failure of a die of the non-volatile memory.
In Example 10, in the system of any of Examples 1-9, the reconstructing of the user data is based on an exclusive-or (XOR) operation using the parity data.
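The reconstruction described in Examples 7, 8, and 10 may be illustrated with a minimal sketch. It assumes three logical units, two-byte strips, and a parity position that differs between the two stripes; the names `xor_bytes`, `make_stripe`, and `rebuild_after_failure` are hypothetical and are not taken from the disclosure. Each stripe is modeled as a mapping from a logical unit index to a tagged strip.

```python
from functools import reduce
from operator import xor


def xor_bytes(*chunks: bytes) -> bytes:
    """Bytewise XOR of equally sized byte strings."""
    return bytes(reduce(xor, column) for column in zip(*chunks))


def make_stripe(user_strips, parity_unit, num_units):
    """Lay out one stripe as {logical_unit: (kind, data)} (hypothetical layout).

    Parity goes to parity_unit; the user strips fill the other units in order.
    """
    parity = xor_bytes(*user_strips)
    strips = iter(user_strips)
    return {
        unit: ("parity", parity) if unit == parity_unit else ("user", next(strips))
        for unit in range(num_units)
    }


def rebuild_after_failure(stripes, failed_unit):
    """Drop failed_unit from every stripe and move lost user data into the parity slot."""
    for stripe in stripes:
        kind, _lost = stripe.pop(failed_unit)
        if kind == "parity":
            continue  # only parity was lost; the user data of this stripe survives
        parity_unit = next(u for u, (k, _) in stripe.items() if k == "parity")
        # XOR of the surviving user strips and the parity yields the lost strip.
        recovered = xor_bytes(*(data for _, data in stripe.values()))
        # Overwrite the parity with the reconstructed user data.
        stripe[parity_unit] = ("user", recovered)


# Logical unit 0 holds user data of both stripes; parity sits on unit 1 and unit 2.
stripes = [
    make_stripe([b"A0", b"A1"], parity_unit=1, num_units=3),
    make_stripe([b"B0", b"B1"], parity_unit=2, num_units=3),
]
rebuild_after_failure(stripes, failed_unit=0)
```

After the call, every surviving slot in the sketch holds user data and no parity remains, which is why a further failure could not be recovered in the same way.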
In Example 11, an apparatus comprises a redundancy controller that is external to a solid state drive (SSD). The redundancy controller is to identify a plurality of logical units associated with a plurality of non-volatile memory devices of the SSD where the plurality of non-volatile memory devices are associated with a plurality of block addresses of the SSD and where each of the plurality of logical units corresponds to a separate portion of the plurality of block addresses associated with the plurality of non-volatile memory devices. The redundancy controller is further to determine whether a first logical unit of the plurality of logical units is associated with a failure, identify, in response to determining that the first logical unit is associated with the failure, parity data at a block address of a second logical unit of the plurality of logical units, generate user data corresponding to one of the block addresses of the first logical unit based on the parity data at the block address of the second logical unit, and store the generated user data at the block address of the second logical unit.
In Example 12, in the apparatus of Example 11, to generate the user data corresponding to the one of the block addresses of the first logical unit, the redundancy controller is further to identify a block address of a third logical unit of the plurality of logical units that stores user data that corresponds to the parity data at the block address of the second logical unit where the generating of the user data corresponding to one of the block addresses of the first logical unit is further based on the user data of the block address of the third logical unit.
In Example 13, in the apparatus of any of Examples 11-12, to store the generated user data at the block address of the second logical unit, the redundancy controller is further to overwrite the parity data at the block address of the second logical unit with the generated user data where the parity data is no longer stored after the overwriting of the parity data with the generated user data.
In Example 14, in the apparatus of any of Examples 11-13, the redundancy controller is further to determine that another logical unit of the plurality of logical units is associated with a failure after determining that the first logical unit is associated with the failure and provide, in response to determining that the another logical unit is associated with the failure, an indication of a failure of the SSD.
In Example 15, in the apparatus of any of Examples 11-14, the failure associated with the first logical unit is based on a failure of a die of the non-volatile memory devices of the SSD.
In Example 16, in the apparatus of any of Examples 11-15, the generating of the user data is based on an exclusive-or (XOR) operation between the parity data at the block address of the second logical unit and additional user data stored at an additional block address of a third logical unit of the plurality of logical units.
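The relationship in Example 11 between block addresses and logical units, together with the single-logical-unit constraint on data movement in Example 6, may be sketched as follows. The sketch assumes that each logical unit covers a contiguous, equally sized range of block addresses and that a stripe is formed from the same offset in every logical unit; neither assumption is stated in the Examples, and all function names are hypothetical.

```python
def partition_block_addresses(total_blocks, num_logical_units):
    """Split [0, total_blocks) into separate contiguous ranges, one per logical unit."""
    if total_blocks % num_logical_units:
        raise ValueError("sketch assumes equally sized logical units")
    size = total_blocks // num_logical_units
    return [range(i * size, (i + 1) * size) for i in range(num_logical_units)]


def logical_unit_of(block_address, units):
    """Return the index of the logical unit that contains block_address."""
    for index, blocks in enumerate(units):
        if block_address in blocks:
            return index
    raise ValueError(f"block address {block_address} is not mapped")


def stripe_members(offset, units):
    """Block addresses, one per logical unit, assumed to form a single stripe."""
    return [blocks[offset] for blocks in units]


def move_stays_in_unit(source, destination, units):
    """True if a wear leveling or defragmentation move stays inside one logical unit."""
    return logical_unit_of(source, units) == logical_unit_of(destination, units)


units = partition_block_addresses(total_blocks=12, num_logical_units=3)
```

Keeping such moves inside one logical unit means that two strips of the same stripe never end up in the same logical unit, so a single failure removes at most one strip per stripe.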
In Example 17, a method comprises receiving an indication of a failure of a logical unit of a non-volatile memory of a solid-state drive, in response to the indication of the failure, identifying parity data at locations of other logical units of the non-volatile memory of the solid-state drive, reconstructing user data from the logical unit based on the parity data from the locations of the other logical units of the non-volatile memory of the solid-state drive, and storing, by a processing device, the reconstructed user data from the logical unit at the locations of the other logical units that store the parity data.
In Example 18, in the method of Example 17, the reconstructing of the user data from the logical unit comprises generating at least a portion of the reconstructed user data based on a combination of one of the parity data from one of the locations of the other logical units and user data from an additional location of an additional logical unit.
In Example 19, in the method of any of Examples 17-18, storing the reconstructed user data at the locations of the other logical units that store the parity data comprises overwriting the parity data at the locations of the other logical units with the reconstructed data, wherein the parity data is no longer stored after the overwriting of the parity data with the reconstructed user data.
In Example 20, in the method of any of Examples 17-19, the method further comprises in response to the indication of the failure of the logical unit, providing a notification of a disabling of a die redundancy operation of the solid-state drive.
In Example 21, in the method of any of Examples 17-20, the method further comprises receiving, after receiving the indication of the failure of the logical unit, another indication of a failure of a second logical unit of the non-volatile memory of the solid-state drive and in response to receiving the another indication of the failure of the second logical unit, providing an indication of a failure of the solid-state drive.
In Example 22, in the method of any of Examples 17-21, the logical unit stores a first user data corresponding to a first stripe associated with the non-volatile memory and a second user data corresponding to a second stripe associated with the non-volatile memory, wherein the reconstructing of the user data comprises identifying a location of a first parity data for the first stripe stored at a first functional logical unit of the other logical units and identifying a location of a second parity data for the second stripe stored at a second functional logical unit of the other logical units, reconstructing the first user data based on a combination of the first parity data stored at the first functional logical unit and user data of the first stripe stored at the second functional logical unit, and reconstructing the second user data based on a combination of the second parity data stored at the second functional logical unit and user data of the second stripe stored at the first functional logical unit.
In Example 23, in the method of any of Examples 17-22, storing the reconstructed data at the locations of the other logical units that store the parity data comprises overwriting the first parity data of the first stripe that is stored at the first functional logical unit with the reconstructed first user data and overwriting the second parity data of the second stripe that is stored at the second functional logical unit with the reconstructed second user data.
In Example 24, in the method of any of Examples 17-23, the reconstructing of the user data is based on an exclusive-or (XOR) operation using the parity data.
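The failure handling in Examples 20 and 21 may be viewed as a small state machine: a first logical unit failure leads to a notification that die redundancy is disabled, and a later failure of a second logical unit leads to an indication that the solid-state drive has failed. The following is a minimal sketch under that reading; the class names, state names, and notification strings are hypothetical, and the reconstruction step itself is omitted.

```python
from enum import Enum


class DriveState(Enum):
    REDUNDANT = "redundant"  # parity data is present
    DEGRADED = "degraded"    # one logical unit failed; parity was overwritten
    FAILED = "failed"        # a second logical unit failed


class FailureTracker:
    """Tracks logical unit failures and records the resulting notifications."""

    def __init__(self):
        self.state = DriveState.REDUNDANT
        self.failed_units = set()
        self.notifications = []

    def report_failure(self, logical_unit):
        # Ignore repeated reports for the same unit and reports after drive failure.
        if logical_unit in self.failed_units or self.state is DriveState.FAILED:
            return self.state
        self.failed_units.add(logical_unit)
        if len(self.failed_units) == 1:
            # First failure: user data would be rebuilt over the parity locations.
            self.state = DriveState.DEGRADED
            self.notifications.append("die redundancy disabled")
        else:
            # Second failure: no parity remains, so the drive is reported as failed.
            self.state = DriveState.FAILED
            self.notifications.append("solid-state drive failed")
        return self.state


tracker = FailureTracker()
first = tracker.report_failure(0)
second = tracker.report_failure(2)
```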
In Example 25, a system on a chip (SOC) comprises a plurality of functional units and a redundancy controller, coupled to the functional units, to receive an indication of a failure of a logical unit of a non-volatile memory of a solid-state drive, in response to the indication of the failure, identify parity data at locations of other logical units of the non-volatile memory of the solid-state drive, reconstruct user data from the logical unit based on the parity data from the locations of the other logical units of the non-volatile memory of the solid-state drive, and store the reconstructed user data from the logical unit at the locations of the other logical units that store the parity data.
In Example 26, the SOC of Example 25 further comprises the subject matter of Examples 2-10.
In Example 27, in the SOC of Example 25, the redundancy controller is further operable to perform the subject matter of Examples 17-24.
In Example 28, the SOC of Example 25 further comprises the subject matter of Examples 11-16.
Example 29 is an apparatus comprising means for receiving an indication of a failure of a logical unit of a non-volatile memory of a solid-state drive, means for identifying, in response to the indication of the failure, parity data at locations of other logical units of the non-volatile memory of the solid-state drive, means for reconstructing user data from the logical unit based on the parity data from the locations of the other logical units of the non-volatile memory of the solid-state drive, and means for storing the reconstructed user data from the logical unit at the locations of the other logical units that store the parity data.
In Example 30, the apparatus of Example 29 further comprises the subject matter of any of Examples 1-16.
Example 31 is an apparatus comprising a memory and a processor coupled to the memory and comprising a redundancy controller that is configured to perform the method of any of Examples 17-24.
In Example 32, the apparatus of Example 31 further comprises the subject matter of any of Examples 1-16.
While the present disclosure has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of the present disclosure.