The present disclosure generally relates to memory devices, memory device operations, and, for example, to encoding metadata information in a codeword.
Memory devices are widely used to store information in various electronic devices. A memory device includes memory cells. A memory cell is an electronic circuit capable of being programmed to one of two or more data states. For example, a memory cell may be programmed to a data state that represents a single binary value, often denoted by a binary “1” or a binary “0.” As another example, a memory cell may be programmed to a data state that represents a fractional value (e.g., 0.5, 1.5, or the like). To store information, an electronic device may write to, or program, a set of memory cells. To access the stored information, the electronic device may read, or sense, the stored state from the set of memory cells.
Various types of memory devices exist, including random access memory (RAM), read only memory (ROM), dynamic RAM (DRAM), static RAM (SRAM), synchronous dynamic RAM (SDRAM), ferroelectric RAM (FeRAM), magnetic RAM (MRAM), resistive RAM (RRAM), holographic RAM (HRAM), flash memory (e.g., NAND memory and NOR memory), and others. A memory device may be volatile or non-volatile. Non-volatile memory (e.g., flash memory) can store data for extended periods of time even in the absence of an external power source. Volatile memory (e.g., DRAM) may lose stored data over time unless the volatile memory is refreshed by a power source. In some examples, a memory device may be associated with a compute express link (CXL). For example, the memory device may be a CXL compliant memory device and/or may include a CXL interface.
Host data or similar data may be stored in memory using multiple dies or similar components, such as by striping the host data across multiple data dies and one or more parity dies. The data dies may be used to store the host data, while the parity dies may be used to store parity bits used for error correction, such as for a purpose of correcting corrupted or unreadable data in the data dies. In some cases, the parity dies may be used to store bits used in connection with a chipkill protection scheme, in which data stored on a given die may be corrected in the event that an entire die of a memory becomes unusable. However, chipkill protection schemes may rely on full redundancy for error correction and/or may not permit metadata bits to be transmitted with a codeword during a read operation. In some instances, however, it may be beneficial to convey metadata information along with a codeword, such as by conveying one or more compute express link (CXL) metadata bits, a poison bit, a trusted execution environment (TEE) bit, and/or other types of metadata bits. Because traditional chipkill protection schemes rely on full redundancy and thus cannot support transmission of metadata bits, memory devices employing chipkill protection may be required to forgo transmission of metadata, resulting in decreased reliability of a memory system and/or corrupted data, leading to increased power, computing, storage, and other resource consumption for identifying and correcting memory operation errors.
Some implementations described herein enable transmission of metadata bits with a codeword, thereby resulting in improved information flow, increased reliability of memory systems, and decreased power, computing, storage, and other resource consumption otherwise required for identifying and correcting memory operation errors. In some implementations, metadata bits may be added to a codeword by shortening a data portion of a code, thereby enabling metadata to be encoded into the codeword without requiring storage of the metadata within a data portion of a memory stripe (e.g., within data dies) and/or without requiring transmission of the metadata bits in a channel. A decoder may perform parallel decoding of the codeword in order to identify a value of the metadata bits, such as by decoding the codeword using multiple hypotheses of the value of the metadata bits and/or by identifying which of the hypotheses results in a correctly decoded set of bits. As a result, metadata bits may be encoded within a codeword in a memory system, resulting in improved reliability and accuracy of data storage and transmission, reduction in data corruption incidents, enhanced system stability, enhanced data security and confidentiality, reduced latency in high-priority data processing, quick identification and isolation of corrupt data, and overall more efficient memory system operations.
The system 100 may be any electronic device configured to store data in memory. For example, the system 100 may be a computer, a mobile phone, a wired or wireless communication device, a network device, a server, a device in a data center, a device in a cloud computing environment, a vehicle (e.g., an automobile or an airplane), and/or an Internet of Things (IoT) device. The host system 105 may include a host processor 150. The host processor 150 may include one or more processors configured to execute instructions and store data in the memory system 110. For example, the host processor 150 may include a central processing unit (CPU), a graphics processing unit (GPU), a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), and/or another type of processing component.
The memory system 110 may be any electronic device or apparatus configured to store data in memory. For example, the memory system 110 may be a hard drive, a solid-state drive (SSD), a flash memory system (e.g., a NAND flash memory system or a NOR flash memory system), a universal serial bus (USB) drive, a memory card (e.g., a secure digital (SD) card), a secondary storage device, a non-volatile memory express (NVMe) device, an embedded multimedia card (eMMC) device, a dual in-line memory module (DIMM), and/or a random-access memory (RAM) device, such as a dynamic RAM (DRAM) device or a static RAM (SRAM) device.
The memory system controller 115 may be any device configured to control operations of the memory system 110 and/or operations of the memory devices 120. For example, the memory system controller 115 may include control logic, a memory controller, a system controller, an ASIC, an FPGA, a processor, a microcontroller, and/or one or more processing components. In some implementations, the memory system controller 115 may communicate with the host system 105 and may instruct one or more memory devices 120 regarding memory operations to be performed by those one or more memory devices 120 based on one or more instructions from the host system 105. For example, the memory system controller 115 may provide instructions to a local controller 125 regarding memory operations to be performed by the local controller 125 in connection with a corresponding memory device 120.
A memory device 120 may include a local controller 125 and one or more memory arrays 130. In some implementations, a memory device 120 includes a single memory array 130. In some implementations, each memory device 120 of the memory system 110 may be implemented in a separate semiconductor package or on a separate die that includes a respective local controller 125 and a respective memory array 130 of that memory device 120. The memory system 110 may include multiple memory devices 120.
A local controller 125 may be any device configured to control memory operations of a memory device 120 within which the local controller 125 is included (e.g., and not to control memory operations of other memory devices 120). For example, the local controller 125 may include control logic, a memory controller, a system controller, an ASIC, an FPGA, a processor, a microcontroller, and/or one or more processing components. In some implementations, the local controller 125 may communicate with the memory system controller 115 and may control operations performed on a memory array 130 coupled with the local controller 125 based on one or more instructions from the memory system controller 115. As an example, the memory system controller 115 may be an SSD controller, and the local controller 125 may be a NAND controller.
A memory array 130 may include an array of memory cells configured to store data. For example, a memory array 130 may include a non-volatile memory array (e.g., a NAND memory array or a NOR memory array) or a volatile memory array (e.g., an SRAM array or a DRAM array). In some implementations, the memory system 110 may include one or more volatile memory arrays 135. A volatile memory array 135 may include an SRAM array and/or a DRAM array, among other examples. The one or more volatile memory arrays 135 may be included in the memory system controller 115, in one or more memory devices 120, and/or in both the memory system controller 115 and one or more memory devices 120. In some implementations, the memory system 110 may include both non-volatile memory capable of maintaining stored data after the memory system 110 is powered off and volatile memory (e.g., a volatile memory array 135) that requires power to maintain stored data and that loses stored data after the memory system 110 is powered off. For example, a volatile memory array 135 may cache data read from or to be written to non-volatile memory, and/or may cache instructions to be executed by a controller of the memory system 110.
The host interface 140 enables communication between the host system 105 (e.g., the host processor 150) and the memory system 110 (e.g., the memory system controller 115). The host interface 140 may include, for example, a Small Computer System Interface (SCSI), a Serial-Attached SCSI (SAS), a Serial Advanced Technology Attachment (SATA) interface, a Peripheral Component Interconnect Express (PCIe) interface, an NVMe interface, a USB interface, a Universal Flash Storage (UFS) interface, an eMMC interface, a double data rate (DDR) interface, and/or a DIMM interface.
The memory interface 145 enables communication between the memory system 110 and the memory device 120. The memory interface 145 may include a non-volatile memory interface (e.g., for communicating with non-volatile memory), such as a NAND interface or a NOR interface. Additionally, or alternatively, the memory interface 145 may include a volatile memory interface (e.g., for communicating with volatile memory), such as a DDR interface.
In some examples, the memory system 110 may be a compute express link (CXL) compliant memory system (sometimes referred to herein simply as a CXL memory system) and/or one or more of the memory devices 120 may be CXL compliant memory devices (sometimes referred to herein simply as a CXL memory device). CXL is a high-speed CPU-to-device and CPU-to-memory interconnect designed to accelerate next-generation performance. CXL technology maintains memory coherency between the CPU memory space and memory on attached devices, which allows resource sharing for higher performance, reduced software stack complexity, and lower overall system cost. CXL is designed to be an industry open standard interface for high-speed communications. CXL technology is built on the PCIe infrastructure, leveraging PCIe physical and electrical interfaces to provide an advanced protocol in areas such as input/output (I/O) protocol, memory protocol, and coherency interface.
In some examples, the memory system 110 may include a PCIe/CXL interface (e.g., the host interface 140 may be associated with a PCIe/CXL interface), which may be a physical interface configured to connect the CXL memory system and/or the CXL memory device to CXL compliant host devices. In such examples, the PCIe/CXL interface may comply with CXL standard specifications for physical connectivity, ensuring broad compatibility and ease of integration into existing systems using the CXL protocol. Additionally, or alternatively, a CXL memory system and/or a CXL memory device may be designed to efficiently interface with computing systems (e.g., the host system 105) by leveraging the CXL protocol. For example, a CXL memory system and/or a CXL memory device may be configured to utilize high-speed, low-latency interconnect capabilities of CXL, such as for a purpose of making the CXL memory system and/or the CXL memory device suitable for high-performance computing, data center applications, artificial intelligence (AI) applications, and/or similar applications.
A CXL memory system and/or a CXL memory device may include a CXL memory controller (e.g., memory system controller 115 and/or local controller 125), which may be configured to manage data flow between memory arrays (e.g., volatile memory arrays 135 and/or memory arrays 130) and a CXL interface (e.g., a PCIe/CXL interface, such as host interface 140). In some examples, the CXL memory controller may be configured to handle one or more CXL protocol layers, such as an I/O layer (e.g., a layer associated with a CXL.io protocol, which may be used for purposes such as device discovery, configuration, initialization, I/O virtualization, direct memory access (DMA) using non-coherent load-store semantics, and/or similar purposes); a cache coherency layer (e.g., a layer associated with a CXL.cache protocol, which may be used for purposes such as caching host memory using a modified, exclusive, shared, invalid (MESI) coherence protocol, or similar purposes); or a memory protocol layer (e.g., a layer associated with a CXL.memory (sometimes referred to as CXL.mem) protocol, which may enable a CXL memory device to expose host-managed device memory (HDM) to permit a host device to manage and access memory similar to a native DDR connected to the host); among other examples.
A CXL memory system and/or a CXL memory device may further include and/or be associated with one or more high-bandwidth memory modules (HBMMs) or similar memory arrays (e.g., volatile memory arrays 135 and/or memory arrays 130). For example, a CXL memory system and/or a CXL memory device may include multiple layers of DRAM (e.g., stacked and/or interconnected through advanced through-silicon via (TSV) technology) in order to maximize storage density and/or enhance data transfer speeds between memory layers. Additionally, or alternatively, a CXL memory system and/or a CXL memory device may include a power management unit, which may be configured to regulate power consumption associated with the CXL memory system and/or the CXL memory device and/or which may be configured to improve energy efficiency for the CXL memory system and/or the CXL memory device. Additionally, or alternatively, a CXL memory system and/or a CXL memory device may include additional components, such as one or more error correction code (ECC) engines, such as for a purpose of detecting and/or correcting data errors to ensure data integrity and/or improve the overall reliability of the CXL memory system and/or the CXL memory device.
Although the example memory system 110 described above includes a memory system controller 115, in some implementations, the memory system 110 does not include a memory system controller 115. For example, an external controller (e.g., included in the host system 105) and/or one or more local controllers 125 included in one or more corresponding memory devices 120 may perform the operations described herein as being performed by the memory system controller 115. Furthermore, as used herein, a “controller” may refer to the memory system controller 115, a local controller 125, or an external controller. In some implementations, a set of operations described herein as being performed by a controller may be performed by a single controller. For example, the entire set of operations may be performed by a single memory system controller 115, a single local controller 125, or a single external controller. Alternatively, a set of operations described herein as being performed by a controller may be performed by more than one controller. For example, a first subset of the operations may be performed by the memory system controller 115 and a second subset of the operations may be performed by a local controller 125. Furthermore, the term “memory apparatus” may refer to the memory system 110 or a memory device 120, depending on the context.
A controller (e.g., the memory system controller 115, a local controller 125, or an external controller) may control operations performed on memory (e.g., a memory array 130), such as by executing one or more instructions. For example, the memory system 110 and/or a memory device 120 may store one or more instructions in memory as firmware, and the controller may execute those one or more instructions. Additionally, or alternatively, the controller may receive one or more instructions from the host system 105 and/or from the memory system controller 115, and may execute those one or more instructions. In some implementations, a non-transitory computer-readable medium (e.g., volatile memory and/or non-volatile memory) may store a set of instructions (e.g., one or more instructions or code) for execution by the controller. The controller may execute the set of instructions to perform one or more operations or methods described herein. In some implementations, execution of the set of instructions, by the controller, causes the controller, the memory system 110, and/or a memory device 120 to perform one or more operations or methods described herein. In some implementations, hardwired circuitry is used instead of or in combination with the one or more instructions to perform one or more operations or methods described herein. Additionally, or alternatively, the controller may be configured to perform one or more operations or methods described herein. An instruction is sometimes called a “command.”
For example, the controller (e.g., the memory system controller 115, a local controller 125, or an external controller) may transmit signals to and/or receive signals from memory (e.g., one or more memory arrays 130) based on the one or more instructions, such as to transfer data to (e.g., write or program), to transfer data from (e.g., read), to erase, and/or to refresh all or a portion of the memory (e.g., one or more memory cells, pages, sub-blocks, blocks, or planes of the memory). Additionally, or alternatively, the controller may be configured to control access to the memory and/or to provide a translation layer between the host system 105 and the memory (e.g., for mapping logical addresses to physical addresses of a memory array 130). In some implementations, the controller may translate a host interface command (e.g., a command received from the host system 105) into a memory interface command (e.g., a command for performing an operation on a memory array 130).
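The translation-layer concept described above can be sketched as follows. This is a minimal, hypothetical illustration (the mapping table, tuple layout, and command dictionary are all assumed names, not structures defined by this disclosure): the controller looks up a host logical block address in a logical-to-physical table and emits a memory interface command for the resolved location.

```python
# Hypothetical sketch of a translation layer: the controller maps a host
# logical block address (LBA) onto a physical (die, block, page) location
# before issuing the memory interface command. All names are illustrative.
l2p = {0: (0, 12, 3), 1: (1, 5, 0)}        # assumed LBA -> (die, block, page)

def translate_read(lba):
    die, block, page = l2p[lba]            # logical-to-physical lookup
    return {"op": "read", "die": die, "block": block, "page": page}

assert translate_read(1) == {"op": "read", "die": 1, "block": 5, "page": 0}
```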
In some implementations, one or more systems, devices, apparatuses, components, and/or controllers of
The number and arrangement of components shown in
As shown in
The memory stripe may be associated with multiple dies of memory used to store data bits and/or parity bits. Put another way, in some examples multiple data bits and/or parity bits may be striped across multiple dies associated with the memory stripe. For example, the memory stripe shown in
In some examples, the parity dies may store information that can be used in connection with an ECC to correct data, such as in an event in which an entire die fails (sometimes referred to as chipkill protection). Put another way, an error correction system associated with the memory stripe may be able to correct errors due to an entire die failure. For example, as indicated by reference number 216, in some events an entire die of a DRAM stack may fail (e.g., in the depicted example, Die 3 fails). In such cases, the parity bits stored in the parity dies may be encoded in such a way that the parity bits may be used to recover data that is stored on the failed die.
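The recovery idea can be illustrated with a deliberately simplified sketch. This example assumes a single XOR parity die (much weaker than the Reed-Solomon and non-binary Hamming codes this disclosure actually describes, which protect against a full die failure with symbol-level correction); it shows only the basic principle that a failed die's contents can be rebuilt from the surviving dies plus parity.

```python
# Simplified illustration (assumption: one XOR parity die, not the RS or
# non-binary Hamming codes described in this disclosure): if one data die
# fails, its contents can be rebuilt by XOR-ing the surviving dies with parity.
data_dies = [0xA5, 0x3C, 0xFF, 0x01]       # hypothetical per-die bytes
parity_die = 0
for byte in data_dies:
    parity_die ^= byte                     # parity = XOR of all data dies

failed_index = 2                           # suppose this die becomes unreadable
rebuilt = parity_die
for i, byte in enumerate(data_dies):
    if i != failed_index:
        rebuilt ^= byte                    # XOR of parity and surviving dies

assert rebuilt == data_dies[failed_index]  # recovers 0xFF
```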
More particularly,
Thus, for the 8-bit symbol example shown in
symbols (e.g., 8 bytes), which is equivalent to an amount of data stored on one die. In this regard, the 8-bit RS code may be used to provide chipkill protection in an event in which an entire die of the memory stripe fails.
Similarly, as shown in
symbols, or 8 bytes), which is equivalent to an amount of data stored on one die. In this regard, the 16-bit RS code may also be used to provide chipkill protection in an event in which an entire die of the memory stripe fails.
In some other examples, a non-binary Hamming code may be used to provide chipkill protection for a memory, such as the 40-bit channel memory described above in connection with
and in which K=N−r. In some examples, non-binary Hamming codes may be considered “perfect codes” in that non-binary Hamming codes are capable of providing a most efficient error correction for a given set of parameters (e.g., a non-binary Hamming code may correct errors within a certain radius without wasting any space on unnecessary redundancy). More particularly, non-binary Hamming codes may be perfect codes with a minimum distance of three, in which the set of Hamming spheres of radius 1 centered at the codewords is a partition of the entire space of all possible patterns of N symbols in the alphabet GF(q), such that all space available to correct an error is exploited. In some examples, a primitive non-binary Hamming code may be completely described by its parity check matrix H. In such examples, the columns of H may be all the possible vectors of r symbols that are pairwise linearly independent of each other. That is,
in which α corresponds to a primitive element (e.g., an element that can generate all other non-zero elements of the finite field (e.g., GF(q)) through its powers).
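The parity check matrix structure described above can be sketched with toy parameters. This example assumes the prime field GF(5) with primitive element α=2 (an assumption for simplicity; the disclosure's codes use GF(2^4) and GF(2^8), whose arithmetic requires polynomial field tables): for r=2, H has N=q+1 columns, and every pair of columns is linearly independent, which is what gives the code its single-symbol-correcting property.

```python
# Sketch over the prime field GF(q) with q = 5 (an assumed stand-in for the
# 4-bit and 8-bit symbol fields GF(16) and GF(256) described in the text).
# For r = 2, the parity check matrix H has N = q + 1 pairwise linearly
# independent columns.
q = 5
alpha = 2                                    # 2 is a primitive element of GF(5)
powers = [pow(alpha, i, q) for i in range(q - 1)]    # all non-zero elements
assert sorted(powers) == [1, 2, 3, 4]        # alpha generates GF(5)*

H = [(0, 1), (1, 0)] + [(1, a) for a in powers]      # N = q + 1 = 6 columns
N = len(H)
assert N == q + 1

# Pairwise linear independence: no column is a scalar multiple of another,
# i.e., the 2x2 determinant of every column pair is non-zero mod q.
for i in range(N):
    for j in range(i + 1, N):
        (a, b), (c, d) = H[i], H[j]
        assert (a * d - b * c) % q != 0
```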
As shown in
Similarly, as shown in
In some examples, chipkill protection schemes (such as the chipkill protection schemes described above in connection with
Some implementations described herein enable transmission of metadata bits with a codeword, thereby resulting in improved information flow and thus increased reliability of memory systems and thus decreased power, computing, storage, and other resource consumption otherwise required for identifying and correcting memory operation errors. In some implementations, metadata bits may be added to a codeword by shortening a data portion of a code, thereby enabling metadata to be encoded into the codeword without requiring storage of the metadata within a data portion of the memory stripe and/or without requiring transmission of the metadata bits in the channel. A decoder may perform parallel decoding of the codeword in order to identify a value of the metadata bits, such as by decoding the codeword using multiple hypotheses of the value of the metadata bits and/or by identifying which of the hypotheses results in a correctly decoded syndrome. As a result, metadata bits may be encoded within a codeword in a memory system, resulting in improved reliability and accuracy of data storage and transmission, reduction in data corruption incidents, enhanced system stability, enhanced data security and confidentiality, reduced latency in high-priority data processing, quick identification and isolation of corrupt data, and overall more efficient memory system operations.
As indicated above,
As shown in
As indicated by reference number 304, in some implementations a length of a data vector (e.g., d) may be shortened, such as for a purpose of reducing a length of a relevant portion of a parity matrix (e.g., P) associated with the codeword. Shortening may include zeroing certain positions in the data vector (sometimes referred to herein as shortened positions within the data vector and/or a shortened portion of the data vector) before encoding the data vector. For example, as shown in connection with reference number 304, the data vector (e.g., d) may include a data portion (represented as D and indicated by reference number 306), which may include data, and a shortened portion (indicated by reference number 308), which may be a portion of the data vector in which all positions are set to zero. In this regard, because the shortened portion of the data vector is set to zero, only a portion of the parity matrix (e.g., P) is relevant for purposes of encoding the parity vector (e.g., p), as indicated by reference number 310. Put another way, only a portion of the parity matrix that is multiplied by the data portion (e.g., D) of the parity matrix is relevant for purposes of encoding the parity vector (e.g., p), because the remaining portion of the parity matrix will be multiplied by zero.
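The shortening argument above can be checked numerically. The following sketch uses GF(2) arithmetic via NumPy purely for illustration (the field, dimensions, and random parity matrix are assumptions, not the disclosure's parameters): because the shortened tail of d is all zeros, the parity p = dP equals the product of the data portion D with only the corresponding rows of P.

```python
# Minimal sketch of shortening: zeroing the tail of the data vector d makes
# only the rows of P that multiply the data portion D relevant when computing
# p = d @ P. GF(2) and the sizes below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
K, k, r = 10, 6, 2                       # primitive dimension, data size, parity
P = rng.integers(0, 2, size=(K, r))     # hypothetical parity matrix

D = rng.integers(0, 2, size=k)          # data portion
d = np.concatenate([D, np.zeros(K - k, dtype=np.int64)])  # shortened vector

p_full = d @ P % 2                      # parity from the full data vector
p_short = D @ P[:k] % 2                 # parity from only the relevant rows
assert (p_full == p_short).all()        # the zeroed tail contributes nothing
```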
In some implementations, a shortened code (e.g., a shortened data portion, D, of a data vector, d) may be exploited in order to convey additional bits to a decoder, such as metadata bits or similar bits, without actually transmitting the metadata bits in the channel. For example, in a chipkill protection scheme implementing 8-bit symbol non-binary Hamming codes, a length of a primitive code (e.g., a total number of symbol positions in the original, unmodified non-binary Hamming code) may be N=(q^r−1)/(q−1).
In implementations in which r=2, N=(q^2−1)/(q−1)=q+1,
and, because q is equal to 256 (e.g., 2^8), N=257. Moreover, the primitive dimension (e.g., the number of information symbols in the primitive non-binary Hamming code) may be K=N−r, which is equal to 257−2 or 255. Similarly, in a chipkill protection scheme implementing 4-bit symbol non-binary Hamming codes, a length of a primitive code (e.g., a total number of symbol positions in the original, unmodified non-binary Hamming code) may be 17 (e.g., N=q+1=16+1), and a primitive dimension may be K=N−r, which is equal to 17−2 or 15. As described above in connection with
This may be more readily understood with reference to
For certain error correction codes and/or memory stripes (e.g., the 4-bit or 8-bit non-binary Hamming codes used in connection with 40-bit memory stripes, as described above), there is at least one additional symbol of the data vector, d, that may be used for a purpose of transmitting metadata information. Put another way, in some implementations the parity vector (e.g., p) may be capable of providing error correction for up to a first quantity of bits (e.g., K), and a data vector (e.g., d) associated with a codeword may be associated with a second quantity of data bits (e.g., k) and/or a third quantity of zero bits (e.g., K−k) such that the second quantity is less than the first quantity (e.g., k<K). In such implementations, such as the example shown in connection with reference number 314, an additional bit (e.g., a metadata bit) may be encoded into the codeword. More particularly, as indicated by reference number 316, in this example a ninth symbol (indexed as symbol 8, which is shown using stippling in
In this regard, a parity vector, computed as dP, may be equal to DP+A(1, α^8) when the ninth symbol is set to A, and the parity vector may be equal to DP+B(1, α^8) when the ninth symbol is set to B. Accordingly, and as indicated by reference number 317, with A=0, the parity vector (e.g., p) becomes DP, and thus the codeword transmitted in the channel (e.g., x), which is equal to (p, d) as described above, becomes (DP, D). Similarly, with B=1, the parity vector (e.g., p) becomes DP+(1, α^8), and thus the codeword transmitted in the channel (e.g., x) becomes (DP+(1, α^8), D).
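The key encoding relation here — that the parities for the two metadata values differ by exactly one row of the parity matrix — can be demonstrated with toy parameters. This sketch assumes the prime field GF(5), α=2, three data symbols, and a metadata symbol at the last data position (all stand-ins for the GF(2^8), nine-symbol arrangement described above).

```python
# Hedged sketch (prime field GF(5) stands in for GF(2^8)): encoding a metadata
# symbol m into the parity. The parity-matrix row for position i is taken as
# (1, alpha**i), so parity(m=1) - parity(m=0) equals the metadata position's
# row, mirroring p = DP + m*(1, alpha^8) in the text.
q, alpha = 5, 2                            # assumed toy parameters
D = [3, 1, 4]                              # hypothetical data symbols
idx = len(D)                               # metadata symbol position
P_rows = [(1, pow(alpha, i, q)) for i in range(idx + 1)]

def parity(data, m):
    symbols = data + [m]                   # metadata symbol appended
    return tuple(sum(s * row[j] for s, row in zip(symbols, P_rows)) % q
                 for j in range(2))

pA, pB = parity(D, 0), parity(D, 1)        # the two hypothesized encodings
diff = ((pB[0] - pA[0]) % q, (pB[1] - pA[1]) % q)
assert diff == P_rows[idx]                 # difference is (1, alpha**idx)
```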
Accordingly, a decoder may be capable of identifying the value of a metadata bit (e.g., F) by decoding the vector (p, D). More particularly, the decoder may receive the potentially corrupted vector (p, D) (e.g., the decoder may receive (p′, D′), in a similar manner as described above in connection with reference number 302 of
In connection with decoding the received codeword and/or determining the two syndromes, the decoder may determine any detected errors associated with the codeword (and, more particularly, associated with the data portion, D, of the codeword). For example, for a given syndrome and/or hypothesis, the decoder may determine that there are no detected errors (sometimes referred to herein as an outcome of zero errors (0E)), that there is a detected correctable error (CE), and/or that there is a detected uncorrectable error (UE). Additionally, or alternatively, a correctable error (e.g., CE) may be determined to be one of an error in which the error position is in one of symbols 0 through 7 (sometimes referred to herein as D0 through D7), which is sometimes referred to herein as CE07, or else an error in which the error position is in symbol 8 (e.g., D8), which is sometimes referred to herein as CE8. In that regard, following the parallel decoding (e.g., decoding of the received vector based on the two hypotheses H0 and H1), the decoder will arrive at two sets of results, one associated with the first hypothesis (sometimes referred to herein as {0E, CE07, CE8, UE}H0) and one associated with the second hypothesis (sometimes referred to herein as {0E, CE07, CE8, UE}H1). Using the results of the decoding processes (e.g., {0E, CE07, CE8, UE}H0 and {0E, CE07, CE8, UE}H1), the decoder may detect the value of the metadata bit (e.g., F) and may correct any errors, if necessary. Put another way, using the results of the parallel decoding processes, the decoder may identify a correct hypothesis as well as correct any errors in the received codeword.
For example,
In some examples, the results of the two parallel decoding processes may not result in an indication of the correct hypothesis, which is shown as an uncorrectable error (e.g., UE) in the decoding table. For example, if both decoding processes identify zero errors (e.g., 0E), the parallel decoding processes may be incapable of identifying a correct hypothesis. Moreover, if both decoding processes identify a correctable error in the first through eighth symbols (e.g., CE07), if both decoding processes identify a correctable error in the ninth symbol (e.g., CE8), or if both decoding processes identify an uncorrectable error (e.g., UE), the parallel decoding processes may be incapable of identifying a correct hypothesis. Furthermore, if one of the decoding processes identifies a correctable error in the ninth symbol (e.g., CE8) and the other one of the decoding processes identifies an uncorrectable error (e.g., UE), the parallel decoding processes may be incapable of identifying a correct hypothesis.
However, in some cases, the results of the two parallel decoding processes may indicate the correct hypothesis, which is shown as one of “H0” in the decoding table (meaning that the first hypothesis is the correct hypothesis) or “H1” in the decoding table (meaning that the second hypothesis is the correct hypothesis). For example, if the first decoding process (e.g., the decoding process that utilizes H0) identifies zero errors (e.g., 0E) and the second decoding process (e.g., the decoding process that utilizes H1) identifies a correctable error (e.g., one of CE07 or CE8) or an uncorrectable error (e.g., UE), the parallel decoding processes may indicate that the first hypothesis (e.g., H0) is the correct hypothesis. Similarly, if the second decoding process identifies zero errors (e.g., 0E) and the first decoding process identifies a correctable error (e.g., one of CE07 or CE8) or an uncorrectable error (e.g., UE), the parallel decoding processes may indicate that the second hypothesis (e.g., H1) is the correct hypothesis. Moreover, if the first decoding process identifies a correctable error in one of the first eight data vector symbols (e.g., CE07) and the second decoding process identifies one of a correctable error in the ninth data vector symbol (e.g., CE8) or an uncorrectable error (e.g., UE), the parallel decoding processes may indicate that the first hypothesis (e.g., H0) is the correct hypothesis. Similarly, if the second decoding process identifies a correctable error in one of the first eight data vector symbols (e.g., CE07) and the first decoding process identifies one of a correctable error in the ninth data vector symbol (e.g., CE8) or an uncorrectable error (e.g., UE), the parallel decoding processes may indicate that the second hypothesis (e.g., H1) is the correct hypothesis.
The parity vector (e.g., p) and the shortened data vector (e.g., D) may be conveyed to one or more decoding components via a channel 328. For example, in examples associated with the 40-bit memory described above in connection with
Upon receiving the codeword, the first decoder 330 may decode the codeword by using the first hypothesis (H0) (e.g., by assuming that one or more metadata bits are a first value, such as F=0), arriving at a first syndrome (SH0) (e.g., a first decoded set of bits), a first error position (iH0) (e.g., i ∈[0,7] corresponding to CE07 or i=8 corresponding to CE8), and/or a first error value (aH0). Similarly, the second decoder 332 may decode the codeword by using the second hypothesis (H1) (e.g., by assuming that one or more metadata bits are a second value, such as F=1), arriving at a second syndrome (SH1) (e.g., a second decoded set of bits), a second error position (iH1), and/or a second error value (aH1). Put another way, the memory system and/or the memory device may determine a position in the data vector (e.g., D) associated with a first symbol error associated with the first syndrome (e.g., SH0) and/or a second symbol error associated with the second syndrome (e.g., SH1). As indicated by reference number 334, the decoders 330, 332 and/or a memory system and/or memory device associated with the decoders 330, 332 may determine a value of the one or more metadata bits (e.g., F) based on the decoded results, such as by using the decoding table described above in connection with
Put another way, the memory system and/or the memory device may determine a value of the metadata bit by performing a first decoding procedure to determine a first syndrome (SH0) (with the first decoding procedure being based on using a first hypothesized value of the metadata bit (e.g., F=0)), performing a second decoding procedure to determine a second syndrome (SH1) (with the second decoding procedure being based on using a second hypothesized value of the metadata bit (e.g., F=1)), and selecting, using the first syndrome and the second syndrome, one of the first hypothesized value of the metadata bit or the second hypothesized value of the metadata bit as the value of the metadata bit. Additionally, or alternatively, the memory system may determine at least one of a first symbol error associated with the first syndrome (e.g., iH0 and/or aH0) or a second symbol error associated with the second syndrome (e.g., iH1 and/or aH1), and/or may select the one of the first hypothesized value of the metadata bit or the second hypothesized value of the metadata bit as the value of the metadata bit by using the at least one of the first symbol error associated with the first syndrome or the second symbol error associated with the second syndrome (e.g., by using the decoding table described above in connection with
Although two decoders 330, 332 are shown and described in connection with identifying a single metadata bit, in some other implementations additional decoders may be used. For example, in implementations in which multiple metadata bits, m, are encoded in the codeword, the quantity of hypotheses to consider is 2^m, and thus 2^m decoders may be used. Moreover, once a correct syndrome and/or hypothesis is identified, the memory system and/or memory device may perform additional operations associated with the correct hypothesis and/or syndrome, such as correcting a symbol error associated with the correct syndrome and/or hypothesis.
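The scaling described above can be illustrated with a short sketch: for m metadata bits there are 2^m hypothesized values, and one decoding process may run per hypothesis. Here, `decode_fn` is a hypothetical stand-in for a decoder component (e.g., decoder 330 or 332); it is not an API from the disclosure.

```python
from itertools import product

def decode_all_hypotheses(codeword, decode_fn, m):
    """Run one decoding process per hypothesized metadata value.

    For m metadata bits there are 2**m hypotheses, and thus 2**m decoder
    invocations (or 2**m decoder components operating in parallel).
    """
    hypotheses = list(product((0, 1), repeat=m))  # e.g., m=1 -> [(0,), (1,)]
    return {bits: decode_fn(codeword, bits) for bits in hypotheses}
```

With m=1, this reduces to the two-decoder arrangement of decoders 330 and 332; with m=3, eight decoding processes would run.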
In some implementations, decoding results from multiple ECC engines, with each engine including an encoder component (e.g., encoder 324) and/or one or more decoder components (e.g., decoder 330 and/or decoder 332), may be used for a purpose of identifying a correct hypothesis and/or a value of one or more metadata bits. For example, in some implementations a memory device and/or a memory system may select one of the first hypothesized value of the metadata bit (e.g., 0) or the second hypothesized value of the metadata bit (e.g., 1) as the value of the metadata bit (e.g., F) based on the first syndrome (e.g., SH0), the second syndrome (e.g., SH1), and syndromes determined by one or more ECC engines, of multiple error correction code engines associated with a memory stripe and/or portion of memory.
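The disclosure does not fix a specific rule for combining results across engines. One plausible policy, sketched below as an assumption for illustration only, is to let engines whose own pair of syndromes was undecidable abstain and to take the consensus of the remaining engines.

```python
from collections import Counter

def select_across_engines(per_engine_selections):
    """Combine per-engine hypothesis selections ("H0", "H1", or "UE").

    Policy assumption (illustrative, not recited in the disclosure):
    engines whose own decoding results were undecidable ("UE") abstain,
    and the majority of the remaining engines identifies the value of
    the metadata bit.
    """
    votes = Counter(s for s in per_engine_selections if s != "UE")
    if not votes:
        return "UE"  # no engine could decide
    return votes.most_common(1)[0][0]
```

For example, if Engine 0 through Engine 7 report six H0 selections, one H1 selection, and one UE, this policy would identify the first hypothesized value as the value of the metadata bit.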
For example, as shown in
Similarly, in the example implementation 342 shown in
In such implementations, results from multiple ECC engines (e.g., Engine 0 through Engine 7, as indicated by reference number 340 in
As indicated above,
As shown in
The method 400 may include additional aspects, such as any single aspect or any combination of aspects described below and/or described in connection with one or more other methods or operations described elsewhere herein.
In a first aspect, the multiple parity bits are capable of providing error correction for up to a first quantity of bits (e.g., K bits), wherein the multiple data bits are associated with a data portion (e.g., D) and a shortened portion (e.g., a portion of d set to zero), wherein the data portion is associated with a second quantity of bits (e.g., k), and wherein the second quantity of bits is less than the first quantity of bits (e.g., k<K).
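The shortened-code relationship in this aspect (k data bits within the K-bit capacity the parity bits can protect) can be sketched as follows. The placement of the zeroed shortened portion after the data portion is an illustrative assumption; the disclosure specifies only that a portion of d is set to zero.

```python
def build_shortened_data_vector(data_bits, K):
    """Zero-pad k data bits to the full K-bit capacity of the code.

    The appended zeros form the shortened portion (the portion of d set
    to zero); they need not be transmitted, since both encoder and
    decoder can assume them. K and k name the first and second
    quantities of bits from the aspect above.
    """
    k = len(data_bits)
    if k >= K:
        raise ValueError("shortening requires k < K")
    return list(data_bits) + [0] * (K - k)
```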
In a second aspect, alone or in combination with the first aspect, the method 400 includes determining at least one of a first symbol error associated with the first decoded set of bits (e.g., iH0 and/or aH0) or a second symbol error associated with the second decoded set of bits (e.g., iH1 and/or aH1), wherein determining whether the first hypothesized value of the at least one metadata bit or the second hypothesized value of the at least one metadata bit is the value of the at least one metadata bit comprises using the at least one of the first symbol error associated with the first decoded set of bits or the second symbol error associated with the second decoded set of bits.
In a third aspect, alone or in combination with one or more of the first and second aspects, the method 400 includes correcting one of the first symbol error associated with the first decoded set of bits or the second symbol error associated with the second decoded set of bits based on determining whether the first hypothesized value of the at least one metadata bit or the second hypothesized value of the at least one metadata bit is the value of the at least one metadata bit.
In a fourth aspect, alone or in combination with one or more of the first through third aspects, the method 400 includes determining a position in the multiple data bits associated with the at least one of the first symbol error associated with the first decoded set of bits (e.g., iH0) or the second symbol error associated with the second decoded set of bits (e.g., iH1).
In a fifth aspect, alone or in combination with one or more of the first through fourth aspects, the value of the metadata bit is one of 0 or 1, wherein the first hypothesized value of the metadata bit is 0, and wherein the second hypothesized value of the metadata bit is 1.
In a sixth aspect, alone or in combination with one or more of the first through fifth aspects, encoding the codeword is performed using an encoder component associated with an error correction code engine of a memory device, wherein performing the first decoding procedure is performed using a first decoder component associated with the error correction code engine of the memory device, and wherein performing the second decoding procedure is performed using a second decoder component associated with the error correction code engine of the memory device.
Although
In some implementations, a memory device includes one or more components configured to: receive a codeword that encodes a data vector, a parity vector associated with error correction of the data vector, and a metadata bit; and determine a value of the metadata bit by: performing a first decoding procedure to determine a first syndrome, wherein the first decoding procedure is based on using a first hypothesized value of the metadata bit; performing a second decoding procedure to determine a second syndrome, wherein the second decoding procedure is based on using a second hypothesized value of the metadata bit; and selecting, using the first syndrome and the second syndrome, one of the first hypothesized value of the metadata bit or the second hypothesized value of the metadata bit as the value of the metadata bit.
In some implementations, a method includes encoding a codeword that encodes multiple data bits associated with a portion of memory, multiple parity bits associated with error correction of the multiple data bits, and at least one metadata bit; performing a first decoding procedure using the codeword to determine a first decoded set of bits, wherein the first decoding procedure is based on using a first hypothesized value of the at least one metadata bit; performing a second decoding procedure using the codeword to determine a second decoded set of bits, wherein the second decoding procedure is based on using a second hypothesized value of the at least one metadata bit; and determining, using the first decoded set of bits and the second decoded set of bits, whether the first hypothesized value of the at least one metadata bit or the second hypothesized value of the at least one metadata bit is a value of the at least one metadata bit.
In some implementations, a memory device includes multiple error correction code engines, wherein each error correction code engine, of the multiple error correction code engines, includes multiple decoders, and wherein each error correction code engine is configured to: receive a codeword that encodes a data vector, a parity vector associated with error correction of the data vector, and a metadata bit; and determine a value of the metadata bit by: performing, using a first decoder, of the multiple decoders, a first decoding procedure to determine a first syndrome, wherein the first decoding procedure is based on using a first hypothesized value of the metadata bit; performing, using a second decoder, of the multiple decoders, a second decoding procedure to determine a second syndrome, wherein the second decoding procedure is based on using a second hypothesized value of the metadata bit; and selecting one of the first hypothesized value of the metadata bit or the second hypothesized value of the metadata bit as the value of the metadata bit based on the first syndrome, the second syndrome, and syndromes determined by one or more other error correction code engines, of the multiple error correction code engines.
The foregoing disclosure provides illustration and description but is not intended to be exhaustive or to limit the implementations to the precise forms disclosed. Modifications and variations may be made in light of the above disclosure or may be acquired from practice of the implementations described herein.
Even though particular combinations of features are recited in the claims and/or disclosed in the specification, these combinations are not intended to limit the disclosure of implementations described herein. Many of these features may be combined in ways not specifically recited in the claims and/or disclosed in the specification. For example, the disclosure includes each dependent claim in a claim set in combination with every other individual claim in that claim set and every combination of multiple claims in that claim set. As used herein, a phrase referring to “at least one of” a list of items refers to any combination of those items, including single members. As an example, “at least one of: a, b, or c” is intended to cover a, b, c, a+b, a+c, b+c, and a+b+c, as well as any combination with multiples of the same element (e.g., a+a, a+a+a, a+a+b, a+a+c, a+b+b, a+c+c, b+b, b+b+b, b+b+c, c+c, and c+c+c, or any other ordering of a, b, and c).
When “a component” or “one or more components” (or another element, such as “a controller” or “one or more controllers”) is described or claimed (within a single claim or across multiple claims) as performing multiple operations or being configured to perform multiple operations, this language is intended to broadly cover a variety of architectures and environments. For example, unless explicitly claimed otherwise (e.g., via the use of “first component” and “second component” or other language that differentiates components in the claims), this language is intended to cover a single component performing or being configured to perform all of the operations, a group of components collectively performing or being configured to perform all of the operations, a first component performing or being configured to perform a first operation and a second component performing or being configured to perform a second operation, or any combination of components performing or being configured to perform the operations. For example, when a claim has the form “one or more components configured to: perform X; perform Y; and perform Z,” that claim should be interpreted to mean “one or more components configured to perform X; one or more (possibly different) components configured to perform Y; and one or more (also possibly different) components configured to perform Z.”
No element, act, or instruction used herein should be construed as critical or essential unless explicitly described as such. Also, as used herein, the articles “a” and “an” are intended to include one or more items and may be used interchangeably with “one or more.” Further, as used herein, the article “the” is intended to include one or more items referenced in connection with the article “the” and may be used interchangeably with “the one or more.” Where only one item is intended, the phrase “only one,” “single,” or similar language is used. Also, as used herein, the terms “has,” “have,” “having,” or the like are intended to be open-ended terms that do not limit an element that they modify (e.g., an element “having” A may also have B). Further, the phrase “based on” is intended to mean “based, at least in part, on” unless explicitly stated otherwise. As used herein, the term “multiple” can be replaced with “a plurality of” and vice versa. Also, as used herein, the term “or” is intended to be inclusive when used in a series and may be used interchangeably with “and/or,” unless explicitly stated otherwise (e.g., if used in combination with “either” or “only one of”).
This Patent Application claims priority to U.S. Provisional Patent Application No. 63/621,747, filed on Jan. 17, 2024, entitled “ENCODING METADATA INFORMATION IN A CODEWORD,” and assigned to the assignee hereof. The disclosure of the prior Application is considered part of and is incorporated by reference into this Patent Application.
Number | Date | Country
---|---|---
63621747 | Jan 2024 | US