TECHNIQUES FOR MEMORY ERROR CORRECTION

Information

  • Patent Application
  • 20230043306
  • Publication Number
    20230043306
  • Date Filed
    July 29, 2022
  • Date Published
    February 09, 2023
Abstract
Methods, systems, and devices for techniques for memory error correction are described. A memory system may support a refresh with error correction code (ECC) operation. The refresh with ECC operation may be indicated in a command from a host device to a memory device, or the memory device may support executing the refresh with ECC operation autonomously, for example as part of a self-refresh operation. The refresh with ECC operation may cause the memory system to, as part of a refresh operation for a row of a memory array, perform an error correction operation on at least a portion of the row. The error correction operation may correct bit errors in a set of data before an additional bit of the set of data is corrupted. The address of the portion of the row may be determined using one or more counters associated with an ECC patrol block.
Description
FIELD OF TECHNOLOGY

The following relates generally to one or more systems for memory and more specifically to techniques for memory error correction.


BACKGROUND

Memory devices are widely used to store information in various electronic devices such as computers, user devices, wireless communication devices, cameras, digital displays, and the like. Information is stored by programming memory cells within a memory device to various states. For example, binary memory cells may be programmed to one of two supported states, often denoted by a logic 1 or a logic 0. In some examples, a single memory cell may support more than two states, any one of which may be stored. To access the stored information, a component may read, or sense, at least one stored state in the memory device. To store information, a component may write, or program, the state in the memory device.


Various types of memory devices and memory cells exist, including magnetic hard disks, random access memory (RAM), read-only memory (ROM), dynamic RAM (DRAM), synchronous dynamic RAM (SDRAM), static RAM (SRAM), ferroelectric RAM (FeRAM), magnetic RAM (MRAM), resistive RAM (RRAM), flash memory, phase change memory (PCM), self-selecting memory, chalcogenide memory technologies, and others. Memory cells may be volatile or non-volatile. Non-volatile memory cells, e.g., FeRAM, may maintain their stored logic state for extended periods of time even in the absence of an external power source. Volatile memory devices, e.g., DRAM, may lose their stored state if disconnected from an external power source.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 illustrates an example of a system that supports techniques for memory error correction in accordance with examples as disclosed herein.



FIG. 2 illustrates an example of a memory die that supports techniques for memory error correction in accordance with examples as disclosed herein.



FIG. 3 illustrates an example of a system that supports techniques for memory error correction in accordance with examples as disclosed herein.



FIG. 4 illustrates an example of a process flow that supports techniques for memory error correction in accordance with examples as disclosed herein.



FIG. 5 shows a block diagram of a memory device that supports techniques for memory error correction in accordance with examples as disclosed herein.



FIG. 6 shows a flowchart illustrating a method or methods that support techniques for memory error correction in accordance with examples as disclosed herein.





DETAILED DESCRIPTION

Data stored in a memory device (e.g., a dynamic random access memory (DRAM) device) may become corrupted over time, for example due to electromagnetic interference, high energy particles (e.g., cosmic rays), memory cell wear and aging, or other error mechanisms. Thus, stored data may in some cases come to include one or more errors, and data stored for a relatively long time may be more likely to contain multiple errors compared to data stored for a relatively short time. In some cases, a single-bit error (SBE) may be correctable, for example using a single-error correction (SEC) error correction code (ECC). However, SBEs that are not corrected may eventually become uncorrectable double-bit errors (DBEs) or other types of multi-bit errors, as after one bit within a set of data becomes corrupted, one or more additional bits within the set of data may subsequently also become corrupted. The disclosure herein may support correcting an SBE before it becomes a DBE or other type of multi-bit error. Further, though examples may be explained herein in the context of correcting SBEs before additional errors occur within a set of data subject to an error detection and correction procedure, it is to be understood that the teachings herein may further be extended to apply to detecting and correcting errors including any quantity of bits (e.g., DBEs) before they become errors including one or more additional bits.


A memory device may include an ECC block that stores parity bits for detecting errors, for example as part of an error correction operation. In some cases, the ECC block may correct errors during access operations, such as read or write operations. That is, the ECC block may perform error correction on data stored in a memory cell or group of memory cells as part of reading the data from or writing the data to the memory cell or group of memory cells. However, some portions of the memory device may not be accessed as often as other portions (i.e., some portions may be “cold”, compared to more frequently accessed “hot” portions), and so SBEs in these portions of the memory device may be more likely to turn into DBEs before such data is accessed.
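As a non-limiting illustration of single-error correction, the following Python sketch uses a Hamming(7,4) code. The code is chosen here only for brevity; the disclosure does not specify which SEC code an ECC block uses, and the function names (encode, syndrome, correct, decode) are hypothetical. The sketch shows that one flipped bit in a codeword may be corrected, whereas a second flipped bit in the same codeword may cause a miscorrection.

```python
def encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming codeword.

    Codeword layout by position 1..7: p1 p2 d1 p4 d2 d3 d4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def syndrome(codeword):
    """Return 0 if no error is detected, else the 1-based position indicated."""
    s1 = codeword[0] ^ codeword[2] ^ codeword[4] ^ codeword[6]
    s2 = codeword[1] ^ codeword[2] ^ codeword[5] ^ codeword[6]
    s4 = codeword[3] ^ codeword[4] ^ codeword[5] ^ codeword[6]
    return s1 + 2 * s2 + 4 * s4


def correct(codeword):
    """Flip the bit the syndrome points at (valid only for a single-bit error)."""
    fixed = list(codeword)
    position = syndrome(fixed)
    if position:
        fixed[position - 1] ^= 1
    return fixed


def decode(codeword):
    """Extract the 4 data bits from a codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]


if __name__ == "__main__":
    data = [1, 0, 1, 1]
    stored = encode(data)

    sbe = list(stored)
    sbe[4] ^= 1  # one corrupted bit
    print("SBE recovered:", decode(correct(sbe)) == data)

    dbe = list(sbe)
    dbe[2] ^= 1  # a second bit corrupted before any correction ran
    print("DBE recovered:", decode(correct(dbe)) == data)
```

In this sketch, any single flipped bit is restored. With two flipped bits, the syndrome points at a third, uncorrupted bit, so the correction step makes the data worse, which is why correcting an SBE before a second error accumulates may be advantageous.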


As described herein, a memory device may perform error correction as part of a refresh operation to periodically perform error correction on each portion of a memory device. For example, a host device may periodically transmit a refresh command with ECC (e.g., REF_wECC) that indicates an ECC check is to be performed, where the refresh with ECC command may be different from a refresh command (i.e., different from a refresh command performed without error correction). The memory device may include an ECC patrol block that includes a counter to indicate a portion of a row (i.e., a quantity of logical columns of the row) on which to perform error correction. In response to receiving the refresh with ECC command, the memory device may activate a row and perform error correction for the portion of the row to check and correct for errors. The ECC patrol block may also increment and reset the counter, so that the ECC block may perform error correction on each portion of each row of the memory device over the course of several refresh with ECC commands. Additionally or alternatively, the memory device may operate in a self-refresh mode, and may perform refresh operations, including the refresh with ECC operation as described herein, without receiving commands from the host device. While examples of the present disclosure may be described with reference to DRAM devices, the techniques described herein may be applied to any memory type.
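One possible way for the counters of an ECC patrol block to step through the portions of each row may be sketched as follows. The sketch is illustrative only: the class name EccPatrol, the method next_target, and the ordering (a row counter advanced on every refresh with ECC command, and a portion counter advanced each time the row counter resets) are assumptions and are not taken from the disclosure.

```python
class EccPatrol:
    """Hypothetical patrol counters selecting which portion of which row to check."""

    def __init__(self, num_rows, portions_per_row):
        self.num_rows = num_rows
        self.portions_per_row = portions_per_row
        self.row = 0      # row targeted by the next refresh with ECC command
        self.portion = 0  # group of logical columns checked within that row

    def next_target(self):
        """Return the (row, portion) to check, then advance the counters."""
        target = (self.row, self.portion)
        self.row += 1
        if self.row == self.num_rows:
            # Every row has been visited once: reset the row counter and
            # move on to the next portion (wrapping back to portion 0).
            self.row = 0
            self.portion = (self.portion + 1) % self.portions_per_row
        return target


if __name__ == "__main__":
    patrol = EccPatrol(num_rows=4, portions_per_row=2)
    for _ in range(8):
        print(patrol.next_target())
```

Under these assumptions, a quantity of refresh with ECC commands equal to num_rows multiplied by portions_per_row covers every portion of every row exactly once, after which the counters return to their initial values.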


Features of the disclosure are initially described in the context of systems and dies as described with reference to FIGS. 1 and 2. Features of the disclosure are described in the context of a system and a process flow as described with reference to FIGS. 3-4. These and other features of the disclosure are further illustrated by and described with reference to an apparatus diagram and flowcharts that relate to techniques for memory error correction as described with reference to FIGS. 5-6.



FIG. 1 illustrates an example of a system 100 that supports techniques for memory error correction in accordance with examples as disclosed herein. The system 100 may include a host device 105, a memory device 110, and a plurality of channels 115 coupling the host device 105 with the memory device 110. The system 100 may include one or more memory devices 110, but aspects of the one or more memory devices 110 may be described in the context of a single memory device (e.g., memory device 110).


The system 100 may include portions of an electronic device, such as a computing device, a mobile computing device, a wireless device, a graphics processing device, a vehicle, or other systems. For example, the system 100 may illustrate aspects of a computer, a laptop computer, a tablet computer, a smartphone, a cellular phone, a wearable device, an internet-connected device, a vehicle controller, or the like. The memory device 110 may be a component of the system operable to store data for one or more other components of the system 100.


At least portions of the system 100 may be examples of the host device 105. The host device 105 may be an example of a processor or other circuitry within a device that uses memory to execute processes, such as within a computing device, a mobile computing device, a wireless device, a graphics processing device, a computer, a laptop computer, a tablet computer, a smartphone, a cellular phone, a wearable device, an internet-connected device, a vehicle controller, a system on a chip (SoC), or some other stationary or portable electronic device, among other examples. In some examples, the host device 105 may refer to the hardware, firmware, software, or a combination thereof that implements the functions of an external memory controller 120. In some examples, the external memory controller 120 may be referred to as a host or a host device 105.


A memory device 110 may be an independent device or a component that is operable to provide physical memory addresses/space that may be used or referenced by the system 100. In some examples, a memory device 110 may be configurable to work with one or more different types of host devices. Signaling between the host device 105 and the memory device 110 may be operable to support one or more of: modulation schemes to modulate the signals, various pin configurations for communicating the signals, various form factors for physical packaging of the host device 105 and the memory device 110, clock signaling and synchronization between the host device 105 and the memory device 110, timing conventions, or other factors.


The memory device 110 may be operable to store data for the components of the host device 105. In some examples, the memory device 110 may act as a secondary-type or dependent-type device to the host device 105 (e.g., responding to and executing commands provided by the host device 105 through the external memory controller 120). Such commands may include one or more of a write command for a write operation, a read command for a read operation, a refresh command for a refresh operation, or other commands.


The host device 105 may include one or more of an external memory controller 120, a processor 125, a basic input/output system (BIOS) component 130, or other components such as one or more peripheral components or one or more input/output controllers. The components of the host device 105 may be coupled with one another using a bus 135.


The processor 125 may be operable to provide control or other functionality for at least portions of the system 100 or at least portions of the host device 105. The processor 125 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or a combination of these components. In such examples, the processor 125 may be an example of a central processing unit (CPU), a graphics processing unit (GPU), a general purpose GPU (GPGPU), or an SoC, among other examples. In some examples, the external memory controller 120 may be implemented by or be a part of the processor 125.


The BIOS component 130 may be a software component that includes a BIOS operated as firmware, which may initialize and run various hardware components of the system 100 or the host device 105. The BIOS component 130 may also manage data flow between the processor 125 and the various components of the system 100 or the host device 105. The BIOS component 130 may include a program or software stored in one or more of read-only memory (ROM), flash memory, or other non-volatile memory.


The memory device 110 may include a device memory controller 155 and one or more memory dies 160 (e.g., memory chips) to support a desired capacity or a specified capacity for data storage. Each memory die 160 (e.g., memory die 160-a, memory die 160-b, memory die 160-N) may include a local memory controller 165 (e.g., local memory controller 165-a, local memory controller 165-b, local memory controller 165-N) and a memory array 170 (e.g., memory array 170-a, memory array 170-b, memory array 170-N). A memory array 170 may be a collection (e.g., one or more grids, one or more banks, one or more tiles, one or more sections) of memory cells, with each memory cell being operable to store at least one bit of data. A memory device 110 including two or more memory dies 160 may be referred to as a multi-die memory or a multi-die package or a multi-chip memory or a multi-chip package.


The device memory controller 155 may include circuits, logic, or components operable to control operation of the memory device 110. The device memory controller 155 may include the hardware, the firmware, or the instructions that enable the memory device 110 to perform various operations and may be operable to receive, transmit, or execute commands, data, or control information related to the components of the memory device 110. The device memory controller 155 may be operable to communicate with one or more of the external memory controller 120, the one or more memory dies 160, or the processor 125. In some examples, the device memory controller 155 may control operation of the memory device 110 described herein in conjunction with the local memory controller 165 of the memory die 160.


In some examples, the memory device 110 may receive data or commands or both from the host device 105. For example, the memory device 110 may receive a write command indicating that the memory device 110 is to store data for the host device 105 or a read command indicating that the memory device 110 is to provide data stored in a memory die 160 to the host device 105.


A local memory controller 165 (e.g., local to a memory die 160) may include circuits, logic, or components operable to control operation of the memory die 160. In some examples, a local memory controller 165 may be operable to communicate (e.g., receive or transmit data or commands or both) with the device memory controller 155. In some examples, a memory device 110 may not include a device memory controller 155, and a local memory controller 165 or the external memory controller 120 may perform various functions described herein. As such, a local memory controller 165 may be operable to communicate with the device memory controller 155, with other local memory controllers 165, or directly with the external memory controller 120, or the processor 125, or a combination thereof. Examples of components that may be included in the device memory controller 155 or the local memory controllers 165 or both may include receivers for receiving signals (e.g., from the external memory controller 120), transmitters for transmitting signals (e.g., to the external memory controller 120), decoders for decoding or demodulating received signals, encoders for encoding or modulating signals to be transmitted, or various other circuits or controllers operable for supporting described operations of the device memory controller 155 or local memory controller 165 or both.


The external memory controller 120 may be operable to enable communication of one or more of information, data, or commands between components of the system 100 or the host device 105 (e.g., the processor 125) and the memory device 110. The external memory controller 120 may convert or translate communications exchanged between the components of the host device 105 and the memory device 110. In some examples, the external memory controller 120 or other component of the system 100 or the host device 105, or its functions described herein, may be implemented by the processor 125. For example, the external memory controller 120 may be hardware, firmware, or software, or some combination thereof implemented by the processor 125 or other component of the system 100 or the host device 105. Although the external memory controller 120 is depicted as being external to the memory device 110, in some examples, the external memory controller 120, or its functions described herein, may be implemented by one or more components of a memory device 110 (e.g., a device memory controller 155, a local memory controller 165) or vice versa.


The components of the host device 105 may exchange information with the memory device 110 using one or more channels 115. The channels 115 may be operable to support communications between the external memory controller 120 and the memory device 110. Each channel 115 may be an example of a transmission medium that carries information between the host device 105 and the memory device 110. Each channel 115 may include one or more signal paths or transmission media (e.g., conductors) between terminals associated with the components of the system 100. A signal path may be an example of a conductive path operable to carry a signal. For example, a channel 115 may include a first terminal including one or more pins or pads at the host device 105 and one or more pins or pads at the memory device 110. A pin may be an example of a conductive input or output point of a device of the system 100, and a pin may be operable to act as part of a channel.


Channels 115 (and associated signal paths and terminals) may be dedicated to communicating one or more types of information. For example, the channels 115 may include one or more command and address (CA) channels 186, one or more clock signal (CK) channels 188, one or more data (DQ) channels 190, one or more other channels 192, or a combination thereof. In some examples, signaling may be communicated over the channels 115 using single data rate (SDR) signaling or double data rate (DDR) signaling. In SDR signaling, one modulation symbol (e.g., signal level) of a signal may be registered for each clock cycle (e.g., on a rising or falling edge of a clock signal). In DDR signaling, two modulation symbols (e.g., signal levels) of a signal may be registered for each clock cycle (e.g., on both a rising edge and a falling edge of a clock signal).
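The difference between SDR signaling and DDR signaling may be illustrated with the following minimal Python sketch. The function name register_symbols and the representation of a signal as a list of levels observed at successive clock edges are hypothetical, and the sketch assumes that SDR signaling registers on the rising edge only.

```python
def register_symbols(levels_at_edges, mode):
    """Return the symbols registered from a signal.

    levels_at_edges lists the signal level seen at each clock edge, ordered
    rising, falling, rising, falling, ... (two entries per clock cycle).
    """
    if mode == "SDR":
        # One symbol per clock cycle: rising edges only (an assumption).
        return list(levels_at_edges[0::2])
    if mode == "DDR":
        # Two symbols per clock cycle: rising and falling edges.
        return list(levels_at_edges)
    raise ValueError("mode must be 'SDR' or 'DDR'")


if __name__ == "__main__":
    edges = [1, 0, 0, 1, 1, 1, 0, 0]  # four clock cycles
    print(len(register_symbols(edges, "SDR")), "symbols with SDR")
    print(len(register_symbols(edges, "DDR")), "symbols with DDR")
```

Over the same four clock cycles, the sketch registers four symbols with SDR signaling and eight symbols with DDR signaling.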


In some examples, CA channels 186 may be operable to communicate commands between the host device 105 and the memory device 110 including control information associated with the commands (e.g., address information). For example, commands carried by the CA channel 186 may include a read command with an address of the desired data. In some examples, a CA channel 186 may include any quantity of signal paths to decode one or more of address or command data (e.g., eight or nine signal paths).


In some examples, data channels 190 may be operable to communicate one or more of data or control information between the host device 105 and the memory device 110. For example, the data channels 190 may communicate information (e.g., bi-directional) to be written to the memory device 110 or information read from the memory device 110.


In some examples, the one or more other channels 192 may include one or more error detection code (EDC) channels. The EDC channels may be operable to communicate error detection signals, such as checksums, to improve system reliability. An EDC channel may include any quantity of signal paths.


The system 100 may include any quantity of non-transitory computer readable media that support techniques for memory error correction. For example, the host device 105, the device memory controller 155, or a memory device 110 may include or otherwise may access one or more non-transitory computer readable media storing instructions (e.g., firmware) for performing the functions ascribed herein to the host device 105, device memory controller 155, or memory device 110. For example, such instructions, if executed by the host device 105 (e.g., by the external memory controller 120), by the device memory controller 155, or by a memory device 110 (e.g., by a local memory controller 165), may cause the host device 105, device memory controller 155, or memory device 110 to perform associated functions as described herein.


In some cases, a memory die 160 may include an ECC block (i.e., an on-die ECC block) used for performing error correction operations (e.g., SEC operations) on data stored on the memory die 160. Errors, such as single bit errors (SBEs), may be introduced into the data from electromagnetic radiation, high-energy particles (e.g., from cosmic rays), memory cell wear and age, or a combination thereof, among other examples. If a set of data with an SBE develops one or more additional errors (i.e., a double bit error (DBE) or multiple bit error (MBE)), error correction operations such as SEC and single-error correction, double-error detection (SECDED) may not be able to correct the resulting errors. Thus, it may be advantageous to correct SBEs relatively quickly, and thus mitigate the likelihood of developing DBEs. To correct SBEs, error correction operations may be performed on data during access operations (i.e., read or write operations). However, some regions of data in a memory array 170 may be accessed more frequently than other regions (i.e., some regions of data may be “hot”, while other regions may be “cold”). Because cold regions of the memory array 170 are accessed relatively infrequently, the cold regions may be more susceptible to developing DBEs.


In some cases, a memory system 100 may prevent cold regions from developing DBEs or MBEs by periodically performing an access operation (e.g., a read operation) on each region of the memory array 170, thus preventing any region from becoming cold. However, periodically performing a read operation on each region of the memory array 170 may increase system latency and power consumption, for example by consuming bandwidth resources used for communicating data between the host device 105 and the memory device 110. That is, the read operations used to prevent cold regions may prevent other operations from transferring data between the host device 105 and the memory device 110 during the read operation. For example, bandwidth resources consumed by read operations used to prevent cold regions may contribute to so-called page conflicts in which two immediately subsequent access operations (e.g., a read followed by a read or write, or a write followed by a read or write) may target different rows (i.e., pages) of the same bank. Such page conflicts may lead to an increased quantity of row-switching commands and operations (e.g., activate and precharge commands and operations), which in turn may degrade efficiency (e.g., overall bus efficiency).


In some examples, a memory system 100 may perform error correction as part of a refresh operation, for example in addition to or as an alternative to performing error correction as part of an access operation (e.g., a read or write operation). For example, the memory system 100 may be configured to support a refresh command, as well as a refresh with ECC command. The refresh command may cause a region of a memory array 170 (e.g., a row of a memory array 170) to be accessed and the read data written back to the region. Alternatively, a refresh with ECC command may cause the region of the memory array 170 to be accessed and, along with the read data being written back to the region, may cause the ECC block to perform an error correction operation on the region or a portion of the region. In some cases, the host device 105 may be configured to issue the refresh command, the refresh with ECC command, or both, to the memory device 110. In other cases, the memory device 110 may initiate a refresh operation with ECC, for example as part of a self-refresh mode. A memory system 100 that supports the refresh with ECC command may reduce system latency and power consumption by preventing cold regions from developing multi-bit errors without consuming bandwidth resources used for communicating data between the host device 105 and the memory device 110.
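The distinction between a refresh command and a refresh with ECC command may be sketched as follows. This is a simplified model under stated assumptions, not the disclosed implementation: the names Row, refresh, refresh_with_ecc, and read are hypothetical, and a three-fold repetition code with majority voting is substituted for an SEC code purely to keep the example short.

```python
def majority(copies):
    """Stand-in ECC decode: majority vote over three stored copies of a bit."""
    return 1 if sum(copies) >= 2 else 0


class Row:
    """A row whose data bits are each stored three times (stand-in ECC)."""

    def __init__(self, bits):
        self.cells = [[b, b, b] for b in bits]


def refresh(row):
    """Plain refresh: read the row and write the same values back.

    Any bit error already present in the row is written back unchanged.
    """
    row.cells = [list(copies) for copies in row.cells]


def refresh_with_ecc(row, start, count):
    """Refresh the row and also correct the portion [start, start + count)."""
    refresh(row)
    for i in range(start, min(start + count, len(row.cells))):
        bit = majority(row.cells[i])
        row.cells[i] = [bit, bit, bit]


def read(row):
    """Read the data bits of the row through the stand-in ECC decoder."""
    return [majority(copies) for copies in row.cells]


if __name__ == "__main__":
    row = Row([1, 0, 1, 1])
    row.cells[2][0] ^= 1          # a single-bit error appears
    refresh(row)
    print("after refresh:", row.cells[2])
    refresh_with_ecc(row, start=2, count=1)
    print("after refresh with ECC:", row.cells[2])
```

In this model, the plain refresh preserves the corrupted copy, while the refresh with ECC restores it. If a second copy of the same bit were corrupted before any refresh with ECC reached that portion, the majority vote would settle on the wrong value, which mirrors an SBE becoming an uncorrectable multi-bit error in a cold region.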



FIG. 2 illustrates an example of a memory die 200 that supports techniques for memory error correction in accordance with examples as disclosed herein. The memory die 200 may be an example of the memory dies 160 described with reference to FIG. 1. In some examples, the memory die 200 may be referred to as a memory chip, a memory device, or an electronic memory apparatus. The memory die 200 may include one or more memory cells 205 that may each be programmable to store different logic states (e.g., programmed to one of a set of two or more possible states). For example, a memory cell 205 may be operable to store one bit of information at a time (e.g., a logic 0 or a logic 1). In some examples, a memory cell 205 (e.g., a multi-level memory cell) may be operable to store more than one bit of information at a time (e.g., a logic 00, a logic 01, a logic 10, or a logic 11). In some examples, the memory cells 205 may be arranged in an array, such as a memory array 170 described with reference to FIG. 1.


A memory cell 205 may store a charge representative of the programmable states in a capacitor. DRAM architectures may include a capacitor that includes a dielectric material to store a charge representative of the programmable state. In other memory architectures, other storage devices and components are possible. For example, nonlinear dielectric materials may be employed. The memory cell 205 may include a logic storage component, such as capacitor 230, and a switching component 235. The capacitor 230 may be an example of a dielectric capacitor or a ferroelectric capacitor. A node of the capacitor 230 may be coupled with a voltage source 240, which may be the cell plate reference voltage, such as Vpl, or may be ground, such as Vss.


The memory die 200 may include one or more access lines (e.g., one or more word lines 210 and one or more digit lines 215) arranged in a pattern, such as a grid-like pattern. An access line may be a conductive line coupled with a memory cell 205 and may be used to perform access operations on the memory cell 205. In some examples, word lines 210 may be referred to as row lines. In some examples, digit lines 215 may be referred to as column lines or bit lines. References to access lines, row lines, column lines, word lines, digit lines, or bit lines, or their analogues, are interchangeable without loss of understanding or operation. Memory cells 205 may be positioned at intersections of the word lines 210 and the digit lines 215.


Operations such as reading and writing may be performed on the memory cells 205 by activating or selecting access lines such as one or more of a word line 210 or a digit line 215. By biasing a word line 210 and a digit line 215 (e.g., applying a voltage to the word line 210 or the digit line 215), a single memory cell 205 may be accessed at their intersection. The intersection of a word line 210 and a digit line 215 in either a two-dimensional or three-dimensional configuration may be referred to as an address of a memory cell 205.


Accessing the memory cells 205 may be controlled through a row decoder 220 or a column decoder 225. For example, a row decoder 220 may receive a row address from the local memory controller 260 and activate a word line 210 based on (e.g., using) the received row address. A column decoder 225 may receive a column address from the local memory controller 260 and may activate a digit line 215 based on (e.g., using) the received column address.
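The selection of a memory cell by a row decoder and a column decoder may be sketched as follows. The function names decode_one_hot and select_cell are hypothetical, and modeling each decoder as a one-hot decoder is an assumption made for illustration.

```python
def decode_one_hot(address, num_lines):
    """Return a list with a 1 for the activated line and 0 for all others."""
    if not 0 <= address < num_lines:
        raise ValueError("address out of range")
    return [1 if line == address else 0 for line in range(num_lines)]


def select_cell(row_address, column_address, num_word_lines, num_digit_lines):
    """Return the (word line, digit line) intersections that are selected."""
    word_lines = decode_one_hot(row_address, num_word_lines)
    digit_lines = decode_one_hot(column_address, num_digit_lines)
    return [
        (w, d)
        for w in range(num_word_lines)
        for d in range(num_digit_lines)
        if word_lines[w] and digit_lines[d]
    ]


if __name__ == "__main__":
    print(select_cell(row_address=2, column_address=1,
                      num_word_lines=4, num_digit_lines=4))
```

With one word line and one digit line activated, exactly one intersection is selected, which corresponds to the address of a single memory cell.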


Selecting or deselecting the memory cell 205 may be accomplished by activating or deactivating the switching component 235 using a word line 210. The capacitor 230 may be coupled with the digit line 215 using the switching component 235. For example, the capacitor 230 may be isolated from digit line 215 if the switching component 235 is deactivated, and the capacitor 230 may be coupled with digit line 215 if the switching component 235 is activated.


The sense component 245 may be operable to detect a state (e.g., a charge) stored on the capacitor 230 of the memory cell 205 and determine a logic state of the memory cell 205 based on (e.g., using) the stored state. The sense component 245 may include one or more sense amplifiers to amplify or otherwise convert a signal resulting from accessing the memory cell 205. The sense component 245 may compare a signal detected from the memory cell 205 to a reference 250 (e.g., a reference voltage). The detected logic state of the memory cell 205 may be provided as an output of the sense component 245 (e.g., to an input/output 255), and may indicate the detected logic state to another component of a memory device that includes the memory die 200.


The local memory controller 260 may control the accessing of memory cells 205 through the various components (e.g., row decoder 220, column decoder 225, sense component 245). The local memory controller 260 may be an example of the local memory controller 165 described with reference to FIG. 1. In some examples, one or more of the row decoder 220, column decoder 225, and sense component 245 may be co-located with the local memory controller 260. The local memory controller 260 may be operable to receive one or more of commands or data from one or more different memory controllers (e.g., an external memory controller 120 associated with a host device 105, another controller associated with the memory die 200), translate the commands or the data (or both) into information that can be used by the memory die 200, perform one or more operations on the memory die 200, and communicate data from the memory die 200 to a host device 105 based on (e.g., in response to) performing the one or more operations. The local memory controller 260 may generate row signals and column address signals to activate the target word line 210 and the target digit line 215. The local memory controller 260 may also generate and control various voltages or currents used during the operation of the memory die 200. In general, the amplitude, the shape, or the duration of an applied voltage or current discussed herein may be varied and may be different for the various operations discussed in operating the memory die 200.


The local memory controller 260 may be operable to perform one or more access operations on one or more memory cells 205 of the memory die 200. Examples of access operations may include a write operation, a read operation, a refresh operation, a precharge operation, or an activate operation, among others. In some examples, access operations may be performed by or otherwise coordinated by the local memory controller 260 in response to various access commands (e.g., from a host device 105). The local memory controller 260 may be operable to perform other access operations not listed here or other operations related to the operating of the memory die 200 that are not directly related to accessing the memory cells 205.


The local memory controller 260 may be operable to perform a write operation (e.g., a programming operation) on one or more memory cells 205 of the memory die 200. During a write operation, a memory cell 205 of the memory die 200 may be programmed to store a desired logic state. The local memory controller 260 may identify a target memory cell 205 on which to perform the write operation. The local memory controller 260 may identify a target word line 210 and a target digit line 215 coupled with the target memory cell 205 (e.g., the address of the target memory cell 205). The local memory controller 260 may activate the target word line 210 and the target digit line 215 (e.g., applying a voltage to the word line 210 or digit line 215) to access the target memory cell 205. The local memory controller 260 may apply a specific signal (e.g., write pulse) to the digit line 215 during the write operation to store a specific state (e.g., charge) in the capacitor 230 of the memory cell 205. The pulse used as part of the write operation may include one or more voltage levels over a duration.


The local memory controller 260 may be operable to perform a read operation (e.g., a sense operation) on one or more memory cells 205 of the memory die 200. During a read operation, the logic state stored in a memory cell 205 of the memory die 200 may be determined. The local memory controller 260 may identify a target memory cell 205 on which to perform the read operation. The local memory controller 260 may identify a target word line 210 and a target digit line 215 coupled with the target memory cell 205 (e.g., the address of the target memory cell 205). The local memory controller 260 may activate the target word line 210 and the target digit line 215 (e.g., applying a voltage to the word line 210 or digit line 215) to access the target memory cell 205. The target memory cell 205 may transfer a signal to the sense component 245 in response to biasing the access lines. The sense component 245 may amplify the signal. The local memory controller 260 may activate the sense component 245 (e.g., latch the sense component) and thereby compare the signal received from the memory cell 205 to the reference 250. Based on (e.g., using) that comparison, the sense component 245 may determine a logic state that is stored on the memory cell 205.


In some examples, a memory die 200 may be included as part of an automotive or other system that is safety sensitive, stability sensitive, or both. Errors, such as SBEs, may be introduced into the data stored in the memory die 200 from electromagnetic radiation, high-energy particles (e.g., from cosmic rays), memory cell wear and age, or a combination thereof, among other examples. If a set of data with an SBE develops one or more additional errors, resulting in a DBE or an MBE, error correction operations such as SEC and SECDED may not be able to correct the errors. Thus, it may be advantageous to correct SBEs relatively quickly to mitigate the likelihood of developing DBEs.


In some cases, the memory die 200 may include an ECC block 275 (e.g., an on-die ECC) to perform error correction operations on data stored in the memory die 200, which may include error detection operations or capabilities. The ECC block 275 may perform error correction operations on data during access operations (i.e., read or write operations). However, some regions of data in the memory die 200 may be accessed more frequently than other regions (i.e., some regions of data may be “hot”, while other regions may be “cold”). Because cold regions of the memory die 200 are accessed relatively infrequently, the cold regions may be more susceptible to developing DBEs or MBEs.


In some examples, the memory die 200 may perform error correction as part of a refresh operation (e.g., in the alternative or in addition to performing error correction as part of an access operation, such as a read or write operation). For example, the memory die 200 may be configured to support a refresh command, as well as a refresh with ECC command. Additionally or alternatively, the memory die 200 may support performing the refresh operation with ECC as part of a self-refresh mode. The refresh operation may cause a region of the memory die 200 (e.g., a row of memory cells 205) to be accessed and written back to the region. Alternatively, a refresh operation with ECC may cause the region of the memory array to be accessed and subsequently may cause the ECC block 275 to perform an error correction operation on the region or a portion of the region. A memory die 200 that supports the refresh with ECC operation may reduce system latency and power consumption by preventing cold regions from developing DBEs or MBEs without consuming bandwidth resources used for communicating data between a host system and a memory device.


To perform error correction on a set of data, the ECC block 275 may be configured to generate, using a code or algorithm, a first set of one or more parity bits associated with the set of data. The first parity bits may be compared with a second set of parity bits which were generated, for example as part of or otherwise in connection with previously writing the set of data, using the same code or algorithm. If no errors have been introduced in the set of data, then the first parity bits and the second parity bits may match. Thus, the ECC block 275 may be configured to determine whether the set of data contains a data error by comparing the first parity bits with the second parity bits. In some cases, the ECC block 275 may be configured to correct SBEs detected during the error correction procedure, though ECC schemes supporting detection or correction of other quantities of errors in a set of data may alternatively be implemented by ECC block 275.
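

The parity comparison described above can be sketched as follows. This is a minimal illustration only: the disclosure does not name a particular code, so a Hamming(7,4) code is assumed here, and the names `generate_parity` and `check_and_correct` are hypothetical. The freshly generated parity bits are XORed with the previously stored parity bits; a zero result indicates a match, and a non-zero result locates a single flipped bit.

```python
# Assumption: a Hamming(7,4) code, with codeword positions numbered 1..7.
# Parity bits occupy positions 1, 2 and 4; data bits occupy 3, 5, 6 and 7.
DATA_POSITIONS = (3, 5, 6, 7)


def generate_parity(data):
    """Return the 3 parity bits (packed into an int) for 4 data bits."""
    parity = 0
    for bit, pos in zip(data, DATA_POSITIONS):
        if bit:
            # A set data bit toggles every parity bit that covers its position.
            parity ^= pos
    return parity


def check_and_correct(data, stored_parity):
    """Compare fresh parity with stored parity and fix a single-bit error.

    Returns (data_after_correction, error_detected).
    """
    syndrome = generate_parity(data) ^ stored_parity
    corrected = list(data)
    if syndrome in DATA_POSITIONS:
        # The syndrome equals the position of the flipped data bit.
        corrected[DATA_POSITIONS.index(syndrome)] ^= 1
    # A syndrome of 1, 2 or 4 points at a stored parity bit; data is unchanged.
    return corrected, syndrome != 0


data = [1, 0, 1, 1]
stored = generate_parity(data)          # parity saved when the data was written
flipped = [1, 0, 0, 1]                  # the same data with one bit corrupted
print(check_and_correct(flipped, stored))
```

In this sketch a second flipped bit in the same four data bits would produce a misleading syndrome, which is consistent with the motivation for correcting SBEs before additional errors accumulate.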



FIG. 3 illustrates an example of a system 300 that supports techniques for memory error correction in accordance with examples as disclosed herein. The system 300 may include a column decoder 225-a, a row decoder 220-a, an input/output 255-a, a sense component 245-a, and an ECC block 275-a, which may be examples of the corresponding devices described with reference to FIG. 2.


The system 300 may also include a memory controller 301, which may include aspects of a device memory controller or a local memory controller described with reference to FIGS. 1 and 2, and a memory array 303, which may include rows and columns of memory cells. The ECC block 275-a may perform error correction, such as an SEC operation, on portions of the memory array 303. That is, the ECC block 275-a may check a first portion of a row for data errors in connection with a refresh operation, and refrain from checking a second portion of the row for data errors in connection with the refresh operation. The memory controller 301 may include a controller logic component 302 configured to receive or process commands, such as refresh commands, from a host device. The commands may be decoded by a command/address (C/A) decode component 320. For example, the C/A decode component 320 may be configured to determine whether a command is a refresh command or a refresh with ECC command. The memory controller 301 may also include a row multiplexer (MUX) 321 and a column MUX 322, which may be configured to issue row and column addresses to the row decoder 220-a and the column decoder 225-a as part of, for example, a refresh operation. The refresh operation may include accessing a row of the memory array 303 and refreshing the data stored in the row (e.g., writing the data stored in the row back to the row).


The memory controller 301 may include a refresh counter 305, which may be configured to track and store a value associated with a quantity of refresh operations performed at the memory array 303 (e.g., since a most recent reset of the refresh counter 305). The refresh counter 305 may indicate a row or set of rows to be refreshed to the row MUX 321, which may in turn indicate the row or set of rows to be refreshed to the row decoder 220-a and the memory array 303. For example, upon receiving a refresh indication 330 from the controller logic component 302, the refresh counter 305 may issue to the row MUX 321 an indication 311 of the row or set of rows to be refreshed based on (e.g., using) the value of the refresh counter 305, and the value of the refresh counter 305 may be incremented. In some cases, the value of the refresh counter 305 may be reset (e.g., reset to zero) if the incremented value thereof would exceed the quantity of rows in the memory array 303 or the counter otherwise reaches a threshold value or rolls over. That is, for example, after refreshing each row of the memory array 303, the refresh counter 305 may be reset (e.g., to an initial value). Thus, upon receiving a quantity of refresh indications 330 equal to the quantity of rows of the memory array 303 (or set of rows for refresh purposes), each row of the memory array 303 may be refreshed.
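

A minimal software sketch of the refresh counter behavior described above is given below. The class name `RefreshCounter` and the method `on_refresh_indication` are hypothetical, and the sketch assumes one row per refresh indication with a reset value of zero.

```python
class RefreshCounter:
    """Toy model of a refresh counter that walks the rows of an array."""

    def __init__(self, num_rows):
        self.num_rows = num_rows
        self.value = 0  # assumed initial (reset) value

    def on_refresh_indication(self):
        """Return (row_to_refresh, was_reset) for one refresh indication."""
        row = self.value          # the current value selects the row
        self.value += 1           # then the value is incremented
        was_reset = self.value >= self.num_rows
        if was_reset:
            self.value = 0        # roll over after the last row
        return row, was_reset


counter = RefreshCounter(num_rows=4)
print([counter.on_refresh_indication() for _ in range(5)])
```

After a quantity of indications equal to the quantity of rows, every row has been selected once and the counter is back at its initial value.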


The memory controller may also include an ECC patrol block 310. The ECC patrol block 310 may include a counter 315, which may indicate a portion of a row on which error correction is to be performed. For example, the counter 315 may indicate an address of one or more logical columns of the row on which error correction is to be performed. The quantity of logical columns included in the portion of the row (i.e., a granularity with which the row is divided into portions for ECC patrol purposes, which may correspond to how many portions into which the row is divided) may be configured using a command, through firmware, or through user input, among other examples. In some examples, the portion of a row may correspond to a quantity of columns from which a burst of data is to be read, and may be referred to as a pre-fetch unit. In other examples, the portion of the row may include the entire row of the memory array 303, or any quantity of logical columns of the memory array 303. The ECC patrol block 310 may issue an indication of the portion of the row to the column MUX 322 and the ECC block 275-a. The column MUX 322 may, in response to the indication, issue the indication to the column decoder 225-a, which may issue the indication to the input/output 255-a, where the input/output 255-a may be configured to transfer the address of the portion of the row to the ECC block 275-a. The counter 315 may be reset upon reaching a threshold—e.g., once a value of the counter 315 corresponds to a last row portion (e.g., last set of columns) within the memory array 303, a next incrementing of the counter 315 may cause the value of the counter to reset (e.g., roll over).


The ECC block 275-a may perform error correction on the portion of the row, and issue the results (e.g., an indication of any bits corrected as a result of the error correction) to the memory array as part of the refresh operation (i.e., via the input/output 255-a and the sense component 245-a). That is, the ECC block 275-a may determine whether the portion of the row includes a data error (e.g., an SBE) and, in some examples, correct the data error in the portion of the row. To identify and correct errors in a portion of the row, the ECC block 275-a may generate one or more parity bits for the portion of the row and compare the parity bits with parity bits corresponding to the portion of the row that have been previously stored.


The counter 315 may store and increment a value indicating the portion of the row upon which the ECC block 275-a is to perform error correction (i.e., an address counter or column counter). For example, the memory cells of each row of the memory array 303 may be grouped into a quantity of units (e.g., portions of the row, such as pre-fetch units), which may be indexed, where the indices correspond to possible values of the counter 315.
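

As a hedged illustration of the indexing described above, the sketch below maps a counter value to the logical columns of one portion. The function names and the example sizes (1024 columns per row, 128 columns per portion) are assumptions chosen for illustration, as is the requirement that a row divide evenly into portions.

```python
def portion_columns(counter_value, columns_per_row, columns_per_portion):
    """Return the logical columns covered by the portion at counter_value."""
    if columns_per_row % columns_per_portion:
        raise ValueError("row must divide evenly into portions (assumption)")
    num_portions = columns_per_row // columns_per_portion
    if not 0 <= counter_value < num_portions:
        raise ValueError("counter value does not index a portion of the row")
    start = counter_value * columns_per_portion
    return range(start, start + columns_per_portion)


def next_counter_value(counter_value, columns_per_row, columns_per_portion):
    """Increment the counter, rolling over after the last portion."""
    return (counter_value + 1) % (columns_per_row // columns_per_portion)


cols = portion_columns(3, columns_per_row=1024, columns_per_portion=128)
print(cols.start, cols.stop - 1)
```

Changing `columns_per_portion` changes the granularity: a value equal to `columns_per_row` makes the portion the entire row.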


In some examples, upon receiving a refresh with ECC indication 335, the ECC patrol block 310 may also receive an indication 311 of the value of the refresh counter 305, and the ECC patrol block 310 may be configured to increment the counter 315 based on the value of the refresh counter 305 (e.g., based on the refresh counter 305 being reset). Additionally or alternatively, the ECC patrol block 310 may be configured to increment the counter 315 based on the quantity of refresh with ECC indications 335 received. For example, the ECC patrol block 310 may include an error correction counter 316, which may be configured to be incremented each time a refresh with ECC indication 335 is received, and the ECC patrol block 310 may be configured to increment the counter 315 based on the value of the error correction counter 316 (e.g., based on the error correction counter 316 being reset).


The quantity of refresh with ECC indications 335 issued per refresh indication 330 may be managed by one or both of the host device and the system 300. For example, a refresh with ECC indication 335 may be issued once per period, where the period may represent a quantity of refresh indications 330. In some cases, the period may be a quantity p of refresh cycles, where a refresh cycle may be the quantity of refresh operations used to refresh each row of the memory array 303. Thus, for example, p multiplied by the quantity of refresh operations in a refresh cycle (e.g., p multiplied by a quantity of refresh operations used to refresh each row of the memory array 303 one time) may correspond to (e.g., equal) a quantity of regular refresh operations performed in between each successive refresh with ECC operation. In some other cases, the period may be a quantity p of refresh indications 330 (e.g., for every two refresh indications 330, one refresh with ECC indication 335 may be issued). Thus, a period may in some cases be a fraction of a refresh cycle.
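

The periodicity described above can be sketched as a simple schedule. This is one possible reading, under the assumption that a fixed quantity of regular refresh indications separates successive refresh with ECC indications; the function names and the figure of 8192 refresh operations per cycle are hypothetical.

```python
def regular_refreshes_per_period(p, refreshes_per_cycle):
    """Regular refreshes between ECC refreshes when p counts refresh cycles."""
    return p * refreshes_per_cycle


def refresh_schedule(total, regular_between_ecc):
    """Yield the type of each of `total` successive refresh indications."""
    since_ecc = 0  # regular refreshes issued since the last ECC refresh
    for _ in range(total):
        if since_ecc == regular_between_ecc:
            since_ecc = 0
            yield "refresh_with_ecc"
        else:
            since_ecc += 1
            yield "refresh"


# p expressed in refresh cycles (assumed 8192 refresh operations per cycle).
print(regular_refreshes_per_period(2, 8192))
# p expressed directly in refresh indications: two regular, then one with ECC.
print(list(refresh_schedule(6, 2)))
```

With `regular_between_ecc` set to zero, every indication in this sketch is a refresh with ECC indication.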


In some cases, the host device may send refresh commands to the system 300, for example as part of an auto-refresh mode. In such cases, the host device may include a refresh handler, which may be configured to manually or automatically adjust the quantity p. Additionally or alternatively, the system 300 may operate using a self-refresh mode, in which the system 300 performs refresh operations without receiving a refresh command from the host device. If operating in a self-refresh mode, the system 300 may include a mode register 319 used to store the quantity p, and may determine the period during a self-refresh operation. In some cases, such as upon a self-refresh entry or self-refresh exit, one refresh with ECC indication 335 may be issued. That is, a refresh with ECC indication 335 may be issued by the system 300 or by a refresh handler within a host device upon entering the self-refresh mode and upon exiting the self-refresh mode.


In some examples, the ECC patrol block 310 may increment the value of the counter 315 in response to the refresh counter 305 being reset (i.e., reset to zero) or otherwise reaching some threshold. If a refresh with ECC is performed on each row of the memory array 303 successively (e.g., if p is zero), then resetting the refresh counter 305 in such fashion may cause the counter 315 to be reset upon performing the error correction operation on the last (e.g., end) portion of the final row. That is, if the value of the counter 315 corresponds to the last (e.g., end) portion of the row and the refresh counter 305 is then reset, the counter 315 may be reset, and where p is zero, this may mean that refresh with ECC has most recently been performed on the last portion of the final row.


If, however, some quantity of regular refresh operations are performed between successive refresh with ECC operations (e.g., p has a non-zero value), then incrementing the counter 315 in response to the refresh counter 305 being reset (i.e., set to zero) or otherwise reaching some threshold may cause the counter 315 to be reset upon performing the error correction operation on the last (e.g., end) portion of any row. For example, if two regular refresh operations are performed between successive refresh with ECC operations, a refresh with ECC operation may be performed on the end portion of one of the two rows preceding the final row, then the refresh counter 305 may be reset based on a regular refresh operation corresponding to the final row, and hence a next refresh with ECC operation may be performed on a first portion of the first or second row of the memory array 303.


In some examples, the counter 315 may be incremented in response to refresh with ECC being performed on a certain portion (e.g., corresponding to a particular portion index) of all rows of the memory array 303 (e.g., once refresh with ECC has been performed on the first portion of each row of the memory array, the counter 315 may be incremented, and then once refresh with ECC has been performed on the second portion of each row of the memory array, the counter 315 may again be incremented, and so on). For example, if the counter 315 is incremented in response to the error correction counter 316 being reset, then the counter 315 may not increment until refresh with ECC has been performed on a certain portion (e.g., a group of columns) across all rows of the memory array 303. If the value of p is zero, this may cause refresh with ECC to be performed sequentially across all rows of the memory array 303 within each successive portion. If, however, the value of p is non-zero, this may cause refresh with ECC to be performed across the rows of the memory array 303 within a given portion in non-sequential fashion (e.g., the row being refreshed may change based on the separate incrementing of the refresh counter 305, but the value of the counter 315, and hence the portion subject to refresh with ECC, may not change until the error correction counter 316 resets).


By incrementing the value of the refresh counter 305 and the counter 315 (and the error correction counter 316, if present) as described herein, the ECC block 275-a may perform error correction on each portion of each row of the memory array 303 over time.
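

The interaction of the counters can be explored with a small simulation. The sketch below is a toy model under stated assumptions, not a description of any actual device: the function `simulate`, the policy names, the reset thresholds (the quantity of rows for both the refresh counter and the error correction counter), and the tiny array of 4 rows by 2 portions are all hypothetical.

```python
def simulate(num_rows, num_portions, gap, policy, num_refreshes):
    """Return the set of (row, portion) pairs checked by refresh with ECC.

    gap    -- regular refreshes between successive refreshes with ECC
    policy -- "on_refresh_reset": advance the portion when the refresh
              counter resets; "on_ecc_counter_reset": advance it when the
              error correction counter resets
    """
    refresh_counter = 0   # selects the row (models refresh counter 305)
    column_counter = 0    # selects the portion (models counter 315)
    ecc_counter = 0       # counts ECC refreshes (models counter 316)
    since_ecc = 0
    checked = set()
    for _ in range(num_refreshes):
        row = refresh_counter
        refresh_counter += 1
        refresh_reset = refresh_counter >= num_rows
        if refresh_reset:
            refresh_counter = 0
        with_ecc = since_ecc == gap
        since_ecc = 0 if with_ecc else since_ecc + 1
        if with_ecc:
            checked.add((row, column_counter))
            ecc_counter += 1
        if policy == "on_refresh_reset":
            advance = refresh_reset
        else:
            advance = with_ecc and ecc_counter >= num_rows
            if advance:
                ecc_counter = 0
        if advance:
            column_counter = (column_counter + 1) % num_portions
    return checked


every_pair = {(r, c) for r in range(4) for c in range(2)}
for policy in ("on_refresh_reset", "on_ecc_counter_reset"):
    print(policy, simulate(4, 2, 0, policy, 8) == every_pair)
```

With a gap of zero, both policies in this toy model reach all eight (row, portion) pairs within eight refreshes. For a non-zero gap, which pairs are reached in the model depends on how the gap relates to the quantity of rows and portions, so other parameter choices are worth checking individually.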



FIG. 4 illustrates an example of a process flow 400 that supports techniques for memory error correction in accordance with examples as disclosed herein. The process flow 400 may be performed by components of a memory system, such as a controller (e.g., a memory controller 301 as described with reference to FIG. 3), which may include an ECC patrol block (e.g., the ECC patrol block 310). Additionally or alternatively, aspects of the process flow 400 may be implemented as instructions stored in memory (e.g., firmware stored in a memory coupled with a device memory controller 155 or a local memory controller 165 described with reference to FIG. 1). For example, the instructions, if executed by a controller (e.g., a device memory controller 155 or a local memory controller 165), may cause the controller to perform the operations of the process flow 400. In the following description of process flow 400, the operations may be performed in a different order than the order shown. Specific operations may also be left out of process flow 400, or other operations may be added to process flow 400.


At 405, a refresh operation may be identified. For example, the memory controller may receive an external refresh command from a host device to refresh a row of a memory array. Additionally or alternatively, the memory controller may be operating in a self-refresh mode, and the memory controller may be configured to issue refresh indications for the memory array.


In some cases, the memory system may identify a refresh with ECC operation as part of identifying the refresh operation. For example, a refresh with ECC indication may be issued, either based on a refresh with ECC command received from the host device or based on a mode register at the memory controller. In some examples, the refresh with ECC indication may be issued based on a periodicity, as described with reference to FIG. 3. Thus, at 410, it may be determined whether the refresh operation identified at 405 is a refresh with ECC operation. For example, the memory controller may determine whether the refresh operation is associated with a refresh command or a refresh with ECC command received from the host device.


In some cases, it may be determined that the refresh operation is a refresh with ECC operation. In such cases, at 415, a refresh counter (e.g., the refresh counter 305 as described with reference to FIG. 3) may be incremented. For example, in response to identifying the refresh operation at 405, the memory controller may increment the value of the refresh counter to indicate a row of the memory array to be refreshed, as described with reference to FIG. 3.


At 420, it may be determined whether to reset the refresh counter. For example, the value of the refresh counter may exceed the quantity of rows of the memory array, indicating that the refresh operation identified at 405 corresponds to a first row of the memory array (i.e., a previous refresh operation may have refreshed the last row of the memory array). Thus, by comparing the value of the refresh counter to a threshold, such as the quantity of rows of the memory device, the value of the refresh counter may be reset at 425. That is, in response to determining that the refresh counter exceeds the threshold, the memory controller may reset the refresh counter.


Optionally, at 430, it may be determined whether to reset an error correction counter (e.g., the error correction counter 316 of the ECC patrol block 310 as described with reference to FIG. 3) at the ECC patrol block. For example, if the error correction counter exceeds a threshold, such as the quantity of rows in the memory array, the ECC patrol block may reset (i.e., set to zero) the error correction counter at 435.


At 440, an address counter (e.g., the counter 315 of the ECC patrol block 310 as described with reference to FIG. 3) may be incremented. For example, the memory controller may increment the address counter in response to the refresh counter being reset at 425. The address counter may identify a portion of the row of memory cells, for example as described with reference to FIG. 3.


At 445, a row may be accessed. For example, the memory controller may access a row indicated by the value of the refresh counter of the memory controller. Using the value of the address counter and, in some cases, the error correction counter, the memory controller may issue an indication to an ECC block (e.g., the ECC block 275-a as described with reference to FIG. 3) of a portion of the accessed row on which to perform an error correction operation.


At 450, ECC may be performed on a portion of the row accessed at 445. For example, the ECC block 275-a may perform error correction, such as an SEC operation, on the portion of the row indicated by the refresh counter and the address counter. The error correction may include generating one or more parity bits for the portion of the row and comparing the parity bits with parity bits corresponding to the portion of the row that have been previously stored. That is, the ECC block may check a first portion of a row for data errors in connection with a refresh operation, and refrain from checking a second portion of the row for data errors in connection with the refresh operation.


Additionally or alternatively, it may be determined at 410 that the refresh operation is not a refresh with ECC operation. In such cases, at 455, a row may be accessed. For example, the memory controller may access a row indicated by the value of the refresh counter of the memory controller and write back data of the row (e.g., as part of a refresh operation at 460).


At 460, the row may be refreshed. For example, the memory controller may refresh the row of the memory array indicated by the refresh counter by writing back the data of the row after accessing the row. In some cases (i.e., if error correction has been performed), writing back the data may include writing back the data that has been corrected as part of the error correction procedure at 450.
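

The access, error correction, and write-back steps at 445 through 460 can be sketched as follows. This is an illustrative model only: a Hamming(7,4) code over 4-bit portions is assumed because the disclosure does not name a code, and `parity_bits`, `refresh_row`, and the dictionary-based array are hypothetical.

```python
# Assumption: each portion is 4 data bits protected by Hamming(7,4) parity.
DATA_POSITIONS = (3, 5, 6, 7)  # data-bit positions within the codeword


def parity_bits(chunk):
    """Return the 3 parity bits (packed into an int) for a 4-bit portion."""
    parity = 0
    for bit, pos in zip(chunk, DATA_POSITIONS):
        if bit:
            parity ^= pos
    return parity


def refresh_row(array, stored_parity, row, ecc_portion=None):
    """Refresh one row; optionally run error correction on one portion."""
    data = list(array[row])                       # 445 / 455: access the row
    if ecc_portion is not None:                   # 450: ECC on that portion only
        lo = 4 * ecc_portion
        fresh = parity_bits(data[lo:lo + 4])
        syndrome = fresh ^ stored_parity[row][ecc_portion]
        if syndrome in DATA_POSITIONS:            # a single data bit flipped
            data[lo + DATA_POSITIONS.index(syndrome)] ^= 1
    array[row] = data                             # 460: write the data back
    return data


good = [1, 0, 1, 1, 0, 1, 1, 0]
stored_parity = {0: [parity_bits(good[0:4]), parity_bits(good[4:8])]}
array = {0: [1, 0, 0, 1, 0, 1, 1, 1]}             # one flipped bit per portion

refresh_row(array, stored_parity, 0, ecc_portion=0)
print(array[0])  # portion 0 is corrected; portion 1 is left unchecked
```

The usage lines show the first portion being checked while the second portion is not, so the error in the second portion remains until a later refresh with ECC selects that portion.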


Aspects of the process flow 400 may be implemented by a controller, among other components. Additionally or alternatively, aspects of the process flow 400 may be implemented as instructions stored in memory (e.g., firmware stored in a memory coupled with a memory system). For example, the instructions, executed by a controller (e.g., an external memory controller 120, a device memory controller 155, a local memory controller 260, or a combination thereof), may cause the controller to perform the operations of the process flow 400.



FIG. 5 shows a block diagram 500 of a memory device 520 that supports techniques for memory error correction in accordance with examples as disclosed herein. The memory device 520 may be an example of aspects of a memory device as described with reference to FIGS. 1 through 4. The memory device 520, or various components thereof, may be an example of means for performing various aspects of techniques for memory error correction as described herein. For example, the memory device 520 may include a command manager 525, a row access component 530, an error correction manager 535, a counter manager 540, a period manager 545, an address manager 550, or any combination thereof. Each of these components may communicate, directly or indirectly, with one another (e.g., via one or more buses).


The command manager 525 may be configured as or otherwise support a means for identifying, at a memory system, a refresh operation for a row of memory cells within a memory array. The row access component 530 may be configured as or otherwise support a means for accessing the row of memory cells within the memory array in response to identifying the refresh operation. The error correction manager 535 may be configured as or otherwise support a means for determining whether the row includes a data error based at least in part on accessing the row in response to identifying the refresh operation. In some examples, the error correction manager 535 may be configured as or otherwise support a means for correcting the data error using an error correction procedure based at least in part on determining that the row includes the data error.


In some examples, the counter manager 540 may be configured as or otherwise support a means for incrementing a value of a refresh counter in response to identifying the refresh operation, where accessing the row of memory cells is based at least in part on the value of the refresh counter.


In some examples, the refresh counter is configured to be reset to an initial value based at least in part on the value of the refresh counter satisfying a threshold, and the counter manager 540 may be configured as or otherwise support a means for incrementing a value of an address counter based at least in part on the value of the refresh counter being reset to the initial value. In some examples, the refresh counter is configured to be reset to an initial value based at least in part on the value of the refresh counter satisfying a threshold, and the address manager 550 may be configured as or otherwise support a means for accessing at least a portion of data in the row based at least in part on the value of the address counter, where determining whether the row includes the data error includes determining whether at least the portion of data in the row includes the data error based at least in part on accessing at least the portion of data.


In some examples, the counter manager 540 may be configured as or otherwise support a means for incrementing a value of an error correction counter in response to identifying the refresh operation, where the error correction counter is configured to be reset to an initial value based at least in part on the value of the error correction counter satisfying a threshold. In some examples, the counter manager 540 may be configured as or otherwise support a means for incrementing a value of an address counter based at least in part on the value of the error correction counter being reset to the initial value. In some examples, the row access component 530 may be configured as or otherwise support a means for accessing at least a portion of data in the row based at least in part on the value of the address counter, where determining whether the row includes the data error includes determining whether at least the portion of data in the row includes the data error based at least in part on accessing at least the portion of data.


In some examples, the command manager 525 may be configured as or otherwise support a means for receiving a first refresh command, where identifying the refresh operation is based at least in part on receiving the first refresh command. In some examples, the command manager 525 may be configured as or otherwise support a means for identifying the first refresh command as a first type of refresh command, where determining whether the row includes the data error is in response to identifying the first refresh command as the first type of refresh command.


In some examples, the command manager 525 may be configured as or otherwise support a means for receiving a second refresh command. In some examples, the command manager 525 may be configured as or otherwise support a means for identifying the second refresh command as a second type of refresh command different than the first type of refresh command. In some examples, the error correction manager 535 may be configured as or otherwise support a means for refraining from performing a second error detection procedure in response to the second refresh command based at least in part on identifying the second refresh command as the second type of refresh command.


In some examples, the period manager 545 may be configured as or otherwise support a means for identifying a periodicity associated with checking for data errors in connection with refresh operations, the periodicity corresponding to a quantity of intervening refresh operations without error detection between refresh operations with error detection, where determining whether the row includes the data error is based at least in part on the periodicity.


In some examples, the period manager 545 may be configured as or otherwise support a means for identifying the periodicity based at least in part on a value stored at a memory device.


In some examples, the error correction manager 535 may be configured as or otherwise support a means for determining whether the row includes the data error in response to identifying the refresh operation based at least in part on the refresh operation being an initial refresh operation of a set of self-refresh operations, a final refresh operation of the set of self-refresh operations, an initial refresh operation of a set of commanded refresh operations, or a final refresh operation of the set of commanded refresh operations.


In some examples, the address manager 550 may be configured as or otherwise support a means for determining an address associated with a portion of data in the row based at least in part on a value of an address counter, where accessing the row of memory cells includes accessing the portion of data. In some examples, the error correction manager 535 may be configured as or otherwise support a means for generating one or more parity bits for the portion of data based at least in part on accessing the portion of data. In some examples, the error correction manager 535 may be configured as or otherwise support a means for comparing the one or more generated parity bits for the portion of data with one or more parity bits previously stored for the portion of data, where determining whether the row includes the data error is based at least in part on the comparing.


In some examples, the row access component 530 may be configured as or otherwise support a means for refreshing the row of memory cells as part of the refresh operation. In some examples, to support determining whether the row includes the data error, the error correction manager 535 may be configured as or otherwise support a means for checking a first portion of the row for data errors in connection with the refresh operation. In some examples, the error correction manager 535 may be configured as or otherwise support a means for refraining from checking a second portion of the row for data errors in connection with the refresh operation.


In some examples, the row access component 530 may be configured as or otherwise support a means for identifying a size of the first portion of the row based at least in part on a value stored at a memory device.


In some examples, to support determining whether the row includes the data error, the error correction manager 535 may be configured as or otherwise support a means for performing a single error correction (SEC) procedure for at least a portion of data stored in the row.



FIG. 6 shows a flowchart illustrating a method 600 that supports techniques for memory error correction in accordance with examples as disclosed herein. The operations of method 600 may be implemented by a memory device or its components as described herein. For example, the operations of method 600 may be performed by a memory device as described with reference to FIGS. 1 through 5. In some examples, a memory device may execute a set of instructions to control the functional elements of the device to perform the described functions. Additionally or alternatively, the memory device may perform aspects of the described functions using special-purpose hardware.


At 605, the method may include identifying, at a memory system, a refresh operation for a row of memory cells within a memory array. The operations of 605 may be performed in accordance with examples as disclosed herein. In some examples, aspects of the operations of 605 may be performed by a command manager 525 as described with reference to FIG. 5.


At 610, the method may include accessing the row of memory cells within the memory array in response to identifying the refresh operation. The operations of 610 may be performed in accordance with examples as disclosed herein. In some examples, aspects of the operations of 610 may be performed by a row access component 530 as described with reference to FIG. 5.


At 615, the method may include determining whether the row includes a data error based at least in part on accessing the row in response to identifying the refresh operation. The operations of 615 may be performed in accordance with examples as disclosed herein. In some examples, aspects of the operations of 615 may be performed by an error correction manager 535 as described with reference to FIG. 5.


At 620, the method may include correcting the data error using an error correction procedure based at least in part on determining that the row includes the data error. The operations of 620 may be performed in accordance with examples as disclosed herein. In some examples, aspects of the operations of 620 may be performed by an error correction manager 535 as described with reference to FIG. 5.
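The operations at 605 through 620 can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function names are hypothetical, the memory array is modeled as a list of bit lists, and a three-copy repetition code (each data bit stored as three adjacent copies, with row lengths that are multiples of three) stands in for whatever error correction code a memory device may actually use.

```python
from typing import List


def majority(bits: List[int]) -> int:
    """Return the value held by at least two of three stored copies."""
    return 1 if sum(bits) >= 2 else 0


def refresh_row_with_ecc(array: List[List[int]], row_index: int) -> bool:
    """Sketch of 605 through 620 for one row; returns True if a correction was made.

    Assumption: each data bit is stored as three adjacent copies.
    """
    # 605: calling this function stands in for identifying the refresh operation.
    row = array[row_index]  # 610: access the row
    corrected = False
    for start in range(0, len(row), 3):
        copies = row[start:start + 3]
        vote = majority(copies)
        if any(bit != vote for bit in copies):  # 615: data error detected
            row[start:start + 3] = [vote] * 3   # 620: correct the data error
            corrected = True
    return corrected
```

In this sketch a row holding a single flipped copy is repaired in place, while an error-free row is left unchanged and the function reports that no correction was needed.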


In some examples, an apparatus as described herein may perform a method or methods, such as the method 600. The apparatus may include features, circuitry, logic, means, or instructions (e.g., a non-transitory computer-readable medium storing instructions executable by a processor) for identifying, at a memory system, a refresh operation for a row of memory cells within a memory array, accessing the row of memory cells within the memory array in response to identifying the refresh operation, determining whether the row includes a data error based at least in part on accessing the row in response to identifying the refresh operation, and correcting the data error using an error correction procedure based at least in part on determining that the row includes the data error.


Some examples of the method 600 and the apparatus described herein may further include operations, features, circuitry, logic, means, or instructions for incrementing a value of a refresh counter in response to identifying the refresh operation, where accessing the row of memory cells may be based at least in part on the value of the refresh counter.


In some examples of the method 600 and the apparatus described herein, the refresh counter may be configured to be reset to an initial value based at least in part on the value of the refresh counter satisfying a threshold, and the method, apparatuses, and non-transitory computer-readable medium may include further operations, features, circuitry, logic, means, or instructions for incrementing a value of an address counter based at least in part on the value of the refresh counter being reset to the initial value and for accessing at least a portion of data in the row based at least in part on the value of the address counter, where determining whether the row includes the data error includes determining whether at least the portion of data in the row includes the data error based at least in part on accessing at least the portion of data.
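One way to picture the relationship between the refresh counter and the address counter is the cascade sketched below. The class name, the use of the row count as the threshold, and the wrap-around of the address counter are assumptions made for illustration; the description does not fix any of them.

```python
from typing import Tuple


class PatrolCounters:
    """Hypothetical model: the refresh counter selects the row, and the
    address counter selects which portion of that row is checked."""

    def __init__(self, num_rows: int, num_portions: int) -> None:
        self.num_rows = num_rows          # assumed threshold for the refresh counter
        self.num_portions = num_portions  # assumed wrap point for the address counter
        self.refresh_counter = 0
        self.address_counter = 0

    def on_refresh(self) -> Tuple[int, int]:
        """Return the (row, portion) for this refresh, then advance the counters."""
        target = (self.refresh_counter, self.address_counter)
        self.refresh_counter += 1
        if self.refresh_counter >= self.num_rows:  # threshold satisfied
            self.refresh_counter = 0               # reset to the initial value
            # The reset is the event that advances the address counter.
            self.address_counter = (self.address_counter + 1) % self.num_portions
        return target
```

With two rows and two portions, successive refresh operations in this sketch visit (0, 0), (1, 0), (0, 1), (1, 1) and then start over, so each full pass through the rows checks the next portion of every row.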


Some examples of the method 600 and the apparatus described herein may further include operations, features, circuitry, logic, means, or instructions for incrementing a value of an error correction counter in response to identifying the refresh operation, where the error correction counter may be configured to be reset to an initial value based at least in part on the value of the error correction counter satisfying a threshold, incrementing a value of an address counter based at least in part on the value of the error correction counter being reset to the initial value, and accessing at least a portion of data in the row based at least in part on the value of the address counter, where determining whether the row includes the data error includes determining whether at least the portion of data in the row includes the data error based at least in part on accessing at least the portion of data.
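The variant with a dedicated error correction counter can be sketched the same way. The difference illustrated here is that the threshold is chosen independently of how many rows the array has. The class name, the threshold value, and the wrap-around behavior are assumptions for illustration only.

```python
class EccPatrol:
    """Hypothetical model of an error correction counter that advances an
    address counter each time it reaches its own threshold."""

    def __init__(self, threshold: int, num_portions: int) -> None:
        self.threshold = threshold
        self.num_portions = num_portions
        self.ecc_counter = 0
        self.address_counter = 0

    def on_refresh(self) -> int:
        """Count one refresh operation and return the portion address to check."""
        self.ecc_counter += 1
        if self.ecc_counter >= self.threshold:  # threshold satisfied
            self.ecc_counter = 0                # reset to the initial value
            self.address_counter = (self.address_counter + 1) % self.num_portions
        return self.address_counter
```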


Some examples of the method 600 and the apparatus described herein may further include operations, features, circuitry, logic, means, or instructions for receiving a first refresh command, where identifying the refresh operation may be based at least in part on receiving the first refresh command, and identifying the first refresh command as a first type of refresh command, where determining whether the row includes the data error may be in response to identifying the first refresh command as the first type of refresh command.


Some examples of the method 600 and the apparatus described herein may further include operations, features, circuitry, logic, means, or instructions for receiving a second refresh command, identifying the second refresh command as a second type of refresh command different than the first type of refresh command, and refraining from performing a second error detection procedure in response to the second refresh command based at least in part on identifying the second refresh command as the second type of refresh command.
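The distinction between the two types of refresh command might be modeled as below. The enumeration members and the callback parameters are hypothetical names chosen for this sketch; the description does not specify how a command type is encoded.

```python
from enum import Enum
from typing import Callable


class RefreshCommand(Enum):
    REFRESH_WITH_ECC = "refresh_with_ecc"  # first type: refresh and check for errors
    REFRESH = "refresh"                    # second type: refresh only


def handle_refresh(
    command: RefreshCommand,
    refresh_row: Callable[[], None],
    check_row: Callable[[], None],
) -> bool:
    """Refresh the row; run error detection only for the first command type.

    Returns True if error detection was performed.
    """
    refresh_row()
    if command is RefreshCommand.REFRESH_WITH_ECC:
        check_row()
        return True
    return False  # second type: refrain from error detection
```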


Some examples of the method 600 and the apparatus described herein may further include operations, features, circuitry, logic, means, or instructions for identifying a periodicity associated with checking for data errors in connection with refresh operations, the periodicity corresponding to a quantity of intervening refresh operations without error detection between refresh operations with error detection, where determining whether the row includes the data error is based at least in part on the periodicity.


Some examples of the method 600 and the apparatus described herein may further include operations, features, circuitry, logic, means, or instructions for identifying the periodicity based at least in part on a value stored at a memory device.
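Because the periodicity is defined as the quantity of intervening refresh operations without error detection, a periodicity of N implies that one in every N + 1 refresh operations includes error detection. The short sketch below makes that arithmetic explicit. The function name and the choice to start the pattern with a checked refresh are assumptions, and the `periodicity` argument stands in for the value stored at the memory device.

```python
def should_check(refresh_index: int, periodicity: int) -> bool:
    """Return True if the refresh operation at refresh_index includes error detection.

    periodicity: number of refresh operations without error detection
    between two refresh operations with error detection.
    """
    return refresh_index % (periodicity + 1) == 0
```

For example, with a periodicity of 2, refresh operations 0, 3, and 6 would include error detection in this sketch, and the two operations between each pair would not.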


Some examples of the method 600 and the apparatus described herein may further include operations, features, circuitry, logic, means, or instructions for determining whether the row includes the data error in response to identifying the refresh operation based at least in part on the refresh operation being an initial refresh operation of a set of self-refresh operations, a final refresh operation of the set of self-refresh operations, an initial refresh operation of a set of commanded refresh operations, or a final refresh operation of the set of commanded refresh operations.


Some examples of the method 600 and the apparatus described herein may further include operations, features, circuitry, logic, means, or instructions for determining an address associated with a portion of data in the row based at least in part on a value of an address counter, where accessing the row of memory cells includes accessing the portion of data, generating one or more parity bits for the portion of data based at least in part on accessing the portion of data, and comparing the one or more generated parity bits for the portion of data with one or more parity bits previously stored for the portion of data, where determining whether the row includes the data error may be based at least in part on the comparing.
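The generate-and-compare step can be illustrated with a deliberately simple code. The sketch below assumes a single even-parity bit per portion and a fixed portion size, both of which are illustrative choices; an actual device may store several parity bits per portion, and all names here are hypothetical.

```python
from typing import List


def generate_parity(data_bits: List[int]) -> List[int]:
    """Generate parity for a portion (assumption: one even-parity bit)."""
    parity = 0
    for bit in data_bits:
        parity ^= bit
    return [parity]


def portion_has_error(
    row: List[int],
    stored_parity: List[List[int]],
    address: int,
    portion_size: int,
) -> bool:
    """Compare freshly generated parity with the parity previously stored
    for the portion selected by the address counter value."""
    start = address * portion_size
    portion = row[start:start + portion_size]
    return generate_parity(portion) != stored_parity[address]
```

A mismatch between the generated and stored parity indicates that the selected portion contains a data error; portions at other addresses are not examined by this call.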


Some examples of the method 600 and the apparatus described herein may further include operations, features, circuitry, logic, means, or instructions for refreshing the row of memory cells as part of the refresh operation, where determining whether the row includes the data error includes checking a first portion of the row for data errors in connection with the refresh operation, and refraining from checking a second portion of the row for data errors in connection with the refresh operation.


Some examples of the method 600 and the apparatus described herein may further include operations, features, circuitry, logic, means, or instructions for identifying a size of the first portion of the row based at least in part on a value stored at a memory device.
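The split between the checked first portion and the unchecked second portion can be sketched as index arithmetic. The function name and parameters are hypothetical; `first_portion_size` stands in for the value stored at the memory device, and `start` for the position selected by an address counter.

```python
from typing import List, Tuple


def split_row_for_check(
    row_length: int, start: int, first_portion_size: int
) -> Tuple[List[int], List[int]]:
    """Return (checked, skipped) bit positions for one refresh of a row.

    The whole row is refreshed; only the first portion is checked for errors.
    """
    end = min(start + first_portion_size, row_length)
    checked = list(range(start, end))
    skipped = [i for i in range(row_length) if i < start or i >= end]
    return checked, skipped
```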


In some examples of the method 600 and the apparatus described herein, operations, features, circuitry, logic, means, or instructions for determining whether the row includes the data error may include operations, features, circuitry, logic, means, or instructions for performing an SEC procedure for at least a portion of data stored in the row.
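As one concrete instance of a single error correction procedure, the sketch below uses a Hamming(7,4) code. The choice of this particular code and the function names are assumptions for illustration; the description does not state which SEC code is used or its word size.

```python
from typing import List, Tuple


def hamming74_encode(data: List[int]) -> List[int]:
    """Encode four data bits into a seven-bit codeword (positions 1..7,
    with parity bits at positions 1, 2, and 4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword: List[int]) -> Tuple[List[int], int]:
    """Return (corrected codeword, syndrome).

    The syndrome is the XOR of the 1-based positions of all set bits.
    It is 0 for a valid codeword and otherwise names the flipped position.
    """
    syndrome = 0
    for position, bit in enumerate(codeword, start=1):
        if bit:
            syndrome ^= position
    corrected = list(codeword)
    if syndrome:
        corrected[syndrome - 1] ^= 1  # flip the single erroneous bit
    return corrected, syndrome
```

A code of this kind corrects one flipped bit per codeword. If a second bit in the same codeword flips before the first is corrected, the syndrome points at the wrong position, which is why correcting errors during refresh, before additional bits are corrupted, is useful.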


It should be noted that the methods described herein describe possible implementations, and that the operations and the steps may be rearranged or otherwise modified and that other implementations are possible. Further, portions from two or more of the methods may be combined.


Information and signals described herein may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof. Some drawings may illustrate signals as a single signal; however, the signal may represent a bus of signals, where the bus may have a variety of bit widths.


The terms “electronic communication,” “conductive contact,” “connected,” and “coupled” may refer to a relationship between components that supports the flow of signals between the components. Components are considered in electronic communication with (or in conductive contact with or connected with or coupled with) one another if there is any conductive path between the components that can, at any time, support the flow of signals between the components. At any given time, the conductive path between components that are in electronic communication with each other (or in conductive contact with or connected with or coupled with) may be an open circuit or a closed circuit based on (e.g., in response to) the operation of the device that includes the connected components. The conductive path between connected components may be a direct conductive path between the components or the conductive path between connected components may be an indirect conductive path that may include intermediate components, such as switches, transistors, or other components. In some examples, the flow of signals between the connected components may be interrupted for a time, for example, using one or more intermediate components such as switches or transistors.


The term “coupling” refers to a condition of moving from an open-circuit relationship between components in which signals are not presently capable of being communicated between the components over a conductive path to a closed-circuit relationship between components in which signals are capable of being communicated between components over the conductive path. When a component, such as a controller, couples other components together, the component initiates a change that allows signals to flow between the other components over a conductive path that previously did not permit signals to flow.


The term “isolated” refers to a relationship between components in which signals are not presently capable of flowing between the components. Components are isolated from each other if there is an open circuit between them. For example, two components separated by a switch that is positioned between the components are isolated from each other when the switch is open. When a controller isolates two components, the controller effects a change that prevents signals from flowing between the components using a conductive path that previously permitted signals to flow.


The devices discussed herein, including a memory array, may be formed on a semiconductor substrate, such as silicon, germanium, silicon-germanium alloy, gallium arsenide, gallium nitride, etc. In some examples, the substrate is a semiconductor wafer. In other examples, the substrate may be a silicon-on-insulator (SOI) substrate, such as silicon-on-glass (SOG) or silicon-on-sapphire (SOS), or epitaxial layers of semiconductor materials on another substrate. The conductivity of the substrate, or sub-regions of the substrate, may be controlled through doping using various chemical species including, but not limited to, phosphorus, boron, or arsenic. Doping may be performed during the initial formation or growth of the substrate, by ion-implantation, or by any other doping means.


A switching component or a transistor discussed herein may represent a field-effect transistor (FET) and comprise a three-terminal device including a source, drain, and gate. The terminals may be connected to other electronic elements through conductive materials, e.g., metals. The source and drain may be conductive and may comprise a heavily-doped, e.g., degenerate, semiconductor region. The source and drain may be separated by a lightly-doped semiconductor region or channel. If the channel is n-type (i.e., majority carriers are electrons), then the FET may be referred to as an n-type FET. If the channel is p-type (i.e., majority carriers are holes), then the FET may be referred to as a p-type FET. The channel may be capped by an insulating gate oxide. The channel conductivity may be controlled by applying a voltage to the gate. For example, applying a positive voltage or negative voltage to an n-type FET or a p-type FET, respectively, may result in the channel becoming conductive. A transistor may be “on” or “activated” when a voltage greater than or equal to the transistor's threshold voltage is applied to the transistor gate. The transistor may be “off” or “deactivated” when a voltage less than the transistor's threshold voltage is applied to the transistor gate.


The description set forth herein, in connection with the appended drawings, describes example configurations and does not represent all the examples that may be implemented or that are within the scope of the claims. The term “exemplary” used herein means “serving as an example, instance, or illustration,” and not “preferred” or “advantageous over other examples.” The detailed description includes specific details to provide an understanding of the described techniques. These techniques, however, may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form to avoid obscuring the concepts of the described examples.


In the appended figures, similar components or features may have the same reference label. Further, various components of the same type may be distinguished by following the reference label by a dash and a second label that distinguishes among the similar components. If just the first reference label is used in the specification, the description is applicable to any one of the similar components having the same first reference label irrespective of the second reference label.


The functions described herein may be implemented in hardware, software executed by a processor, firmware, or any combination thereof. If implemented in software executed by a processor, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code. Other examples and implementations are within the scope of the disclosure and appended claims. For example, due to the nature of software, functions described herein can be implemented using software executed by a processor, hardware, firmware, hardwiring, or combinations of any of these. Features implementing functions may also be physically located at various positions, including being distributed such that portions of functions are implemented at different physical locations.


For example, the various illustrative blocks and modules described in connection with the disclosure herein may be implemented or performed with a general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices (e.g., a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration).


As used herein, including in the claims, “or” as used in a list of items (for example, a list of items prefaced by a phrase such as “at least one of” or “one or more of”) indicates an inclusive list such that, for example, a list of at least one of A, B, or C means A or B or C or AB or AC or BC or ABC (i.e., A and B and C). Also, as used herein, the phrase “based on” shall not be construed as a reference to a closed set of conditions. For example, an exemplary step that is described as “based on condition A” may be based on both a condition A and a condition B without departing from the scope of the present disclosure. In other words, as used herein, the phrase “based on” shall be construed in the same manner as the phrase “based at least in part on.”


Computer-readable media include both non-transitory computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A non-transitory storage medium may be any available medium that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, non-transitory computer-readable media can comprise RAM, ROM, electrically erasable programmable read-only memory (EEPROM), compact disk (CD) ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other non-transitory medium that can be used to carry or store desired program code means in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include CD, laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above are also included within the scope of computer-readable media.


The description herein is provided to enable a person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not limited to the examples and designs described herein, but is to be accorded the broadest scope consistent with the principles and novel features disclosed herein.

Claims
  • 1. An apparatus, comprising: a memory device; a controller for the memory device and configured to cause the apparatus to: identify, at the memory device, a refresh operation for a row of memory cells within a memory array; access the row of memory cells within the memory array in response to identifying the refresh operation; determine whether the row includes a data error based at least in part on accessing the row in response to identifying the refresh operation; and correct the data error using an error correction procedure based at least in part on determining that the row includes the data error.
  • 2. The apparatus of claim 1, wherein the controller is further configured to cause the apparatus to: increment a value of a refresh counter in response to identifying the refresh operation, wherein accessing the row of memory cells is based at least in part on the value of the refresh counter.
  • 3. The apparatus of claim 2, wherein: the refresh counter is configured to be reset to an initial value based at least in part on the value of the refresh counter satisfying a threshold; and the controller is further configured to cause the apparatus to: increment a value of an address counter based at least in part on the value of the refresh counter being reset to the initial value; and access at least a portion of data in the row based at least in part on the value of the address counter, wherein determining whether the row includes the data error comprises determining whether at least the portion of data in the row includes the data error based at least in part on accessing at least the portion of data.
  • 4. The apparatus of claim 1, wherein the controller is further configured to cause the apparatus to: increment a value of an error correction counter in response to identifying the refresh operation, wherein the error correction counter is configured to be reset to an initial value based at least in part on the value of the error correction counter satisfying a threshold; increment a value of an address counter based at least in part on the value of the error correction counter being reset to the initial value; and access at least a portion of data in the row based at least in part on the value of the address counter, wherein determining whether the row includes the data error comprises determining whether at least the portion of data in the row includes the data error based at least in part on accessing at least the portion of data.
  • 5. The apparatus of claim 1, wherein the controller is further configured to cause the apparatus to: receive a first refresh command, wherein identifying the refresh operation is based at least in part on receiving the first refresh command; and identify the first refresh command as a first type of refresh command, wherein determining whether the row includes the data error is in response to identifying the first refresh command as the first type of refresh command.
  • 6. The apparatus of claim 5, wherein the controller is further configured to cause the apparatus to: receive a second refresh command; identify the second refresh command as a second type of refresh command different than the first type of refresh command; and refrain from performing a second error detection procedure in response to the second refresh command based at least in part on identifying the second refresh command as the second type of refresh command.
  • 7. The apparatus of claim 1, wherein the controller is further configured to cause the apparatus to: identify a periodicity associated with checking for data errors in connection with refresh operations, the periodicity corresponding to a quantity of intervening refresh operations without error detection between refresh operations with error detection, wherein the controller is configured to cause the apparatus to determine whether the row includes the data error based at least in part on the periodicity.
  • 8. The apparatus of claim 7, wherein the controller is further configured to cause the apparatus to: identify the periodicity based at least in part on a value stored at the memory device.
  • 9. The apparatus of claim 7, wherein the controller is configured to cause the apparatus to: determine whether the row includes the data error in response to identifying the refresh operation based at least in part on the refresh operation being an initial refresh operation of a set of self-refresh operations, a final refresh operation of the set of self-refresh operations, an initial refresh operation of a set of commanded refresh operations, or a final refresh operation of the set of commanded refresh operations.
  • 10. The apparatus of claim 1, wherein the controller is further configured to cause the apparatus to: determine an address associated with a portion of data in the row based at least in part on a value of an address counter, wherein accessing the row of memory cells comprises accessing the portion of data; and generate one or more parity bits for the portion of data based at least in part on accessing the portion of data; and compare the one or more generated parity bits for the portion of data with one or more parity bits previously stored for the portion of data, wherein determining whether the row includes the data error is based at least in part on the comparing.
  • 11. The apparatus of claim 1, wherein the controller is further configured to cause the apparatus to: refresh the row of memory cells as part of the refresh operation, wherein, to determine whether the row includes the data error, the controller is configured to cause the apparatus to: check a first portion of the row for data errors in connection with the refresh operation; and refrain from checking a second portion of the row for data errors in connection with the refresh operation.
  • 12. The apparatus of claim 11, wherein the controller is further configured to cause the apparatus to: identify a size of the first portion of the row based at least in part on a value stored at the memory device.
  • 13. The apparatus of claim 1, wherein, to determine whether the row includes the data error, the controller is configured to cause the apparatus to: perform a single error correction (SEC) procedure for at least a portion of data stored in the row.
  • 14. A non-transitory computer-readable medium storing code comprising instructions which, when executed by a processor of an electronic device, cause the electronic device to: identify, at the electronic device, a refresh operation for a row of memory cells within a memory array; access the row of memory cells within the memory array in response to identifying the refresh operation; determine whether the row includes a data error based at least in part on accessing the row in response to identifying the refresh operation; and correct the data error using an error correction procedure based at least in part on determining that the row includes the data error.
  • 15. The non-transitory computer-readable medium of claim 14, wherein the instructions, when executed by the processor of the electronic device, further cause the electronic device to: increment a value of a refresh counter in response to identifying the refresh operation, wherein accessing the row of memory cells is based at least in part on the value of the refresh counter, and wherein the refresh counter is configured to be reset to an initial value based at least in part on the value of the refresh counter satisfying a threshold; increment a value of an address counter based at least in part on the value of the refresh counter being reset to the initial value; and access at least a portion of data in the row based at least in part on the value of the address counter, wherein determining whether the row includes the data error comprises determining whether at least the portion of data in the row includes the data error based at least in part on accessing at least the portion of data.
  • 16. The non-transitory computer-readable medium of claim 14, wherein the instructions, when executed by the processor of the electronic device, further cause the electronic device to: increment a value of an error correction counter in response to identifying the refresh operation, wherein the error correction counter is configured to be reset to an initial value based at least in part on the value of the error correction counter satisfying a threshold; increment a value of an address counter based at least in part on the value of the error correction counter being reset to the initial value; and access at least a portion of data in the row based at least in part on the value of the address counter, wherein determining whether the row includes the data error comprises determining whether at least the portion of data in the row includes the data error based at least in part on accessing at least the portion of data.
  • 17. The non-transitory computer-readable medium of claim 14, wherein the instructions, when executed by the processor of the electronic device, further cause the electronic device to: receive a first refresh command, wherein identifying the refresh operation is based at least in part on receiving the first refresh command; and identify the first refresh command as a first type of refresh command, wherein determining whether the row includes the data error is in response to identifying the first refresh command as the first type of refresh command.
  • 18. The non-transitory computer-readable medium of claim 17, wherein the instructions, when executed by the processor of the electronic device, further cause the electronic device to: receive a second refresh command; identify the second refresh command as a second type of refresh command different than the first type of refresh command; and refrain from performing a second error correction procedure in response to the second refresh command based at least in part on identifying the second refresh command as the second type of refresh command.
  • 19. The non-transitory computer-readable medium of claim 14, wherein the instructions, when executed by the processor of the electronic device, further cause the electronic device to: identify a periodicity associated with checking for data errors in connection with refresh operations, the periodicity corresponding to a quantity of intervening refresh operations without error detection between refresh operations with error detection, wherein the instructions are configured to cause the electronic device to determine whether the row includes the data error based at least in part on the periodicity.
  • 20. The non-transitory computer-readable medium of claim 14, wherein the instructions, when executed by the processor of the electronic device, further cause the electronic device to: determine an address associated with a portion of data in the row based at least in part on a value of an address counter, wherein accessing the row of memory cells comprises accessing the portion of data; generate one or more parity bits for the portion of data based at least in part on accessing the portion of data; and compare the one or more generated parity bits for the portion of data with one or more parity bits previously stored for the portion of data, wherein determining whether the row includes the data error is based at least in part on the comparing.
  • 21. The non-transitory computer-readable medium of claim 14, wherein the instructions, when executed by the processor of the electronic device, further cause the electronic device to: refresh the row of memory cells as part of the refresh operation, wherein, to determine whether the row includes the data error, the instructions, when executed by the processor of the electronic device, cause the electronic device to: check a first portion of the row for data errors in connection with the refresh operation; and refrain from checking a second portion of the row for data errors in connection with the refresh operation.
  • 22. A method, comprising: identifying, at a memory system, a refresh operation for a row of memory cells within a memory array; accessing the row of memory cells within the memory array in response to identifying the refresh operation; determining whether the row includes a data error based at least in part on accessing the row in response to identifying the refresh operation; and correcting the data error using an error correction procedure based at least in part on determining that the row includes the data error.
  • 23. The method of claim 22, further comprising: incrementing a value of a refresh counter in response to identifying the refresh operation, wherein accessing the row of memory cells is based at least in part on the value of the refresh counter, and wherein the refresh counter is configured to be reset to an initial value based at least in part on the value of the refresh counter satisfying a threshold; incrementing a value of an address counter based at least in part on the value of the refresh counter being reset to the initial value; and accessing at least a portion of data in the row based at least in part on the value of the address counter, wherein determining whether the row includes the data error comprises determining whether at least the portion of data in the row includes the data error based at least in part on accessing at least the portion of data.
  • 24. The method of claim 22, further comprising: incrementing a value of an error correction counter in response to identifying the refresh operation, wherein the error correction counter is configured to be reset to an initial value based at least in part on the value of the error correction counter satisfying a threshold; incrementing a value of an address counter based at least in part on the value of the error correction counter being reset to the initial value; and accessing at least a portion of data in the row based at least in part on the value of the address counter, wherein determining whether the row includes the data error comprises determining whether at least the portion of data in the row includes the data error based at least in part on accessing the portion of data.
  • 25. The method of claim 22, further comprising: receiving a first refresh command, wherein identifying the refresh operation is based at least in part on receiving the first refresh command; identifying the first refresh command as a first type of refresh command, wherein determining whether the row includes the data error is in response to identifying the first refresh command as the first type of refresh command; receiving a second refresh command; identifying the second refresh command as a second type of refresh command different than the first type of refresh command; and refraining from performing a second error detection procedure in response to the second refresh command based at least in part on identifying the second refresh command as the second type of refresh command.
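For illustration only, the counter-driven patrol recited in claims 20, 22, and 23 may be sketched in software as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names PatrolMemory, refresh_with_ecc, NUM_ROWS, and PORTIONS_PER_ROW are hypothetical, the array sizes are arbitrary, and a Hamming(7,4)-style code over 4-bit portions stands in for whatever error correction code a memory device may actually use. The sketch also advances the refresh counter after the row access, which is one of several possible orderings.

```python
# Hypothetical sizes, chosen only for illustration (assumptions).
NUM_ROWS = 4          # refresh counter resets on reaching this threshold
PORTIONS_PER_ROW = 2  # portions of a row selected by the address counter


def parity(data):
    """Generate three parity bits for a 4-bit portion (Hamming(7,4) style)."""
    d0, d1, d2, d3 = data
    return [d0 ^ d1 ^ d3, d0 ^ d2 ^ d3, d1 ^ d2 ^ d3]


# Syndrome (generated parity XOR stored parity) -> index of flipped data bit.
SYNDROME_TO_BIT = {(1, 1, 0): 0, (1, 0, 1): 1, (0, 1, 1): 2, (1, 1, 1): 3}


class PatrolMemory:
    """Toy memory array with a refresh counter and an address counter."""

    def __init__(self):
        self.data = [[[0, 0, 0, 0] for _ in range(PORTIONS_PER_ROW)]
                     for _ in range(NUM_ROWS)]
        self.stored_parity = [[parity(p) for p in row] for row in self.data]
        self.refresh_counter = 0  # selects the row to refresh
        self.address_counter = 0  # selects the portion to check

    def write(self, row, portion, bits):
        """Store a 4-bit portion together with its parity bits."""
        self.data[row][portion] = list(bits)
        self.stored_parity[row][portion] = parity(bits)

    def refresh_with_ecc(self):
        """Refresh one row and check one portion of it.

        Returns True if a single-bit error was found and corrected.
        """
        row, portion = self.refresh_counter, self.address_counter
        bits = self.data[row][portion]
        stored = self.stored_parity[row][portion]
        # Compare freshly generated parity with the previously stored parity.
        syndrome = tuple(g ^ s for g, s in zip(parity(bits), stored))
        corrected = False
        if syndrome in SYNDROME_TO_BIT:
            # A single data bit flipped: flip it back in place.
            bits[SYNDROME_TO_BIT[syndrome]] ^= 1
            corrected = True
        elif any(syndrome):
            # A single stored parity bit flipped: regenerate the parity.
            self.stored_parity[row][portion] = parity(bits)
            corrected = True
        # Advance the refresh counter; on reset, advance the address counter.
        self.refresh_counter += 1
        if self.refresh_counter >= NUM_ROWS:
            self.refresh_counter = 0
            self.address_counter = (self.address_counter + 1) % PORTIONS_PER_ROW
        return corrected


mem = PatrolMemory()
mem.write(1, 0, [1, 0, 1, 1])
mem.data[1][0][2] ^= 1  # inject a single-bit error into row 1, portion 0
results = [mem.refresh_with_ecc() for _ in range(NUM_ROWS)]
```

In this sketch, one pass of the refresh counter over all rows checks only the portion selected by the address counter, so the remaining portions of each row are refreshed without being checked until a later pass.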
CROSS REFERENCE

The present Application for Patent claims priority to U.S. Provisional Patent Application No. 63/228,816 by Wang, entitled “TECHNIQUES FOR MEMORY ERROR CORRECTION” and filed Aug. 3, 2021, which is assigned to the assignee hereof and is expressly incorporated by reference in its entirety herein.

Provisional Applications (1)
Number Date Country
63228816 Aug 2021 US