This invention relates generally to random access memories, and more particularly to defect detection and repair of a random access memory embedded in a data processor.
Redundancy may be used in an integrated circuit random access memory (RAM) to cure, or repair, manufacturing defects by replacing rows or columns having defects with spare rows or columns. In order to repair a defective row or column, the defective row or column is deselected and a redundant row or column is assigned in its place by blowing a plurality of fusible links. The fusible links are used to store the address of the defective row or column and are typically blown using a high-energy laser, or may be blown electrically at probe test. The ability to repair a memory that has only a few defective rows or columns can result in substantially increased manufacturing yields. Redundancy may not be used in some embedded memories because the embedded memories may not be directly accessible.
Error correction codes (ECC) have been used to detect and correct single-bit errors and to detect, but not correct, multi-bit errors in memory arrays. The single-bit errors and multi-bit errors may be due to soft errors in the memory array. A soft error in a particular bit may be due to, for example, exposure to temperature extremes, alpha particle emissions, or long-term usage. The ECC can correct single-bit errors without the use of redundant rows or columns, but multi-bit errors typically cannot be corrected in the field even if the memory includes unused redundant rows or columns.
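The single-error-correct, double-error-detect behavior described above can be illustrated with a minimal sketch. The description does not name a particular code, so the extended Hamming (8,4) code used here, and the function names `encode` and `decode`, are assumptions chosen only for illustration.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming codeword (list of bits).

    Index 0 holds an overall parity bit; indices 1, 2 and 4 hold the
    Hamming parity bits; indices 3, 5, 6 and 7 hold the data bits.
    """
    word = [0] * 8
    word[3] = (nibble >> 0) & 1
    word[5] = (nibble >> 1) & 1
    word[6] = (nibble >> 2) & 1
    word[7] = (nibble >> 3) & 1
    word[1] = word[3] ^ word[5] ^ word[7]
    word[2] = word[3] ^ word[6] ^ word[7]
    word[4] = word[5] ^ word[6] ^ word[7]
    word[0] = sum(word[1:]) % 2
    return word


def decode(word):
    """Return (status, nibble); nibble is None when the error is uncorrectable."""
    word = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos          # XOR of the positions holding a 1
    parity_odd = sum(word) % 2 == 1  # True when an odd number of bits flipped
    if syndrome == 0 and not parity_odd:
        status = "ok"
    elif parity_odd:
        # Single-bit error: the syndrome is the failing position
        # (0 means the overall parity bit itself flipped).
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return "multi-bit error", None
    nibble = word[3] | word[5] << 1 | word[6] << 2 | word[7] << 3
    return status, nibble


codeword = encode(0b1011)
single = list(codeword)
single[5] ^= 1                       # inject one bit flip
double = list(single)
double[6] ^= 1                       # inject a second bit flip
print(decode(codeword))              # ('ok', 11)
print(decode(single))                # ('corrected', 11)
print(decode(double))                # ('multi-bit error', None)
```

The last case is the situation the remainder of the description addresses: the code reports the error but cannot recover the data.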
Therefore, there is a need for a memory that can repair ECC-detected multi-bit errors.
The foregoing and further and more specific objects and advantages of the instant invention will become readily apparent to those skilled in the art from the following detailed description of a preferred embodiment thereof taken in conjunction with the following drawings:
Generally, the present invention provides a circuit and method for repairing defective memory cells in a volatile memory array of a data processor. Error correction codes are used to detect errors in the data stored in the memory array. The errors in the data indicate defective memory cells of a volatile memory array. In a typical ECC protocol, if a detected error is a single-bit error, the ECC can apply a correction. If an error is a multi-bit error, the ECC can detect the error but not correct it. The multi-bit errors in the volatile memory array detected by the ECC are corrected using a portion of a non-volatile memory array to store the addresses of the defective cells. During initial operation of the data processor as it exits a reset state, the addresses of the defective cells are loaded into a plurality of registers for storing the address of the defective memory cell during a normal operating mode of the integrated circuit. A plurality of flip-flops is used as needed to substitute for defective memory cells. The plurality of flip-flops is implemented on the integrated circuit physically separate from the volatile memory array using standard cell logic.
The circuit and method can be used to repair defective memory cells during manufacturing or can be used to increase reliability or fault tolerance in the field. The plurality of flip-flops is guaranteed to be defect free because they are structurally tested using scan testing during the manufacturing process. Also, the circuit and method can augment ECC by allowing multi-bit errors to be detected and repaired.
Data processor 10 includes a BIST circuit for functionally testing RAM 14. The BIST circuit writes a test pattern to RAM 14 and then reads RAM 14 to detect if the memory cells of RAM 14 output the expected data. The BIST circuit includes BIST engine 20 and multiplexer 22. Multiplexer 22 has a first input coupled to CPU 12 for receiving an address labeled “RAM ADDRESS”, a second input coupled to BIST engine 20 for receiving an address labeled “BIST ADDRESS”, an output coupled to RAM 14, and a control terminal coupled to BIST engine 20 for receiving a control signal labeled “BIST ENABLE”. BIST engine 20 also has a terminal coupled to RAM 14 for providing and receiving data signals labeled “BIST DATA”.
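The write-then-read-back test performed by the BIST circuit can be sketched as follows. This is a behavioral illustration only: the `FaultyRam` class, its stuck-bit fault model, the `run_bist` function, and the 0x55/0xAA patterns are hypothetical and are not taken from the description.

```python
class FaultyRam:
    """Toy 8-bit-wide RAM model with optional stuck bits (hypothetical).

    `stuck` maps an address to (and_mask, or_mask): bits cleared in
    and_mask are stuck at 0, bits set in or_mask are stuck at 1.
    """

    def __init__(self, size, stuck=None):
        self.size = size
        self.cells = [0] * size
        self.stuck = stuck or {}

    def write(self, addr, value):
        self.cells[addr] = value & 0xFF

    def read(self, addr):
        and_mask, or_mask = self.stuck.get(addr, (0xFF, 0x00))
        return (self.cells[addr] & and_mask) | or_mask


def run_bist(ram, patterns=(0x55, 0xAA)):
    """Write each pattern to every address, read it back, and
    return the sorted list of addresses that returned wrong data."""
    failing = set()
    for pattern in patterns:
        for addr in range(ram.size):
            ram.write(addr, pattern)
        for addr in range(ram.size):
            if ram.read(addr) != pattern:
                failing.add(addr)
    return sorted(failing)


# Address 5 has bit 0 stuck at 0; address 9 has bit 7 stuck at 1.
ram = FaultyRam(16, stuck={5: (0xFE, 0x00), 9: (0xFF, 0x80)})
print(run_bist(ram))  # [5, 9]
```

Using complementary patterns ensures that each bit is exercised at both logic values, so a bit stuck at either value is caught by at least one pass.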
Read data from RAM 14 is provided to CPU 12 via ECC logic 26. In the illustrated embodiment, ECC logic 26 runs a conventional ECC protocol that can detect and repair single-bit errors and detect but not repair multi-bit errors. ECC logic 26 has an input coupled to multiplexer 28, an output coupled to CPU 12 for reporting a multi-bit error, and another output coupled to CPU 12 for providing data labeled “READ DATA”. Multiplexer 28 receives read data from either RAM 14 or from the plurality of flip-flops 38 via multiplexer 36. ECC input data is provided by flip-flops 38 if one or more of flip-flops 38 have been used to repair addressed defective cells of RAM 14.
Write data labeled “WRITE DATA” is provided by CPU 12 to both multiplexer 40 and RAM 14. If one or more of the addressed memory cells of RAM 14 have been repaired, then the WRITE DATA is provided to one of flip-flops 38; otherwise, the WRITE DATA is written to the regular memory cells of RAM 14. The read and write operations of RAM 14 will be discussed in more detail below. Note that the plurality of flip-flops 38 comprises D-type flip-flops in the illustrated embodiment. The D-type flip-flops may be implemented as “standard cell logic” in a conventional integrated circuit manufacturing process such as a CMOS (complementary metal-oxide semiconductor) process. Standard cell logic has the advantage of being highly reliable and relatively easy to test as compared to an embedded RAM such as RAM 14. In other embodiments, the plurality of flip-flops 38 may comprise a different type of flip-flop.
CPU 12 is bi-directionally coupled to interrupt controller 24. Interrupt controller 24 may also be referred to as an interrupt handler, and can be implemented as hardware, software, or a combination of hardware and software. CPU 12 is also bi-directionally coupled to NVM 16 to transmit signals labeled “ADDRESS/CONTROL/DATA”. In the illustrated embodiment, NVM 16 includes a plurality of flash non-volatile memory cells organized in rows and columns. Also included in NVM 16 but not shown are row and column decoders, sense amplifiers, and other access circuitry. In other embodiments, NVM 16 may be another type of non-volatile memory, such as, for example, an EEPROM (electrically erasable programmable read only memory) or an MRAM (magnetic random access memory). NVM 16 also includes a shadow row 18. Shadow row 18 is one or more specially designated rows for storing test, manufacturing, or identifying information about the integrated circuit implementing data processor 10. In the illustrated embodiment, shadow row 18 is not visible to or accessible by a user of the data processor 10. The shadow row 18 is bi-directionally coupled to finite state machine 30 for transmitting and receiving information labeled “RAM REPAIR INFO”. Finite state machine 30 also has an input for receiving a reset signal labeled “RESET” from reset controller 42, and a plurality of outputs for providing signals labeled “REPAIR ADDRESS” to holding registers 32. Reset controller 42 may be responsive to any one or all of CPU 12, interrupt controller 24, or another component of data processor 10. Data processor 10 may also include components not illustrated in
Holding registers 32 is a plurality of conventional registers for storing the addresses of repaired memory cells of RAM 14. Holding registers 32 has an output coupled to a first input of address comparator 34. Address comparator 34 has a second input coupled to CPU 12 for receiving address signals labeled “RAM ADDRESS”, a first output for providing a hit signal labeled “READ HIT” to a control terminal of multiplexer 28, a second output for providing a select signal labeled “SELECT” to a control terminal of multiplexer 36, and a third output for providing a hit signal labeled “WRITE HIT” to a control terminal of multiplexer 40.
In operation, CPU 12 executes instructions that require data to be read from and written to RAM 14. ECC logic 26 analyses read data from RAM 14 and if a single-bit error is detected, ECC logic 26 corrects the error. In the case where a multi-bit error is detected, ECC logic 26 provides a signal to CPU 12 labeled “MULTI-BIT ERROR”. The error may be, for example, a soft error caused by exposure to temperature extremes or by prolonged usage. In response to the signal MULTI-BIT ERROR, an interrupt is generated by interrupt controller 24. The failing address or addresses are programmed into shadow row 18 of NVM 16. Also, one or more flip-flops of the plurality of flip-flops 38 are designated to replace the defective memory location of RAM 14 using, for example, BIST engine 20. In addition, the interrupt causes reset controller 42 to provide reset signal RESET to finite state machine 30 and to the entire data processor 10. Finite state machine 30 then retrieves the address of the defective location of RAM 14 from shadow row 18 and causes the address to be stored in holding registers 32. Each time the data processor 10 is reset or restarted, finite state machine 30 loads the defective addresses from shadow row 18 to holding registers 32. Note that the defective memory locations may also be detected during BIST testing.
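The repair sequence just described (record the failing address in non-volatile storage, reset, and reload the holding registers) can be modeled behaviorally as in the sketch below. The class name `RepairController`, its methods, and the fixed spare count are hypothetical; the sketch assumes one spare location per stored address and does not model the interrupt controller or the BIST engine.

```python
class RepairController:
    """Behavioral model of shadow row, finite state machine and holding registers."""

    def __init__(self, num_spares):
        self.num_spares = num_spares
        self.shadow_row = []         # non-volatile: survives reset and power loss
        self.holding_registers = []  # volatile: must be reloaded after every reset

    def reset(self):
        """On reset, the state machine copies every stored repair
        address from the shadow row into the holding registers."""
        self.holding_registers = list(self.shadow_row)

    def power_cycle(self):
        """Volatile state is lost; the shadow row is kept; reset follows."""
        self.holding_registers = []
        self.reset()

    def on_multi_bit_error(self, failing_addr):
        """Interrupt handler: program the failing address and reset.

        Returns False when no spare location is left for the repair.
        """
        if failing_addr in self.shadow_row:
            return True              # already repaired
        if len(self.shadow_row) >= self.num_spares:
            return False             # out of spare flip-flop locations
        self.shadow_row.append(failing_addr)
        self.reset()
        return True


ctl = RepairController(num_spares=2)
ctl.reset()
ctl.on_multi_bit_error(0x40)
ctl.power_cycle()
print([hex(a) for a in ctl.holding_registers])  # ['0x40']
```

Keeping the addresses in non-volatile storage is what makes a field repair permanent: the holding registers are rebuilt from the shadow row on every reset.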
During a read operation of RAM 14 after RAM 14 has been repaired, a RAM ADDRESS is provided to RAM 14 via multiplexer 22 and to address comparator 34. Address comparator 34 compares the RAM ADDRESS to addresses stored in holding registers 32. If the RAM ADDRESS matches an address in holding registers 32, then a SELECT signal is provided to multiplexer 36 to select the correct flip-flop of flip-flops 38 to read. Address comparator 34 also provides a READ HIT signal to multiplexer 28 to select the input coupled to the output of multiplexer 36 to provide the input to ECC logic 26, whose output is the READ DATA to CPU 12. If the RAM ADDRESS does not match one of the addresses stored in holding registers 32, then the READ HIT signal is not asserted and READ DATA is provided to CPU 12 from RAM 14 via multiplexer 28 and ECC logic 26.
During a write operation of RAM 14 after RAM 14 has been repaired, a RAM ADDRESS is provided to RAM 14 and to address comparator 34. If there is a match, indicating that the RAM ADDRESS is to a location of RAM 14 that has been repaired, then address comparator 34 provides a WRITE HIT signal to multiplexer 40 to allow the WRITE DATA to be provided to one of the plurality of flip-flops 38. If the RAM ADDRESS does not match one of the addresses stored in holding registers 32, then the WRITE HIT signal is not asserted and the WRITE DATA is provided from CPU 12 to RAM 14.
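The read and write steering described above can be summarized in a short behavioral model. The sketch assumes one spare flip-flop word per holding register and omits ECC logic 26 and the BIST path; the class name `RepairedRam` and its methods are hypothetical.

```python
class RepairedRam:
    """Behavioral model of the repaired read and write paths."""

    def __init__(self, size, repaired_addrs):
        self.ram = [0] * size
        self.holding_registers = list(repaired_addrs)
        # One spare word of flip-flops per repaired address (assumption).
        self.flip_flops = [0] * len(self.holding_registers)

    def _compare(self, addr):
        """Address comparator: return (hit, select) for the given address."""
        if addr in self.holding_registers:
            return True, self.holding_registers.index(addr)
        return False, None

    def write(self, addr, data):
        write_hit, select = self._compare(addr)
        if write_hit:
            self.flip_flops[select] = data  # steer data to the spare flip-flops
        else:
            self.ram[addr] = data           # normal RAM write

    def read(self, addr):
        read_hit, select = self._compare(addr)
        if read_hit:
            return self.flip_flops[select]  # read from the spare flip-flops
        return self.ram[addr]               # normal RAM read


mem = RepairedRam(size=16, repaired_addrs=[3])
mem.write(3, 0xAB)   # repaired location: stored in flip-flops
mem.write(4, 0xCD)   # healthy location: stored in RAM
print(hex(mem.read(3)), hex(mem.read(4)))  # 0xab 0xcd
```

Because the comparison is made on every access, the substitution is transparent to the CPU, which issues the same RAM ADDRESS whether or not the location has been repaired.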
While the invention has been described in the context of a preferred embodiment, it will be apparent to those skilled in the art that the present invention may be modified in numerous ways and may assume many embodiments other than that specifically set out and described above. Accordingly, it is intended by the appended claims to cover all modifications of the invention which fall within the true scope of the invention.
Benefits, other advantages, and solutions to problems have been described above with regard to specific embodiments. However, the benefits, advantages, solutions to problems, and any element(s) that may cause any benefit, advantage, or solution to occur or become more pronounced are not to be construed as a critical, required, or essential feature or element of any or all the claims. As used herein, the terms “comprises,” “comprising,” or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.