The present invention generally relates to a memory system, and more particularly to a memory system with redundant memory.
The amount of data capable of being stored in a fixed amount of space has increased significantly in recent years. Improved circuit designs and better manufacturing techniques have reduced the size of the area on a semiconductor device where a single bit of data (i.e., a “0” or “1”) can be stored. This area, or cell, where a bit of data is stored, is sometimes known as a bitcell. Smaller bitcells allow for more data to be stored in the same amount of space. However, as bitcells have become smaller, atomic level imperfections in the semiconductor material have had an increasing effect on the functionality of the bitcells.
These imperfections may be introduced during the manufacturing process, in particular the doping process. Doping is the process of intentionally introducing impurities into a semiconductor to change its electrical properties. However, variations in the doping process, or other imperfections in the semiconductor material can cause random individual bitcells to fail, resulting in a random distribution of single-bit errors throughout the memory device.
In order to compensate for the random distribution of single-bit errors, a memory system that stores repair information and has a redundant storage or redundant area is used.
A cache is provided, including a data array comprising a plurality of bitcells configured to store data, and a tag array configured to store an index of the data stored at a corresponding location in the data array and further configured to store repair information, wherein the repair information indicates an error at the corresponding location in the data array.
A memory system is provided, including a first memory comprising a plurality of bitcells configured to store data, and a second memory, configured to store repair information, wherein the repair information indicates a bitcell error at the corresponding location in the first memory.
A method is provided, including retrieving, from a tag array in a cache system, repair information corresponding to a location of bitcells in a data array of the cache system, and correcting bitcells in the data array when the retrieved repair information indicates that an error is associated with the bitcells.
The present invention will hereinafter be described in conjunction with the following figures.
The following detailed description of embodiments is merely exemplary in nature and is not intended to limit the invention or the application and uses of the invention. Furthermore, there is no intention to be bound by any theory presented in the preceding background or the following detailed description.
In one embodiment, the first memory 110 may be, for example, a data array in a cache. The cache may be a central processing unit (“CPU”) cache, a graphics processing unit (“GPU”) cache, a disk cache (e.g., a hard drive cache), a web cache or any other type of cache as is known in the art. The data that is stored within a cache might be values that have been computed earlier or duplicates of original data that are stored elsewhere. If requested data is contained in the cache (also called a cache hit), the request can be served by simply reading the cache, which is comparatively faster than requesting the data from a traditional memory or recalculating the data.
Each block (e.g., block 112, etc.) illustrated in
The first memory 110 may be subject to the single-bit errors as discussed above. As seen in
The second memory 120 stores repair information corresponding to the single-bit errors in the first memory 110. The second memory 120 is preferably a type of memory that is less susceptible to single-bit errors. The second memory 120 can be part of the first memory 110 or it can be a separate memory. If the second memory 120 is part of the same memory as the first memory 110, the second memory 120 can be designed to use larger bitcells and/or have a larger access voltage to be more resistant to random doping errors.
In one embodiment, for example, the second memory 120 may be a tag array in a cache. A tag array is typically used to store an identification of the data stored in the data array of the cache. For example, if the cache is storing data which is also stored in another memory, the tag may store the location of the data in the other memory and a location of the data stored within the cache. If, for example, the cache is a CPU cache, the processor first accesses the tag array to locate the data within the data array of the cache before requesting the data to be transferred from the cache to the processor via the bus. One advantage of using a tag array to store repair information, for example, is that the cache controller (e.g., CPU, GPU, etc.) already accesses the tag array to locate the data being requested in the data array of the cache. Accordingly, in this embodiment, only a marginal amount of additional time is required to retrieve the repair information.
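As a purely conceptual illustration (the field names and widths below are hypothetical and not taken from the described embodiment), a single tag-array entry that carries repair information alongside the usual address tag might be modeled in C as:

    /* Hypothetical layout of one tag-array entry that also carries repair
     * information for the corresponding data-array line.  Field names and
     * widths are illustrative only. */
    struct tag_entry {
        unsigned int address_tag : 20;  /* identifies the cached line        */
        unsigned int valid       : 1;   /* entry holds valid data            */
        unsigned int dirty       : 1;   /* line differs from backing memory  */
        unsigned int repair      : 2;   /* encoded location of a faulty word */
    };

Because the repair field travels with the tag, it is available at the same time the tag comparison is performed, which is consistent with the marginal retrieval cost noted above.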
In one embodiment, the bitcells used in the tag array may be larger than the bitcells used in the data array of the cache. When a larger bitcell is used, the bitcell is less susceptible to random doping errors. Further, the voltage used to make changes to the data stored in the tag array may also be higher than the voltages used to make changes to the data stored in the data array. If a larger voltage is used to change the data stored in the tag array (i.e., second memory 120), the larger voltage is more likely to overcome any random doping effects which the bitcell may be subject to.
In another embodiment, the second memory 120 may be a static random access memory (“SRAM”). SRAM is a type of semiconductor memory where the word static indicates that, unlike dynamic RAM (DRAM), it does not need to be periodically refreshed, as SRAM uses bistable latching circuitry to store each bit. SRAM exhibits data remanence, but is still volatile in the conventional sense that data is eventually lost when the memory is not powered.
In yet another embodiment, the second memory 120 may be part of the first memory 110. For example, if the first memory 110 is a data array in a cache, a portion of the data array (i.e., the second memory 120) could be used for storing repair information.
In other embodiments, the second memory 120 may be a series of flip-flops, a field programmable gate array (“FPGA”), a random access memory (“RAM”) such as a static RAM (“SRAM”), fuses, EEPROMs, eDRAMs or any other type of logic circuit capable of storing data.
As discussed above, the second memory 120 stores repair information. The size and type of the stored repair information can vary depending upon the embodiment. For example, the location may indicate multiple lines (2, 3, 4 . . . n, n+1 . . . ), a single line, a portion of a line or a single bit in a line of the first memory 110 where an error is located. In other embodiments, instructions for shifting or correcting bitcell errors may be stored.
In one embodiment, the second memory may store an encoded scheme for defining the location of the errors. For example, if the first memory uses a 512-bit wide line, with 128-bit words (i.e., the line has 4 words), the second memory could use a 2-bit encoding scheme to signify which word in the line contains an error bit. In one exemplary encoding scheme, “01” may indicate an error in the first word, “10” may indicate an error in the second word, “11” may indicate an error in the third word and “00” may indicate an error in the fourth word. One of ordinary skill in the art would recognize that different encoding schemes may be used. Further, the encoding scheme will depend upon how the locations of the errors are delineated in the second memory 120 (e.g., by multiple lines, single line, word, bit, etc.) and the size of the first memory 110.
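Using the exemplary encoding above, and assuming hypothetical function names for illustration only, the mapping between a faulty word number and its 2-bit repair code could be sketched as:

    /* 2-bit repair code for a 512-bit line holding four 128-bit words.
     * Per the exemplary scheme: 01 = first word, 10 = second word,
     * 11 = third word, 00 = fourth word.  Words are numbered 1..4. */
    static unsigned encode_faulty_word(unsigned word)    /* word 1..4 -> code */
    {
        return (word == 4u) ? 0u : word;   /* 1->01, 2->10, 3->11, 4->00 */
    }

    static unsigned decode_faulty_word(unsigned code)    /* code -> word 1..4 */
    {
        return (code == 0u) ? 4u : code;
    }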
When a request to access, store or remove data in the first memory 110 is received, the controller 130 may retrieve or receive the stored information from the second memory 120.
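A minimal sketch of that access flow, assuming hypothetical helper routines for reading the second memory and applying a correction, might look like:

    /* Hypothetical access flow: before the requested line in the first
     * memory is used, the controller looks up the repair information for
     * that line in the second memory and, if an error is recorded, applies
     * the corresponding correction through the interface. */
    void access_line(struct controller *ctrl, unsigned line)
    {
        unsigned repair = read_repair_info(ctrl->second_memory, line);

        if (repair_indicates_error(repair))
            apply_repair(ctrl->interface, line, repair);

        transfer_line(ctrl->first_memory, line);   /* read, write or remove */
    }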
In one embodiment, the information stored in the second memory 120 may be created at power-up using a built-in test. The controller 130 may attempt to store a series of predetermined or random bits in the first memory 110. The controller 130 can then read the states of the respective bits and compare each read state to an expected state. Based upon the results of the built-in test, the controller may store the repair information in the second memory 120.
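One possible shape for such a power-up test, assuming word-wide reads and writes and hypothetical helper names, is sketched below:

    /* Hypothetical built-in test: write known patterns to every word of the
     * first memory, read them back, and record any mismatch as repair
     * information in the second memory. */
    void built_in_test(struct controller *ctrl)
    {
        static const unsigned long patterns[] = {
            0x0UL, ~0x0UL, 0x5555555555555555UL
        };

        for (unsigned line = 0; line < ctrl->num_lines; line++) {
            for (unsigned word = 0; word < ctrl->words_per_line; word++) {
                for (unsigned p = 0; p < 3; p++) {
                    write_word(ctrl->first_memory, line, word, patterns[p]);
                    if (read_word(ctrl->first_memory, line, word) != patterns[p]) {
                        store_repair_info(ctrl->second_memory, line, word);
                        break;   /* record the faulty word and move on */
                    }
                }
            }
        }
    }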
In another embodiment, the information stored in the second memory 120 may be generated on the fly. If an error occurs while the controller is accessing the first memory 110, the controller 130 can store the location of the error in the second memory 120. Accordingly, during a subsequent request to access the location where an error bit is located, the system would not suffer any penalties for correcting the error.
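Sketched with the same hypothetical helpers, and assuming some means of detecting the error during the access (e.g., parity), the on-the-fly case reduces to recording the location at the moment a mismatch is observed:

    /* Hypothetical on-the-fly update: when an access to the first memory is
     * detected as erroneous, record the location in the second memory so
     * that later accesses can apply the repair immediately. */
    void access_with_logging(struct controller *ctrl, unsigned line, unsigned word)
    {
        unsigned long value = read_word(ctrl->first_memory, line, word);

        if (error_detected(ctrl->first_memory))        /* hypothetical check */
            store_repair_info(ctrl->second_memory, line, word);

        deliver_to_requester(ctrl, value);
    }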
In yet another embodiment, if the second memory 120 is a non-volatile memory, the repair information stored in the second memory 120 may be pre-programmed or created once and referenced thereafter. For example, the first memory 110 may be subjected to a built-in test as described above. However, rather than repeating the test and storing the results each time the memory system 100 is powered up, the results could be stored in a non-volatile memory which would retain the repair information in memory even after the device loses power.
Any combination of the methods for storing information in the second memory 120 may also be used.
The controller 130 and the interface 140 can be used to correct or shift out defective bitcells. In one exemplary embodiment, the first memory 110 may contain a word-length redundant column; however, any number of redundant columns may be used. For example, as seen in
In one embodiment, the interface 140 may include a series of multiplexors. The controller, based upon the information stored in the second memory 120, may correct or shift the data being read from or written into the first memory 110 using the multiplexors, as discussed in further detail below.
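A minimal software analogue of that shifting, assuming one redundant word slot per line and a repair indication identifying the faulty slot (all names hypothetical), could assemble a read as follows:

    /* Hypothetical read path: a physical line provides one more word slot
     * than the logical line requires.  The faulty slot identified by the
     * repair information is simply skipped when the logical line is
     * assembled, which is what the multiplexors accomplish in hardware. */
    #define LOGICAL_WORDS 3

    void read_repaired_line(const unsigned long physical[LOGICAL_WORDS + 1],
                            unsigned faulty_slot,   /* 0..LOGICAL_WORDS */
                            unsigned long logical[LOGICAL_WORDS])
    {
        unsigned out = 0;

        for (unsigned slot = 0; slot <= LOGICAL_WORDS; slot++) {
            if (slot == faulty_slot)
                continue;                 /* skip the defective word slot */
            logical[out++] = physical[slot];
        }
    }

A write would follow the same pattern in reverse, placing each logical word into the next non-faulty physical slot.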
In other embodiments, single columns, rows, words or any other delineation of the first memory 110 may be designated for redundant bits. One of ordinary skill in the art would recognize that the interface 140 could be modified based upon where the redundant bitcells are located.
In the embodiment illustrated in
While the tag array 220 illustrated in
The cache further includes a decoder 240 and a series of multiplexors (“MUXs”) 242-248 configured as column MUXs. In one embodiment, for example, MUXs 242-248 can be used to give an array a better aspect ratio. Column MUXs 242-248 also may allow a set of sense amplifiers and write circuitry to be shared between multiple bitcells.
In this exemplary embodiment, MUX 250 receives as input the first word and the second word, MUX 252 receives as input the second word and the third word and MUX 254 receives as input the third word and the fourth word. The decoder 240, based upon the encoding scheme, selects which input is output by each of MUXs 250-254.
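One way to read the selection logic that follows from this pairing of inputs, offered here as an interpretation rather than a reproduction of Table 1, is sketched below; each MUX picks its higher-numbered input whenever the faulty word lies at or below its lower-numbered input:

    /* Hypothetical decode of the 2-bit repair code into select signals for
     * MUXs 250, 252 and 254.  A select value of 1 picks the higher-numbered
     * input word; the code-to-word mapping follows the encoding described
     * earlier (01 -> word 1, 10 -> word 2, 11 -> word 3, 00 -> word 4). */
    struct mux_selects {
        unsigned sel_250 : 1;   /* 0 -> word 1, 1 -> word 2 */
        unsigned sel_252 : 1;   /* 0 -> word 2, 1 -> word 3 */
        unsigned sel_254 : 1;   /* 0 -> word 3, 1 -> word 4 */
    };

    struct mux_selects decode_repair_code(unsigned code)
    {
        unsigned faulty = (code == 0u) ? 4u : code;   /* 00 means word 4 */
        struct mux_selects s;

        s.sel_250 = (faulty <= 1u);   /* shift only if word 1 is faulty      */
        s.sel_252 = (faulty <= 2u);   /* shift if word 1 or word 2 is faulty */
        s.sel_254 = (faulty <= 3u);   /* shift unless word 4 is faulty       */
        return s;
    }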
Table 1 illustrates the exemplary encoding/decoding scheme illustrated in
While at least one exemplary embodiment has been presented in the foregoing detailed description, it should be appreciated that a vast number of variations exist. It should also be appreciated that the exemplary embodiment or exemplary embodiments are only examples, and are not intended to limit the scope, applicability, or configuration of the embodiments in any way. Rather, the foregoing detailed description will provide those skilled in the art with a convenient road map for implementing an exemplary embodiment. It is to be understood that various changes may be made in the function and arrangement of elements described in an exemplary embodiment without departing from the scope of the invention as set forth in the appended claims.