Referring now to the drawings wherein like elements are numbered alike in the several FIGURES:
Exemplary embodiments provide methods and apparatuses for generating a bus error correcting code (ECC) for an m-transfer class of buses, where m is greater than 1 (i.e., the dataword is transferred over two or more bus cycles, with some or all of a different ECC codeword being incorporated into the bus ECC codeword). Exemplary embodiments generate nested, two-bit symbol codes which maintain and/or revise part of an original SEC/DED code and provide timing improvements in the bus transfer of both the newly generated S2EC/D2ED checkbits as well as the original SEC/DED code checkbits.
Exemplary embodiments include a method of constructing a nested error correcting code (ECC) scheme. The method includes receiving a Hamming distance n code including original checkbits. A symbol correcting code H-matrix framework is defined including specifying bit positions for the original checkbits and for additional checkbits associated with a symbol correcting code. The bit positions are specified such that the original checkbits and the additional checkbits are in bit positions that are transferred over a bus in a transfer subsequent to a first transfer. A symbol correcting code H-matrix is created using the bit positions indicated by the framework by iteratively adding rows of H-matrix bits on a symbol column basis such that the symbol correcting code H-matrix describes the symbol correcting code, and the Hamming distance n code is preserved as a subset of the symbol correcting code H-matrix. The data associated with the symbol correcting code may be referred to as the symbol correcting code or as the symbol correcting code codeword.
As is commonly known in the art, the “Hamming distance” of an ECC indicates how many errors the code can detect and/or correct. A d=3 code can correct all single errors. A d=4 code can correct all single errors while simultaneously detecting all double errors. A d=5 code can correct all double errors. A d=6 code can correct all double errors while simultaneously detecting all triple errors. The concept is further understood to be applicable to symbol-oriented codes, where a symbol is a predefined group of bits in the code stream. Thus, a distance 4 symbol code can correct all single symbol errors while simultaneously detecting all double symbol errors, etc. In general, the terms Single Symbol Correcting (SSC) and Double Symbol Detecting (DSD) are combined for a distance 4 symbol code, which is designated SSC/DSD. Similarly, for a distance 4 binary code, the terms Single Error Correcting (SEC) and Double Error Detecting (DED) are combined, and the code is referred to as a SEC/DED code. The additional checkbits generated for the S2EC/D2ED code as well as the original SEC/DED code checkbits are transferred over the second bus transfer cycle.
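The relationship between distance and error handling described above can be illustrated with a short sketch. It assumes a textbook Hamming (7,4) H-matrix whose columns are the binary values 1 through 7, and an extension that adds an all-ones row plus one overall parity checkbit; these matrices are illustrative assumptions and are not taken from the figures. The helper names `min_distance` and `capability` are hypothetical.

```python
from itertools import combinations

def min_distance(h_columns):
    """Return the smallest number of H-matrix columns that XOR to zero.

    Each column is an int whose bits are the column entries. For a linear
    code, this count equals the minimum Hamming distance d.
    """
    for size in range(1, len(h_columns) + 1):
        for combo in combinations(h_columns, size):
            acc = 0
            for col in combo:
                acc ^= col
            if acc == 0:
                return size
    return None  # no dependent column set exists

def capability(d):
    """Map distance d to (errors corrected, errors detected simultaneously)."""
    corrected = (d - 1) // 2
    return corrected, d - 1 - corrected

# Assumed textbook Hamming (7,4) code: columns are 1..7 in binary.
hamming_7_4 = list(range(1, 8))

# Assumed extension: set a fourth row bit in every column, then add one
# column for the new overall parity checkbit.
extended_8_4 = [c | 0b1000 for c in hamming_7_4] + [0b1000]

print(min_distance(hamming_7_4), capability(3))   # SEC
print(min_distance(extended_8_4), capability(4))  # SEC/DED
```

The exhaustive search is practical only for small matrices; it is meant to make the d=3 and d=4 definitions concrete rather than to serve as a design tool.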
In exemplary embodiments, the new bus ECC checkbits are generated on the fly as the SEC/DED ECC word is presented for transfer at the bus interface. The entire SEC/DED ECC word, including its existing checkbits, can be sent across the bus as a subset of the S2EC/D2ED bus code. Thus, in theory, the only logic that needs to be performed is the generation of the new checkbits (also referred to herein as “additional checkbits”). However, there are cases in which logic must be performed on both the newly generated checkbits and the original SEC/DED checkbits. For example, the system may need to check the data coming out of the memory before it is transferred across the bus, to make sure that no memory errors exist that would prevent the bus ECC from accurately correcting bus errors. Thus, it would be desirable to gain time to allow the manipulation of the original checkbits. Exemplary embodiments gain time by rearranging the checkbits so that all of the new S2EC/D2ED checkbits (i.e., the additional checkbits associated with a symbol correcting code) and all of the original SEC/DED checkbits are sent on the second bus transfer.
The parity code of 4 databits, P1=D1*D2*D3*D4, where the * symbol indicates a Boolean exclusive OR function, corresponds to a degenerate H-matrix of 1 1 1 1. A Hamming code of 4 databits can be represented as:
This SEC Hamming code, which is defined as a distance three code (or d=3), can be extended to an SEC/DED extended Hamming code (d=4) by adding one checkbit to the H-matrix as follows:
Now, it is readily seen that C4 is exactly the same as P1. This means that if data is stored with parity, and it is desired to provide SEC/DED protection for its transfer over a bus, for example, the parity code can be reused, or nested, within the extended Hamming code by using the construction shown. However, once one moves beyond parity and Hamming codes, there are no known mathematical constructions that will guarantee code nesting.
A common computer cache memory is one that contains 64-bit datawords that are protected in memory by eight checkbits, thus forming 72-bit ECC words. After the ECC word is taken out of memory, it can be transferred to another unit across a high-speed, 2-transfer data bus. The decision to use a two-transfer bus is made based on overall system architectures and timings, and seems to be a popular choice, although by no means an exclusive choice. If the 72-bit ECC word were just split in two, sent across a 36-bit high-speed bus, and reconstructed on the other side, the system would still be able to correct all single bit errors and detect all double bit errors. However, if a single driver on the bus were to fail, or a single bitlane (e.g., wire) on the bus were to be corrupted, then the system would experience an uncorrectable error. To avoid this, the 72-bit ECC word may be nested within a 76-bit ECC word, which only adds two wires to the bus and allows two-bit-symbol correction to be performed.
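The bus arithmetic in this example can be sketched as follows. The bit-to-wire layout is an assumption made only for illustration (bit i is sent on wire i in the first transfer, and bit i plus the bus width on the same wire in the second transfer); the function names are hypothetical.

```python
def lanes_needed(codeword_bits, transfers=2):
    """Number of bus wires needed to send the codeword in the given transfers."""
    return -(-codeword_bits // transfers)  # ceiling division

def bits_on_lane(lane, lanes, transfers=2):
    """Codeword bit positions carried by one wire (assumed layout)."""
    return [lane + t * lanes for t in range(transfers)]

def apply_lane_fault(word, lane, lanes, transfers=2):
    """Worst case for a failed wire: every bit it carries is inverted."""
    for bit in bits_on_lane(lane, lanes, transfers):
        word ^= 1 << bit
    return word

# 72-bit SEC/DED word on a 2-transfer bus versus the 76-bit nested word.
print(lanes_needed(72), lanes_needed(76))

# A single failed wire on the 36-wire bus can corrupt two bits of the word,
# which exceeds what a SEC/DED code can correct.
faulty = apply_lane_fault(0, lane=5, lanes=36)
print(bits_on_lane(5, 36), bin(faulty).count("1"))
```

Under this assumed layout, the two bits that share a wire form the two-bit symbol that the nested code is intended to correct.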
Since one of the objects of exemplary embodiments described herein is to save logic circuits and logic delay, exemplary embodiments start with a minimal extended Hamming code. This can be constructed by choosing only odd-weighted columns to be in the SEC/DED extended Hamming code H-matrix. Such a code is called a Hsiao code. To make it minimal, the one-weight columns are chosen first for the checkbits, then the three-weight columns until they are exhausted, then the five-weight columns, and then the seven-weight columns. By using only the odd weights and by starting with the smaller weights, exemplary embodiments obtain a minimal weight SEC/DED code. Furthermore, it is important to balance the row weights of the H-matrix, as this also affects the logic design and timing, so a little trial and error can be used on the last few higher-weight columns so that the row weights of the H-matrix are balanced. Thus, a balanced, minimal SEC/DED H-matrix is shown in
It should be noted that for SEC/DED codes, rows can be transposed and/or XOR'd with other rows, and columns can be swapped, without any loss of code efficiency.
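The minimal, balanced odd-weight column selection described above can be sketched for the eight-checkbit, 64-databit case. This is an illustrative construction only and is not necessarily the matrix of the figure: in place of trial and error, it assumes the eight weight-five columns are the eight rotations of a single pattern, which is one simple way to balance the rows. The names `build_hsiao_72_64` and `row_weights` are hypothetical.

```python
from itertools import combinations

ROWS = 8        # checkbits, one per H-matrix row
DATABITS = 64

def build_hsiao_72_64():
    """Return (data_columns, check_columns) as ints, one bit per row."""
    # Weight-one columns are reserved for the checkbits.
    check_cols = [1 << r for r in range(ROWS)]
    # All C(8,3) = 56 weight-three columns are used for databits.
    w3 = [sum(1 << r for r in rows) for rows in combinations(range(ROWS), 3)]
    # The remaining 8 databits take weight-five columns. Assumption: use the
    # eight rotations of 0b00011111 so every row gains exactly five ones.
    base = 0b00011111
    w5 = [((base << s) | (base >> (ROWS - s))) & 0xFF for s in range(ROWS)]
    return w3 + w5, check_cols

def row_weights(columns):
    """Count the ones in each row across the given columns."""
    return [sum((c >> r) & 1 for c in columns) for r in range(ROWS)]

data_cols, check_cols = build_hsiao_72_64()
all_cols = data_cols + check_cols
print(len(data_cols), len(all_cols))
print(row_weights(all_cols))
```

Because every column is distinct and has odd weight, no one, two, or three columns can XOR to zero, which is the d=4 (SEC/DED) condition.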
In exemplary embodiments, a two-bit-symbol correcting code (example referred to herein is a S2EC/D2ED H-matrix) is constructed that uses the matrix depicted in
Thus, in exemplary embodiments, the original skeleton S2EC/D2ED H-matrix looks like the H-matrix in
It is determined at block 510 if there are more symbol columns left to be processed. If all of the symbol columns have been processed, then the S2EC/D2ED H-matrix has been completed and the process is exited at block 512. As described herein, the processing is exited at block 512 when a true S2EC/D2ED code, which corrects all single two-bit errors and detects all double two-bit errors, has been created in the S2EC/D2ED H-matrix. If there are more symbol columns left to be processed, as determined at block 510, then block 514 is performed. At block 514, the current symbol column is set to the previous symbol column (thereby working backwards from the last symbol column to the first symbol column) and iterative processing of each symbol column continues at block 506.
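The property that the completed H-matrix must satisfy can be checked by brute force, as in the following sketch. It is not the procedure of the flow diagrams; it only tests the end condition, assuming that consecutive columns form a symbol and that a symbol distance of four is equivalent to no error pattern confined to one, two, or three symbols producing a zero syndrome. The function name and both test matrices are hypothetical.

```python
from itertools import combinations, product

def is_symbol_distance_4(h_columns, symbol_bits=2):
    """True if no error confined to 1, 2, or 3 symbols has a zero syndrome."""
    n_symbols = len(h_columns) // symbol_bits

    def symbol_syndromes(sym):
        # Syndromes of every nonzero error pattern inside one symbol.
        cols = h_columns[sym * symbol_bits:(sym + 1) * symbol_bits]
        out = []
        for pattern in range(1, 1 << symbol_bits):
            s = 0
            for i, col in enumerate(cols):
                if (pattern >> i) & 1:
                    s ^= col
            out.append(s)
        return out

    per_symbol = [symbol_syndromes(s) for s in range(n_symbols)]
    for count in (1, 2, 3):
        for syms in combinations(range(n_symbols), count):
            for choice in product(*(per_symbol[s] for s in syms)):
                acc = 0
                for s in choice:
                    acc ^= s
                if acc == 0:
                    return False
    return True

# Assumed toy code: one two-bit data symbol repeated in three check symbols.
repetition = [0b010101, 0b101010, 1, 2, 4, 8, 16, 32]

# Assumed extended Hamming (8,4) columns with adjacent bits paired as symbols.
ext_hamming = [c | 0b1000 for c in range(1, 8)] + [0b1000]

print(is_symbol_distance_4(repetition))
print(is_symbol_distance_4(ext_hamming))
```

The extended Hamming example fails because it is only a bit-level SEC/DED code: two different symbols can produce the same syndrome for a two-bit error, so the symbol in error cannot be identified.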
At block 610, the symbol column extension for the current symbol column is incremented to the next binary value. At block 612, it is determined if the incremented value is out of range (implying that all possible combinations have been tried). If all possible combinations have been tried, then processing continues at block 614, where the process described herein in reference to
If “d” does not equal 4, as determined at block 708, then block 712 is performed to determine if all possible extension values have been tried for the first symbol column. If all possible extension values have not been tried, then processing continues at block 704 to try the next highest binary symbol. If all possible extension values have been tried, as determined at block 712, the processing continues at block 714. At block 714, two more checkbits (i.e., two more rows and two more columns) are added to the S2EC/D2ED H-matrix. After the two additional checkbits are added, the whole process is repeated beginning at step 502 in
The examples described herein relate to a codeword, described by an H-matrix, that is transferred in two bus cycles. It is within the scope of the invention for the codeword to be transferred in three or more cycles. Further, the bit positions in the H-matrices depicted herein are exemplary in nature and other bit positions may be implemented by exemplary embodiments.
The examples described herein relate to two-bit error codes. It is within the scope of exemplary embodiments to expand this to three or more bit error codes. The same iterative processing described herein may be utilized to create three bit error codes.
The examples described herein relate to a SEC/DED H-matrix that has eight rows and 72 columns and a S2EC/D2ED H-matrix that has twelve rows and 76 columns. These H-matrices are examples only as the size of the H-matrices will vary based on the number of wires on the bus and the type of error detecting and correcting being performed.
The examples described herein relate to a S2EC/D2ED two bit symbol correcting code. As will be evident to those skilled in the art, other symbol correcting codes, such as a four bit S4EC/D4ED and an eight bit S8EC/D8ED, may be created using the processing described herein.
The examples described herein relate to a Hamming distance of four. As will be evident to those skilled in the art, other Hamming distances can be supported using the processing described herein. For example, if only a SEC code is required, then “d” can be set to three.
The examples described herein relate to a SEC/DED code. As will be evident to those skilled in the art, other Hamming distance n codes may be utilized by exemplary embodiments. For example, the Hamming distance n code may be a double error correction and triple error detection code.
The examples described herein relate to system requirements that all checkbits (symbol correcting code and Hamming distance n code) be transferred over a bus in a second bus transfer. As will be evident to those skilled in the art, other system requirements regarding checkbit and databit placement may be implemented by exemplary embodiments.
The capabilities of the present invention can be implemented in software, firmware, hardware or some combination thereof.
The flow diagrams depicted herein are just examples. There may be many variations to these diagrams or the steps (or operations) described therein without departing from the spirit of the invention. For instance, the steps may be performed in a differing order, or steps may be added, deleted or modified. All of these variations are considered a part of the claimed invention.
While the preferred embodiment to the invention has been described, it will be understood that those skilled in the art, both now and in the future, may make various improvements and enhancements which fall within the scope of the claims which follow. These claims should be construed to maintain the proper protection for the invention first described.
Technical effects and benefits of exemplary embodiments include a structured means of developing symbol correcting codes (e.g., a S2EC/D2ED code) with nested Hamming distance n codes (e.g., a SEC/DED code) that reuse all or part of the Hamming distance n code checkbits as part of the symbol correcting code checkbits. The ability to reuse the logic and circuitry may result in a significant savings in both logic and delay. In addition, the ability to allow both the new S2EC/D2ED checkbits and the original SEC/DED checkbits to be sent on the second transfer may result in greatly improved bus timing by allowing additional time to perform logic operations on the checkbits.
As described above, the embodiments of the invention may be embodied in the form of computer-implemented processes and apparatuses for practicing those processes. Embodiments of the invention may also be embodied in the form of computer program code containing instructions embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other computer-readable storage medium, wherein, when the computer program code is loaded into and executed by a computer, the computer becomes an apparatus for practicing the invention. The present invention can also be embodied in the form of computer program code, for example, whether stored in a storage medium, loaded into and/or executed by a computer, or transmitted over some transmission medium, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the computer program code is loaded into and executed by a computer, the computer becomes an apparatus for practicing the invention. When implemented on a general-purpose microprocessor, the computer program code segments configure the microprocessor to create specific logic circuits.
While the invention has been described with reference to exemplary embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the essential scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed as the best mode contemplated for carrying out this invention, but that the invention will include all embodiments falling within the scope of the appended claims. Moreover, the use of the terms first, second, etc. does not denote any order or importance; rather, the terms first, second, etc. are used to distinguish one element from another.