1. Field of Invention
The invention relates to a method and controller for processing data multiplication in a RAID system and, in particular, to a method and controller for processing simultaneously a large amount of data multiplication operations in a RAID system.
2. Related Art
The redundant array of independent disks (RAID) is a disk subsystem designed to enhance access efficiency, to provide better fault tolerance, or both. RAID uses a disk striping technique to enhance access efficiency: data are stored separately, by bytes or groups of bytes, on many different disk drives, so that read/write I/O requests can be performed in parallel on many disk drives. Fault tolerance, on the other hand, is provided by a mirroring technique or by disk striping with distributed parity data.
The ability of fault tolerance is related to the number of parity data sets stored in the RAID system. Taking RAID5 as an example, it is designed to store an extra set of parity data in addition to the user data. The parity data is usually called the P value, or sometimes the XOR parity because it is the calculation result of XOR operations on the corresponding user data. The formula is:
$$P = D_0 + D_1 + D_2 + \cdots + D_{n-1} \tag{1}$$
where + represents the XOR operation, P represents the parity data series, D0, D1, D2, . . . , Dn-1 represent the user data series, and n denotes the number of user data disks. As RAID5 stores only one parity data set, it can tolerate errors (e.g. damage or failure) on only one user data disk at a time. The data on the failed user data disk is recovered from the corresponding P value and the corresponding data on the other, healthy user data disks by means of the same XOR operations. For example, if D1 has an error, then D1 can be recovered as follows:
$$D_1 = D_0 + D_2 + \cdots + D_{n-1} + P$$
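The parity relation of Eq. (1) and the single-disk recovery above can be sketched in a few lines of Python. This is only an illustration: the helper name `xor_blocks` and the two-byte sample blocks are assumptions made for the example, not part of the original disclosure.

```python
from functools import reduce

def xor_blocks(*blocks):
    """Bytewise XOR of equal-length byte strings (the '+' of Eq. (1))."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical user data on three data disks (two bytes each).
d0 = bytes([0x25, 0x2A])
d1 = bytes([0x52, 0x6A])
d2 = bytes([0x80, 0x46])

# Parity P = D0 + D1 + D2.
p = xor_blocks(d0, d1, d2)

# If D1 is lost, the same XOR over the survivors and P restores it.
recovered_d1 = xor_blocks(d0, d2, p)
```

Because XOR is its own inverse, the recovery uses exactly the same operation as the parity computation.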
Where fault tolerance for more than one failed user data disk is required, some systems are designed to store multiple parities. Reed-Solomon codes are usually adopted to build this type of RAID system, which tolerates errors on more than one disk drive. RAID6 belongs to this category. It has at least two parities, so that it tolerates two or more disk drives having errors at the same time.
Take the RAID6 system with two parities as an example. The two parities are conventionally called P and Q. The formula for computing P is the same as the one in the RAID5 system. The value of Q is obtained using the following formula.
$$Q = g^0 \cdot D_0 + g^1 \cdot D_1 + g^2 \cdot D_2 + \cdots + g^{n-1} \cdot D_{n-1} \tag{2}$$
If two data disks Dx, Dy are damaged, then a careful derivation gives:
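$$D_x = A \cdot (P + P_{xy}) + B \cdot (Q + Q_{xy}) \tag{3}$$
$$D_y = (P + P_{xy}) + D_x \tag{4}$$
$$A = g^{y-x} \cdot (g^{y-x} + 1)^{-1} \tag{5}$$
$$B = g^{-x} \cdot (g^{y-x} + 1)^{-1} \tag{6}$$
$$P_{xy} + D_x + D_y = P \tag{7}$$
$$Q_{xy} + g^x \cdot D_x + g^y \cdot D_y = Q \tag{8}$$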
Aside from the exponent “y−x”, which is an ordinary subtraction, the algebraic operations in Eqs. (2) to (8) all follow the rules of the Galois Field. Moreover, g is a generator of the Galois Field. It is usually chosen as g=2.
The addition operation in the Galois Field is in fact the XOR operation. Its multiplication operation depends on the field GF(2^a). For the definitions, properties, and operational rules, please refer to (1) “The mathematics of RAID6”, H. Peter Anvin, December, 2004; and (2) “A Tutorial on Reed-Solomon Coding for Fault-Tolerance in RAID-like Systems”, James S. Plank, Software-Practice & Experience, 27(9), pp 995-1012, September, 1997. Eqs. (1) to (8) given above can be found in Ref. (1).
Since the Galois Field is closed and there always exists an r for an arbitrary nonzero number X satisfying X=2^r, table lookup is the typical prior-art method for the multiplication operations in the Galois Field (see Ref. (2)). Take GF(2^a) as an example. To find the product of any two numbers X and Y, the procedure is as follows:
It is seen from Eqs. (2) and (3) that a large number of Galois Field multiplication operations are required for computing Q or recovering the damaged D. In particular, they involve the multiplication of one constant with many different numbers. With the conventional table lookup method, the system has to compute byte by byte, and each Galois Field multiplication requires 3 table lookups, 1 addition (or subtraction), 1 test and 1 modulo operation. Considering that the sizes of current storage media are frequently tens or hundreds of gigabytes, such calculations are very inefficient and easily become the bottleneck of the system. Therefore, how to improve, simplify, and/or speed up the data multiplication operations in the RAID system is an important issue to be solved in the industry.
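The conventional table lookup multiplication can be sketched as follows. This is a minimal sketch, not the exact prior-art listing: it assumes GF(2^8) with the field polynomial x^8+x^4+x^3+x^2+1 (0x11D) used in Ref. (1), and the names `EXP`, `LOG`, and `gf_mul_table` are illustrative.

```python
POLY = 0x11D  # assumed field polynomial x^8 + x^4 + x^3 + x^2 + 1

EXP = [0] * 255   # EXP[r] = 2^r in GF(2^8)  (inverse log table)
LOG = [0] * 256   # LOG[X] = r such that 2^r = X  (LOG[0] is unused)

value = 1
for r in range(255):
    EXP[r] = value
    LOG[value] = r
    value <<= 1                # multiply by the generator 2
    if value & 0x100:          # reduce when the result overflows 8 bits
        value ^= POLY

def gf_mul_table(x, y):
    """Product of two bytes: 1 test, 2 log lookups, 1 add, 1 modulo, 1 inverse-log lookup."""
    if x == 0 or y == 0:       # zero has no logarithm
        return 0
    return EXP[(LOG[x] + LOG[y]) % 255]
```

Every byte of the data stream goes through this sequence separately, which is the per-byte cost the paragraph above refers to.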
It is an objective of the invention to provide an effective algorithm and a controller implementing the algorithm, so that a huge amount of data multiplication operations can be performed simultaneously, thereby improving the efficiency of a RAID system.
In accordance with one feature of the invention, a method for data multiplication operations in a RAID system is provided. The method includes the steps of: generating at least one map table corresponding to at least one value in a field; selecting a length for an XOR operation unit and forming a multiplication unit using a plurality of the XOR operation units; and for the data stored in one disk drive of the RAID, performing at least one XOR operation using the XOR operation unit as one unit while computing on the multiplication unit according to a map table of the at least one map table, and further performing a plurality of the XOR operations to obtain the multiplication result.
In accordance with another feature of the invention, a controller for processing data multiplication operations in the RAID system is provided. The controller includes: a memory for temporarily storing target data provided by a data source; and a central processing circuit that generates at least one map table corresponding to at least one value in a field, performs at least one XOR operation for the target data stored in the memory using the XOR operation unit as one unit while computing on the multiplication unit according to a map table of the at least one map table, and further performs a plurality of the XOR operations to obtain the multiplication result.
In accordance with another feature of the invention, a method for processing data multiplication operations in a RAID system is provided, which is used to compute the product of a number K with a data series X. The method includes the steps of: generating a map table for the number K; selecting a length of an XOR operation unit, and forming a multiplication unit using a plurality of the XOR operation units; dividing the data series X into at least one multiplication unit; for the multiplication unit and the map table associated with the number K, performing at least one XOR operation using the XOR operation unit as one unit according to the rules in the map table; and performing the multiplication operation in the previous step on all the multiplication units in the data series X. The multiplication result of the number K with the data series X is obtained once all the multiplication operations are done.
These and other features, aspects and advantages of the invention will become apparent by reference to the following description and accompanying drawings which are given by way of illustration only, and thus are not limitative of the invention, and wherein:
The present invention will be apparent from the following detailed description, which proceeds with reference to the accompanying drawings, wherein the same references relate to the same elements.
A feature of the invention lies in the appropriate definition of operational rules for the data in the RAID system in order to speed up the data operations. In practice, the operational rules for the RAID data are commonly taken to be the algebraic rules of the Galois Field. Therefore, the algebraic rules of the Galois Field are used in the following embodiments. One embodiment of the invention is based on the Galois Field GF(2^a) and its related algebraic rules. As a=8 is currently the preferred choice in practice, most of the embodiments in this specification assume the field to be GF(2^8). That is, the covered numbers are between 0 and 255. This is because 2^8 is exactly the number of values represented by one byte, which is a basic unit of computer memory. The RAID system accordingly established can accommodate up to 255 user data disks, which is sufficient for normal RAID systems. Although the embodiments in this specification assume GF(2^8), the invention can be applied to other cases. In other embodiments of the invention, the disclosed technique may be applied in a Galois Field different from GF(2^8). Moreover, the invention can also use the operations in other fields or number systems, as long as appropriate operational rules are found in those fields or number systems.
Most of the embodiments described below take a RAID6 system with two parities as the example. However, the invention can be applied to more general cases. Other RAID 6 systems with more than two parities can be implemented with the disclosed method as well. The conventional formulas quoted in the specification are listed as follows.
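$$P = D_0 + D_1 + D_2 + \cdots + D_{n-1} \tag{1}$$
$$Q = g^0 \cdot D_0 + g^1 \cdot D_1 + g^2 \cdot D_2 + \cdots + g^{n-1} \cdot D_{n-1} \tag{2}$$
$$D_x = A \cdot (P + P_{xy}) + B \cdot (Q + Q_{xy}) \tag{3}$$
$$D_y = (P + P_{xy}) + D_x \tag{4}$$
$$A = g^{y-x} \cdot (g^{y-x} + 1)^{-1} \tag{5}$$
$$B = g^{-x} \cdot (g^{y-x} + 1)^{-1} \tag{6}$$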
where P and Q are the two parities in the RAID6 system; x and y are the serial numbers of the two data disks with errors; Dx and Dy are the user data corresponding to the two data disks x and y; A and B are constants related only to x and y; and Pxy and Qxy are the values of P and Q when Dx and Dy are both 0, i.e.,
$$P_{xy} + D_x + D_y = P \tag{7}$$
$$Q_{xy} + g^x \cdot D_x + g^y \cdot D_y = Q \tag{8}$$
Aside from the exponent “y−x”, which is an ordinary subtraction, the algebraic operations in Eqs. (1) to (8) all follow the rules of the Galois Field. Moreover, g is a generator of the Galois Field. It is usually chosen as g=2.
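Eqs. (3) to (8) can be exercised byte by byte with a short Python sketch. It assumes GF(2^8) with the field polynomial 0x11D used in Ref. (1) and g=2; the function names (`gf_mul`, `gf_pow`, `recovery_coefficients`, `recover_two`) and the sample bytes are hypothetical and chosen only for illustration.

```python
def gf_mul(a, b, poly=0x11D):
    """Shift-and-XOR product in GF(2^8); 0x11D is an assumed field polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result

def gf_pow(a, n):
    """a^n in GF(2^8); exponents are taken modulo 255, so negative n works."""
    result = 1
    for _ in range(n % 255):
        result = gf_mul(result, a)
    return result

def recovery_coefficients(x, y, g=2):
    """A and B of Eqs. (5) and (6); y - x is an ordinary integer subtraction."""
    gyx = gf_pow(g, y - x)
    inverse = gf_pow(gyx ^ 1, 254)          # (g^(y-x) + 1)^-1
    return gf_mul(gyx, inverse), gf_mul(gf_pow(g, -x), inverse)

def recover_two(x, y, p, q, survivors, g=2):
    """Recover bytes Dx, Dy from P, Q and {disk index: byte} of surviving disks."""
    pxy = qxy = 0
    for index, byte in survivors.items():   # P and Q with Dx = Dy = 0
        pxy ^= byte
        qxy ^= gf_mul(gf_pow(g, index), byte)
    a, b = recovery_coefficients(x, y, g)
    dx = gf_mul(a, p ^ pxy) ^ gf_mul(b, q ^ qxy)   # Eq. (3)
    dy = (p ^ pxy) ^ dx                            # Eq. (4)
    return dx, dy
```

For x=0 and y=2 the sketch yields A=166 and B=167, the constants that appear in the recovery example later in this specification.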
The map table is a key ingredient of the invention. It is defined as follows.
Suppose Y, X, and K are numbers in GF(2^a). That is, Y, X, and K are all composed of “a” bits. If yi and xi represent the i-th bits of Y and X, respectively, then the vectors Y and X can be represented by:
Let Y=K·X; that is, Y is the multiplication result of K with an arbitrary number X in the Galois Field. Here K is a given constant. Then the map table of K is defined as an a×a matrix MK, whose elements mi,j (0<=i, j<=a−1) are 0 or 1 and satisfy:
In other words,
The addition in the above operations is defined as the XOR operation. Since the elements in the matrix MK are either 0 or 1, the computation of yi can be regarded as follows: the data units xj corresponding to mi,j=1 in the i-th row of the matrix MK are selected and XORed together.
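The definition can be made concrete with a small Python sketch. It relies on one fact that follows from the linearity of Y=K·X: column j of MK holds the bits of the product K·2^j. The field polynomial 0x11D, the storage of each row as a bitmask, and the function names are assumptions of this sketch.

```python
def gf_mul(a, b, poly=0x11D):
    """Shift-and-XOR product in GF(2^8); 0x11D is an assumed field polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result

def map_table(k):
    """Rows of MK as bitmasks: bit j of rows[i] is the element m(i,j)."""
    rows = [0] * 8
    for j in range(8):
        column = gf_mul(k, 1 << j)          # image of the unit vector with only xj = 1
        for i in range(8):
            if (column >> i) & 1:
                rows[i] |= 1 << j
    return rows

def apply_map_table(rows, x):
    """Compute Y = K*X bit by bit: yi is the XOR of the xj selected by row i."""
    y = 0
    for i, row in enumerate(rows):
        selected = row & x                  # bits xj with m(i,j) = 1
        y |= (bin(selected).count("1") & 1) << i
    return y
```

Here the XOR operation unit is a single bit, which corresponds to the conventional use of the map table.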
The way of generating the map table is closely related to the algebraic rules of the Galois Field. Take GF(2^8) as an example. Suppose the product of an arbitrary number X and 2 is X′; then X′ can be obtained from the following formula (“+” represents an XOR operation):
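As a hedged reconstruction, and assuming the field polynomial x^8+x^4+x^3+x^2+1 used in Ref. (1), multiplication by 2 shifts every bit up by one position and folds the overflowing bit x7 back into bits 0, 2, 3, and 4. This choice of polynomial is an assumption; it is consistent with the rows of the map table for 20 used in the worked example of this specification.

$$
\begin{aligned}
x'_0 &= x_7, & x'_1 &= x_0, & x'_2 &= x_1 + x_7, & x'_3 &= x_2 + x_7,\\
x'_4 &= x_3 + x_7, & x'_5 &= x_4, & x'_6 &= x_5, & x'_7 &= x_6
\end{aligned}
$$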
Suppose the map table of K is a given matrix MK and the map table of K′=2·K is the matrix MK′. Based on the above formula, one can derive the algorithmic rule A for generating MK′ from MK, shown in Table 1:
One algebraic feature of the Galois Field is as follows. Start from K=1 and multiply K each time by 2. The derived new K values do not repeat until they have covered all the nonzero numbers in the Galois Field. Take GF(2^8) as an example. Start from K=1 and record it. Multiply K by 2 each time. After 255 recordings, the derived K values will cover all the GF(2^8) numbers (except for 0).
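The generation of all map tables by repeated doubling can be sketched as below. This is not a reproduction of Table 1: the row rule in `double_map_table` is derived here under the assumption that the field polynomial is 0x11D, and the names and the bitmask row format (bit j of row i is mi,j) are illustrative.

```python
def double_map_table(rows):
    """Rows of the map table for 2*K, derived from the rows of the map table for K."""
    r = rows
    # Y' = 2*Y: each bit moves up one place, and y7 is folded into bits 0, 2, 3, 4.
    return [r[7], r[0], r[1] ^ r[7], r[2] ^ r[7], r[3] ^ r[7], r[4], r[5], r[6]]

def all_map_tables():
    """Map tables for all 255 nonzero K, starting from K=1 and doubling each time."""
    tables = {}
    k = 1
    rows = [1 << i for i in range(8)]   # the map table of 1 is the identity matrix
    for _ in range(255):
        tables[k] = rows
        rows = double_map_table(rows)
        k <<= 1                          # K' = 2*K in GF(2^8)
        if k & 0x100:
            k ^= 0x11D                   # assumed field polynomial
    return tables

tables = all_map_tables()
```

Each new table costs only three row XORs and a row permutation, so building all 255 tables at start-up is inexpensive.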
According to the above-mentioned algebraic properties of the Galois Field and the algorithmic rule A, all map tables corresponding to different K values, i.e., all the matrices MK, can be generated. Please refer to
With reference to
A few map tables in GF(2^8) are listed below for reference.
One advantage of using the map tables for the multiplication operations in the Galois Field is to avoid the operations of shifting digits or looking up the log table/inverse log table. All it needs is the XOR operations.
Take GF(2^8) as an example. Suppose Y is the product of a constant 20 and an arbitrary number X, i.e., Y=20·X, and the map table associated with 20 (the matrix M20) is given as:
According to the definition,
For example, if X=83, then Y=8, as given below:
If the value of Y is computed using the conventional technique by looking up the log table/inverse log table, then
$$Y = 20 \cdot 83 = 2^{52} \cdot 2^{206} = 2^{52+206} = 2^{258} = 2^{258-255} = 2^{3} = 8$$
which is the same as the result computed by the disclosed technique of the invention.
The disclosed algorithm of the invention allows the operations of a huge amount of Galois Field multiplication to proceed at the same time, particularly the multiplication operations of a constant with a lot of different numbers. Therefore, it speeds up the operations in a RAID system.
Please refer to
How to generate the map tables (step 200) is already described above.
The technique of how to enlarge the XOR operation unit to w bits (step 300) is described as follows.
According to the definition of the map table, i.e. Eq. (9), yi and xi denote the i-th bit of Y and X, respectively, where Y and X are numbers in GF(2^a). It implies that when the map tables are used for conventional operations, the XOR operation unit is 1 bit and the multiplication unit is a number in GF(2^a). The disclosed method of the invention enlarges the XOR operation unit to w bits, and thus the multiplication unit is enlarged to w·a bits. Take GF(2^8) as an example. If w=32, then the XOR operation unit has 32 bits according to the invention, and the multiplication unit has 32·8=256 bits=32 bytes, namely, a set of 32 numbers in GF(2^8).
One of the chief considerations of selecting the length of the XOR operation unit to be w is the system hardware environment. For example, the consideration could be the operation unit of the CPU or dedicated XOR operation unit or the width of the data bus. If the operation unit of the CPU or dedicated XOR operation unit is 32 bits, then w=32 is an appropriate choice. If the operation unit of the CPU or dedicated XOR operation unit is 64 bits, then setting w=64 is suitable. Of course, it does not imply that the choice of w is necessarily limited to be the same as the length of the operation unit of the CPU or dedicated XOR operation unit. Different w values may be used in other embodiments of the invention.
Another factor influencing the value of w is that the basic storage unit (a sector) of the disks should preferably be an integer multiple of the multiplication unit. Take GF(2^8) as an example. If w=20, then the multiplication unit has 20·8=160 bits=20 bytes. The basic storage unit, i.e. a sector, of the disks usually has 512 bytes. Since 512 is not an integer multiple of 20, additional operations have to be performed when the multiplication unit is incomplete.
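The divisibility consideration is simple arithmetic, sketched here for a few candidate values of w. The helper names are hypothetical, and the 512-byte sector is the figure quoted above.

```python
SECTOR_BYTES = 512   # usual sector size quoted in the text

def unit_bytes(w, a=8):
    """Size in bytes of one multiplication unit: w bits per XOR unit times a units."""
    return w * a // 8

def leftover_bytes(w, a=8):
    """Bytes of a sector that do not fill a complete multiplication unit."""
    return SECTOR_BYTES % unit_bytes(w, a)

for w in (20, 32, 64):
    print(w, unit_bytes(w), leftover_bytes(w))
```

A nonzero leftover, as for w=20, means the last multiplication unit of each sector is incomplete and needs the additional handling mentioned above.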
After determining the value of w, the system can perform online multiplication in the Galois Field according to the map tables (step 400). The multiplication may be performed for computing a parity or lost user data. The operation rule still follows Eq. (9). However, the XOR operation unit is enlarged to an appropriate w bits, and the multiplication unit is enlarged to w·a bits. That is, both yi and xi have w bits, and both Y and X have w·a bits.
In the following, an embodiment is used to explain the disclosed method. Suppose the product Y=20·X in GF(2^8) is to be calculated, where X is a 32-byte data sector. In the hexadecimal number system, X is represented as follows:
where the 0-th byte of X is denoted by B0, the first byte by B1, the second byte by B2, and so on, until B31.
According to the disclosed method, the RAID system computes and stores all the map tables when it starts. Therefore, the map table associated with 20 is already given. Suppose the system CPU is 32-bit. Therefore, w is set to be 32. In this case, Y and X are considered to be data series composed of 8 units, given as:
The map table of the constant 20 is already given in Eq. (11). Using Eqs. (9) and (12), one obtains (in the hexadecimal system):
y0 = x4 + x6 = (a5 42 78 03) + (01 92 47 86) = (a4 d0 3f 85)
y1 = x5 + x7 = (77 25 19 64) + (22 55 9a 76) = (55 70 83 12)
y2 = x0 + x4 = (25 2a 1b 33) + (a5 42 78 03) = (80 68 63 30)
y3 = x1 + x4 + x5 + x6 = (52 6a 11 90) + (a5 42 78 03) + (77 25 19 64) + (01 92 47 86) = (81 9f 37 71)
y4 = x0 + x2 + x4 + x5 + x7 = (25 2a 1b 33) + (80 46 7c ab) + (a5 42 78 03) + (77 25 19 64) + (22 55 9a 76) = (55 5e 9c 89)
y5 = x1 + x3 + x5 + x6 = (52 6a 11 90) + (6e 21 5b 44) + (77 25 19 64) + (01 92 47 86) = (4a fc 14 36)
y6 = x2 + x4 + x6 + x7 = (80 46 7c ab) + (a5 42 78 03) + (01 92 47 86) + (22 55 9a 76) = (06 c3 d9 58)
y7 = x3 + x5 + x7 = (6e 21 5b 44) + (77 25 19 64) + (22 55 9a 76) = (3b 51 d8 56)
Therefore, Y=|a4 d0 3f 85|55 70 83 12|80 68 63 30|81 9f 37 71∥55 5e 9c 89|4a fc 14 36|06 c3 d9 58|3b 51 d8 56|
The above example assumes that the length of X is 32 bytes. If the length of X is greater than 32 bytes, then X is divided into groups of 32 bytes each. Each group of 32 bytes forms a multiplication unit. Therefore, the product Y can be obtained by repeating the above operations on every group.
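The worked example can be reproduced with a few lines of Python. In this sketch the map table of 20 is written as the list of selected indices j per row, and the eight 32-bit units x0 to x7 are the operand values appearing in the computation above; the names `M20_ROWS` and `multiply_unit` are illustrative.

```python
# Row i lists the indices j with m(i,j) = 1 in the map table of 20.
M20_ROWS = [
    (4, 6),
    (5, 7),
    (0, 4),
    (1, 4, 5, 6),
    (0, 2, 4, 5, 7),
    (1, 3, 5, 6),
    (2, 4, 6, 7),
    (3, 5, 7),
]

def multiply_unit(rows, x_units):
    """Multiply one multiplication unit by a constant using only w-bit XORs."""
    y_units = []
    for selected in rows:
        acc = 0
        for j in selected:
            acc ^= x_units[j]       # one 32-bit XOR handles 32 bit slices at once
        y_units.append(acc)
    return y_units

# x0 .. x7, each a 32-bit XOR operation unit (w = 32).
X = [0x252A1B33, 0x526A1190, 0x80467CAB, 0x6E215B44,
     0xA5427803, 0x77251964, 0x01924786, 0x22559A76]

Y = multiply_unit(M20_ROWS, X)
print(" ".join(f"{y:08x}" for y in Y))
```

The whole 32-byte unit is multiplied by 20 with 19 word-wide XORs and no table lookups on the data path.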
Using the disclosed algorithm of the invention on the RAID system, the obtained parity is different from that obtained in the prior art. However, its effect and the way of application are completely the same as the prior art.
For example, suppose D0, D1, and D2 are three disk drives for storing user data, which are 32-byte data series, shown as follows (expressed in the hexadecimal system):
where B0 denotes the 0-th byte, B1 the first byte, and so on, until B31. The RAID6 system comprising the three user data disk drives requires two additional parity disk drives for storing parities P and Q. According to Eqs. (1) and (2), one obtains:
$$P = D_0 + D_1 + D_2$$
$$Q = 2^0 \cdot D_0 + 2^1 \cdot D_1 + 2^2 \cdot D_2$$
In the prior art, the values of P and Q in GF(2^8) are computed as follows:
Using the disclosed method of the invention, the P value is unchanged. The value of Q is as follows (assuming w=32):
Suppose the data in D0 and D2 are damaged. They can be recovered by using D1, P, and Q. Using Eqs. (3), (4), (5), (6), (7), and (8), one obtains:
x=0, y=2, A=166, B=167;
$$D_0 = 166 \cdot P + 167 \cdot Q + 245 \cdot D_1$$
$$D_2 = P + D_1 + D_0 \tag{13}$$
1. In the prior art, each byte is computed one by one to solve:
166·P=|fe 9a d2 9a|2d 30 9a ae|05 be be 55|51 34 78 c3∥75 5c bb 70|31 cf d6 8b|82 96 c2 28|5c 97 54 a3|
167·Q=|31 57 0f 63|75 a1 13 5d|15 99 82 01|ad 44 89 f9∥f4 c4 49 33|74 57 03 71|75 e1 16 a8|ca 2c 2d 10|
245·D1=|e5 db cd cf|08 85 91 95|4c 26 3a 46|c9 0e b7 30∥9b a1 9d 54|1c ed 9d a7|dd 70 83 b9|99 8b 58 83|
Therefore,
D0=|2a 16 10 36|50 14 18 66|5c 01 06 12|35 7e 46 0a∥1a 39 6f 17|59 75 48 5d|2a 07 57 39|0f 30 21 30|
D2 is then obtained by substituting D0 in Eq. (13).
2. According to the disclosed method of the invention, one obtains:
166·P=|52 4b 78 13|0f 5b 57 7b|4f 32 68 32|33 77 4a 18∥51 3d 58 5d|3e 15 7b 59|7e 76 04 31|44 20 16 39|
167·Q=|08 72 67 3c|36 58 65 22|06 42 1c 57|09 62 56 19∥17 49 00 70|63 52 59 40|42 56 64 36|60 7f 4c 5c|
245·D1=|70 2f 0f 19|69 17 2a 3f|15 71 72 77|0f 6b 5a 0b∥5c 4d 37 3a|04 32 6a 44|16 27 37 3e|2b 6f 7b 55|
Therefore,
D0=|2a 16 10 36|50 14 18 66|5c 01 06 12|35 7e 46 0a∥1a 39 6f 17|59 75 48 5d|2a 07 57 39|0f 30 21 30|
Likewise, D2 is obtained by substituting D0 in Eq. (13).
The above example reveals that although the value of Q obtained by the techniques disclosed in the invention differs from the one computed by the prior art, the functions of protecting and recovering the user data are identical to those in the prior art.
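The round trip of this example, computing P and Q with wide XOR units and then recovering D0 and D2, can be sketched as follows. D0 is the recovered value printed above; the contents of D1 and D2 are hypothetical stand-ins, the field polynomial 0x11D is an assumption, and all function names are illustrative.

```python
from functools import reduce
from operator import xor

def gf_mul(a, b, poly=0x11D):
    """Shift-and-XOR product in GF(2^8); 0x11D is an assumed field polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result

def mul_unit(k, x_units):
    """Multiply an 8-word multiplication unit by constant k using wide XORs only."""
    y_units = []
    for i in range(8):
        acc = 0
        for j in range(8):
            if (gf_mul(k, 1 << j) >> i) & 1:   # map-table element m(i,j) of k
                acc ^= x_units[j]
        y_units.append(acc)
    return y_units

def xor_units(*units):
    """Word-wise XOR of several multiplication units."""
    return [reduce(xor, column) for column in zip(*units)]

D0 = [0x2A161036, 0x50141866, 0x5C010612, 0x357E460A,
      0x1A396F17, 0x5975485D, 0x2A075739, 0x0F302130]        # D0 as printed in the text
D1 = [(0x01234567 * (i + 1)) & 0xFFFFFFFF for i in range(8)]  # hypothetical data
D2 = [0x89ABCDEF ^ (i << 8) for i in range(8)]                # hypothetical data

P = xor_units(D0, D1, D2)
Q = xor_units(mul_unit(1, D0), mul_unit(2, D1), mul_unit(4, D2))

# Disks 0 and 2 fail: x=0, y=2, A=166, B=167, and A + 2*B = 245.
recovered_D0 = xor_units(mul_unit(166, P), mul_unit(167, Q), mul_unit(245, D1))
recovered_D2 = xor_units(P, D1, recovered_D0)                 # Eq. (13)
```

The recovery succeeds because the same map tables are used for both encoding and decoding, so the linear relations of Eqs. (3) to (8) hold for every bit slice of the unit.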
Take GF(2^8) as an example. Suppose Y and X are composed of 8 XOR operation units, each of which has a length of w bits. Here w is an appropriate number, such as 32 in the previous example. Y and X are represented in a vector format as follows,
where yi and xi are w-bit numbers and 0≦i≦7.
Let Y=K·X, where K is a constant whose map table is the matrix MK. Then
Let yi,j and xi,j denote the j-th bits of yi and xi, respectively, where 0≦i≦7 and 0≦j≦w−1. Since both yi and xi have w bits, the above equations can be unfolded as follows:
Analyzing Eqs. (14) to (21), one finds:
That is, (y0,0 y1,0 . . . y7,0) and (x0,0 x1,0 . . . x7,0) satisfy the definition of the map table in Eq. (9). Therefore, the former is the product of K and the latter in the Galois Field. Likewise, for every j satisfying 0≦j≦w−1, the number (y0,j y1,j . . . y7,j) is the product of K and (x0,j x1,j . . . x7,j).
The above-mentioned analysis provides a mathematical meaning for the disclosed technique. Please refer to
From the viewpoint of the “equivalent method” mentioned above, the disclosed method of the invention still follows the algebraic principles of the Galois Field. The difference from the prior art is the way of data sampling. That is, the disclosed method of the invention can be regarded as being equivalent to sampling one bit every w bits in the data sector until a GF(2^a) number is obtained.
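The equivalence can be checked numerically on the earlier 32-byte example. The sketch below gathers bit j of each of the eight units into one GF(2^8) number and compares the ordinary product with the gathered bits of the wide-XOR result. The helper `gather`, the field polynomial 0x11D, and the convention that unit i supplies bit i of the gathered number are assumptions of this sketch.

```python
def gf_mul(a, b, poly=0x11D):
    """Shift-and-XOR product in GF(2^8); 0x11D is an assumed field polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result

def gather(units, j):
    """Sample bit j of every unit: unit i contributes bit i of the GF(2^8) number."""
    return sum(((unit >> j) & 1) << i for i, unit in enumerate(units))

M20_ROWS = [(4, 6), (5, 7), (0, 4), (1, 4, 5, 6),
            (0, 2, 4, 5, 7), (1, 3, 5, 6), (2, 4, 6, 7), (3, 5, 7)]

X = [0x252A1B33, 0x526A1190, 0x80467CAB, 0x6E215B44,
     0xA5427803, 0x77251964, 0x01924786, 0x22559A76]

# Wide-XOR product of the whole unit with the constant 20.
Y = []
for selected in M20_ROWS:
    acc = 0
    for index in selected:
        acc ^= X[index]
    Y.append(acc)

# For every bit position j, the sampled numbers obey Y_j = 20 * X_j in GF(2^8).
equivalent = all(gf_mul(20, gather(X, j)) == gather(Y, j) for j in range(32))
```

With w=32, a single pass of wide XORs therefore carries out 32 independent Galois Field multiplications, one per bit position.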
For example, as shown in
In contrast, the prior art samples data in sequence, as shown in
From the equivalent point of view mentioned above, although the invention has different sampling method from the prior art, this does not affect the correctness of Eqs. (2) to (8). Therefore, the functions of data protecting and recovering remain the same.
The above-mentioned “equivalent method” is only for explaining the essence of the invention. In practice, the disclosed method of the invention can simultaneously compute several Galois Field products, thereby increasing the operation speed of the RAID system. This advantage originates from the special data sampling method in the “equivalent method”.
In an embodiment of the invention, the disclosed method is applied to a redundant array of independent disks (RAID) subsystem. With reference to
In this embodiment, the SVC 100 comprises a host-side I/O device interconnect controller 120, a central processing circuit (CPC) 140, a memory 180, and a device-side I/O device interconnect controller 500. Although these components are described using individual functional blocks, in practice some or even all the functional blocks can be integrated on a single chip.
The host-side I/O device interconnect controller 120 is connected to the host 10 and the CPC 140 to be the interface and buffer between the SVC 100 and the host 10. It receives the I/O requests and the related data transmitted from the host 10 and converts and/or maps them to the CPC 140.
The memory 180 is connected to the CPC 140 to be a buffer. It buffers the data transmitted between the host 10 and the PSD array 600 that pass through the CPC 140.
The device-side I/O device interconnect controller 500 is disposed between the CPC 140 and the PSD array 600 to be the interface and buffer between the SVC 100 and the PSD array 600. The device-side I/O device interconnect controller 500 receives the I/O requests and the related data transmitted from the CPC 140 and maps and/or transmits them to the PSD array 600.
The CPC 140 includes a CPU chipset 144 that has a parity engine 160, a central processing unit (CPU) 142, a read only memory (ROM) 146, and a non-volatile random access memory (NVRAM) 148. The CPU 142 can be, for example, a Power PC CPU. The ROM 146 can be a flash memory for storing the basic input/output system (BIOS) and/or other programs. The CPU 142 is coupled via the CPU chipset 144 to other electronic devices (e.g., the memory 180). The NVRAM 148 is used to store information related to the status of I/O operations on the PSD array 600, so that the information can be used as a check when the power is unexpectedly shut down before the I/O operations are finished. The ROM 146, the NVRAM 148, an LCD module 550, and an enclosure management service (EMS) circuit 560 are coupled to the CPU chipset 144 via an X-bus. Moreover, the NVRAM 148 is optional; namely, it may be omitted in other embodiments. Although the CPU chipset 144 is described as a functional block integrated with the parity engine 160, they can be disposed separately on different chips in practice.
In an embodiment of the invention, the target data processed in the multiplication operations may come from the PSD array 600 or the host 10. The multiplication result may be stored in the memory 180, the disk drives of the PSD array 600, or the buffer built in the parity engine 160 or the CPU 142 (not shown in the drawing). The algorithm of the invention is implemented in program code. The program can be stored in the ROM 146 or the memory 180 for the CPC 140 to execute. In other words, the CPC 140 is responsible for generating map tables corresponding to the numbers in a field (e.g., GF(2^8)) during power on or on line. The generated map tables can be stored in the memory 180. In another embodiment of the invention, all the necessary map tables can be computed in advance and stored in the ROM 146 or the memory 180 so that the CPC 140 only needs to access the stored map tables after power on. During the online real-time operations, for a given multiplication unit, an XOR operation unit is taken as a unit to perform an XOR operation on the target data stored in the memory according to the map tables. The multiplication result is then obtained after several such XOR operations.
Although in the above embodiment, disks are taken as an example for the physical storage devices (PSDs) used in the RAID subsystem, it is noted that other kinds of physical storage devices, such as CD, DVD, etc., can be used alternatively, depending on different demands of the market.
The invention being thus described, it will be obvious that the same may be varied in many ways. Such variations are not to be regarded as a departure from the spirit and scope of the invention, and all such modifications as would be obvious to one skilled in the art are intended to be included within the scope of the following claims.
This application is a Divisional application of U.S. patent application Ser. No. 11/513,385, filed on Aug. 31, 2006, which claims the benefit of provisional Application No. 60/596,142, filed on Sep. 2, 2005, the entirety of which are incorporated by reference herein.
Relation | Number | Date | Country
---|---|---|---
Provisional | 60596142 | Sep 2005 | US
Parent | 11513385 | Aug 2006 | US
Child | 14291271 | | US