Balanced Reed-Solomon codes

Information

  • Patent Grant
  • 10425106
  • Patent Number
    10,425,106
  • Date Filed
    Tuesday, November 28, 2017
  • Date Issued
    Tuesday, September 24, 2019
Abstract
Balanced Reed-Solomon codes in accordance with embodiments of the invention enable balanced load in distributed storage systems. One embodiment includes a storage controller, wherein the storage processor is configured by the controller Reed-Solomon application to: receive a data segment; partition the data segment into a first block of data and a second block of data; transmit the first block of data to a first node controller in the plurality of node controllers; transmit the second block of data to a second node controller in the plurality of node controllers; wherein the node processor in the first node controller is configured by the node Reed-Solomon application to: receive the first block of data; encode the first block of data using a balanced and sparsest error-correcting code; and store the encoded first block of data in the node memory of the first node controller.
Description
FIELD OF THE INVENTION

The present invention generally relates to error correcting codes and more specifically relates to Reed-Solomon codes.


BACKGROUND

Digital media is being created and stored at an unprecedented rate. Robust, error-free storage can be achieved with error-correcting codes. Generally, error-correcting codes are processes of adding redundant data to a message so that the message can be recovered even when errors are introduced. These errors can be introduced in a number of ways including (but not limited to) during the transmission of the message.


In addition to correcting smaller errors in data, steps can be taken to recover large portions of data when, for example, an entire hard disk is lost. Redundant Array of Independent (or Inexpensive) Disks (RAID) is a common configuration of disk drives that can distribute data such that a data set can be rebuilt after the failure of one (or potentially more) individual disks.


SUMMARY OF THE INVENTION

Balanced Reed-Solomon codes in accordance with embodiments of the invention enable balanced load in distributed storage systems. One embodiment includes a distributed storage node controller, comprising: a network interface; a processor; a memory containing: a Reed-Solomon node application; wherein the processor is configured by the Reed-Solomon node application to: receive a block of data using the network interface; encode the block of data using a balanced and sparsest error-correcting code; and store the encoded block of data in the memory.


In a further embodiment, the block of data is a portion of a data segment divided into a plurality of equally sized blocks of data.


In another embodiment, the error-correcting code is a balanced and sparsest Reed-Solomon code.


In a still further embodiment, the balanced and sparsest Reed-Solomon code further comprises transforming the block of data by a Reed-Solomon generator matrix.


In still another embodiment, the Reed-Solomon generator matrix is calculated by transforming a set of Reed-Solomon codewords with a mask that satisfies balanced and sparsest constraints.


In a yet further embodiment, the mask is selected from rows of a matrix that comprises zeros and ones and satisfies balanced and sparsest constraints.


In yet another embodiment, every row in the matrix has the same weight.


In a further embodiment again, every column in the matrix has a weight that differs by at most one.


In another embodiment again, the Reed-Solomon generator matrix with parameters [6,4] defined over $\mathbb{F}_5$ can be evaluated by the processor using the following expression:

\[
G = \begin{bmatrix}
3 & 1 & 4 & 0 & 0 & 0 \\
0 & 0 & 0 & 3 & 1 & 4 \\
0 & 3 & 1 & 4 & 0 & 0 \\
4 & 0 & 0 & 0 & 3 & 1
\end{bmatrix},
\]

wherein G is the Reed-Solomon generator matrix, [6,4] are dimensions of the generator matrix, and $\mathbb{F}_5$ defines the dimensions of the subspace.
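As an illustrative check of the [6,4] example, the following minimal Python sketch stores the matrix as printed, takes the arithmetic to be modulo 5 as the text states, and verifies the balanced and sparsest pattern (every row of weight 3, every column of weight 2) before encoding a message. The names `P`, `G`, `row_weights`, `col_weights`, `encode`, and the sample message are hypothetical and are not part of the described embodiments.

```python
# Sketch only: arithmetic is taken modulo 5, as stated for this example.
P = 5
G = [
    [3, 1, 4, 0, 0, 0],
    [0, 0, 0, 3, 1, 4],
    [0, 3, 1, 4, 0, 0],
    [4, 0, 0, 0, 3, 1],
]

def row_weights(matrix):
    """Number of nonzero entries in each row."""
    return [sum(1 for x in row if x % P != 0) for row in matrix]

def col_weights(matrix):
    """Number of nonzero entries in each column."""
    return [sum(1 for row in matrix if row[j] % P != 0)
            for j in range(len(matrix[0]))]

def encode(message, matrix):
    """Code symbol j is the inner product of the message with column j."""
    return [sum(m * row[j] for m, row in zip(message, matrix)) % P
            for j in range(len(matrix[0]))]

message = [1, 2, 3, 4]          # hypothetical message symbols
codeword = encode(message, G)
print(row_weights(G), col_weights(G), codeword)
```

Each code symbol depends on only two of the four message symbols, which is the sense in which the per-node encoding work is balanced.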


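In a further additional embodiment, the Reed-Solomon generator matrix with parameters [15, 10] defined over $\mathbb{F}_{2^4}$ can be evaluated by the processor using the following expression:

\[
G_{15,10} = \left[\begin{array}{*{15}{c}}
\alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 \\
\alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} \\
0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} \\
0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 \\
\alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} \\
0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 & 0 \\
\alpha^{8} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1
\end{array}\right],
\]

wherein G is the Reed-Solomon generator matrix, [15, 10] are dimensions of the generator matrix, $\mathbb{F}_{2^4}$ defines the dimensions of the subspace, and α is a primitive element in $\mathbb{F}_{2^4}$.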


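In another additional embodiment, the Reed-Solomon generator matrix with parameters [14, 10] defined over $\mathbb{F}_{2^4}$ can be evaluated by the processor using the following expression:

\[
G_{14,10} = \left[\begin{array}{*{14}{c}}
\alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 \\
\alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} \\
0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 \\
0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 \\
\alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} \\
0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 \\
\alpha^{8} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10}
\end{array}\right],
\]

wherein G is the Reed-Solomon generator matrix, [14, 10] are dimensions of the generator matrix, $\mathbb{F}_{2^4}$ defines the dimensions of the subspace, and α is a primitive element in $\mathbb{F}_{2^4}$.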


In a still yet further embodiment, A distributed storage network, comprising: a communications network; a storage controller, comprising: a storage network interface; a storage processor; a storage memory containing a storage Reed-Solomon application; a plurality of node controllers, comprising: a node network interface; a node processor; a node memory containing a node Reed-Solomon application; wherein the storage processor is configured by the controller Reed-Solomon application to: receive a data segment; partition the data segment into a first block of data and a second block of data; transmit the first block of data to a first node controller in the plurality of node controllers; transmit the second block of data to a second node controller in the plurality of node controllers; wherein the node processor in the first node controller is configured by the node Reed-Solomon application to: receive the first block of data; encode the first block of data using a balanced and sparsest error-correcting code; and store the encoded first block of data in the node memory of the first node controller.


In still yet another embodiment, the first block of data and the second block of data are equal sized.


In a still further embodiment again, the error-correcting code is a balanced and sparsest Reed-Solomon code.


In still another embodiment again, the balanced and sparsest Reed-Solomon code further comprises transforming the first block of data by a Reed-Solomon generator matrix.


Another further embodiment of the method of the invention includes: the Reed-Solomon generator matrix is calculated by transforming a set of Reed-Solomon codewords with a mask that satisfies the balanced and sparsest constraints.


Still another further embodiment of the method of the invention includes: the mask is selected from rows of a generator matrix that comprises zeros and ones and satisfies further balanced and sparsest constraints.


In a further embodiment, every row in the matrix has the same weight.


In yet another embodiment, every column in the matrix has a weight that differs by at most one.


In another additional embodiment, the storage processor is further configured by the controller Reed-Solomon application to: detect an erasure of the second block of data; retrieve blocks of data including at least the first block of data from the plurality of node controllers; and reconstruct the second block of data using a Reed-Solomon generator matrix and the retrieved blocks of data.


In another embodiment again, A distributed storage method, comprising: running a storage Reed-Solomon application contained in a storage memory using a storage controller, wherein the storage controller has a storage network interface, a storage processor, and a storage memory connected to the processor; receiving a data segment using the storage controller; partitioning the data segment into a first block of data and a second block of data; transmitting the first block of data to a first node controller, wherein the first node controller has a first node network interface, a first node processor, a first node memory connected to the first node processor, and a first node Reed-Solomon application contained in the first node memory; transmitting the second block of data to a second node controller, wherein the second node controller has a second node network interface, a second node processor, a second node memory connected to the second node processor, and a second node Reed-Solomon application contained in the second node memory; receiving the first block of data using the first node controller; encoding the first block of data using a balanced and sparsest error-correcting code and the first node controller; and storing the encoded first block of data in the first node memory using the first node controller.


In a still yet further embodiment, the method further comprises: detecting an erasure of the second block of data using the storage controller; retrieving blocks of data from a plurality of node controllers including at least the first block of data from the first node controller using the storage controller; and reconstructing the second block of data using a Reed-Solomon generator matrix and the retrieved blocks of data using the storage controller.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1A is a conceptual illustration of an error-correcting code mapping a set of data to a set of parity files.



FIG. 1B is a conceptual illustration of a distributed storage network in which a set of files can be stored on a set of servers.



FIG. 1C is a diagram conceptually illustrating a distributed storage system in accordance with an embodiment of the invention.



FIG. 2A is a block diagram of a Reed-Solomon controller for use in a distributed storage system in accordance with an embodiment of the invention.



FIG. 2B is a block diagram of a Reed-Solomon node controller for use in a distributed storage system in accordance with an embodiment of the invention.



FIG. 3 is a flowchart illustrating a process to construct a generator matrix for a balanced Reed-Solomon code in accordance with an embodiment of the invention.





DETAILED DESCRIPTION

Turning now to the drawings, systems and methods for distributed data storage using sparse and balanced Reed-Solomon codes in accordance with various embodiments of the invention are illustrated. Reed-Solomon codes (RS codes) are classical non-binary cyclic error-correcting codes. As a classical example, error-correcting codes can map a set of data to a set of parity files. FIG. 1A illustrates an error-correcting code which maps a set of k data files m1, . . . , mk to a set of parity files p1, . . . , pn. The data files can be recovered from the n parity files if the number of errors is not too high. This threshold can be determined by the mapping that produces the parity files from the data. FIG. 1B illustrates a distributed storage network in which a collection of files m1, m2, and m3 can be stored on a set of servers S1, . . . , S7. Si can store a parity file pi, which can be a function of the files it is connected to. As an illustrative example, server S3 is connected to m2 and m3.


RS codes generally can detect and/or correct multiple symbol errors in a piece of data. RS codes can also be used as erasure codes, which can correct known erasures in data. Additionally, RS codes can be used to correct both errors and erasures. Efficient known encoders and decoders have been developed for RS codes. In several embodiments, RS codes can be created by multiplying input data by a generator matrix.


In distributed data storage, a computer network can store information in more than one node, often replicating the information. These networks typically use error correction techniques to reproduce the data if it becomes damaged or otherwise unavailable. Error correction techniques are generally directed toward configuring individual drives in a Redundant Array of Independent (or Inexpensive) Disks (RAID) configuration. Failures of entire drives within a RAID configuration are typically handled by replicating data across many nodes (drives) which can be inefficient. In contrast, Reed-Solomon processes in accordance with embodiments of the invention can generally allow no and/or fewer replications of data across many nodes. Additionally, when data is stored in a distributed system using cloud computing, individual data storage locations can be physically spread out (i.e., data centers in different locations). This can create additional constraints on where individual portions of data can be stored. As an illustrative example, a constraint can be that individual data centers can only access a specific portion of the data set.


A balanced RS code can evenly distribute computational load. Every code symbol in a balanced RS code can be computed in roughly the same amount of time. Computational load is balanced across the storage system which generally means no storage nodes are bottlenecks. In some embodiments, changing a single message symbol in a balanced RS code requires recomputing the least number of code symbols. A generator matrix for a balanced RS code generally has the number of nonzeros in any two columns differing by at most one. In many embodiments, a balanced RS code can divide up data to be encoded across multiple storage nodes, where each node uses the data available to it (which is generally less than the entire data set) to encode and/or decode the data that it stores. As such, processing load for encoding data can be distributed across a cluster of devices.


A sparse RS code is one in which most of the elements of its generator matrix are zero. Similarly, the sparsest RS code is one in which each row of its generator matrix has the least possible number of nonzeros. In many embodiments, sparse RS codes do not need to access the entire data set to encode an individual piece of data. The formulation of a balanced and sparsest generator matrix for RS codes in accordance with various embodiments of the invention is discussed below.


Systems and methods for performing distributed data storage using balanced and sparse RS codes to achieve balanced computational load when only portions of the data set are available in accordance with embodiments of the invention are discussed further below.


Distributed Storage System Architectures


Distributed storage systems can store data received from devices on nodes. Controllers can facilitate this storage using a variety of processes including Reed-Solomon codes. Turning now to FIG. 1C, a distributed storage system that can store data received from devices in accordance with an embodiment of the invention is shown. The system 160 includes communications network 102. The communications network can be connected to one or more centralized computers 104 and one or more mobile devices 106 using a wired and/or wireless connection. The one or more mobile devices can include (but are not limited to) a cellular telephone, a tablet device, and/or a laptop computer. Additionally, the communications network can be connected to one or more nodes 108 in a data storage system. The nodes can be any of a variety of data storage systems including (but not limited to) sectors within an individual disk drive, separate disk drives within a single physical location, separate drives with multiple physical locations, and/or a combination of storage systems. In various embodiments, data can be distributed on a cloud computing system. In some embodiments, nodes can include any of a variety of hard disk drives including (but not limited to) Parallel Advanced Technology Attachment (PATA), Serial ATA (SATA), Small Computer System Interface (SCSI), and/or Solid State Drives (SSD), but it should be readily apparent to one having ordinary skill that any memory device can be utilized as appropriate to specific requirements of embodiments of the invention.


One or more centralized computers and/or one or more mobile devices can transmit data to nodes of the data storage system through the communications network. Similarly, the one or more centralized computers and/or one or more mobile devices can retrieve previously transmitted data from nodes of the data storage system through the communications network. Although many systems are described above with reference to FIG. 1C, any of a variety of systems can be utilized to store data in distributed storage nodes in accordance with various embodiments of the invention. Reed-Solomon controllers and node controllers which can control the distribution of data to nodes in a distributed storage system in accordance with many embodiments of the invention are discussed further below.


A Reed-Solomon controller in accordance with an embodiment of the invention is shown in FIG. 2A. In many embodiments, the Reed-Solomon controller 200 can perform calculations at a control node to determine which portions of data can be distributed to one or more storage nodes within a distributed storage system. The Reed-Solomon controller can include at least one processor 202, an I/O interface 204, and memory 206. The at least one processor, when configured by software stored in memory, can perform calculations to make changes on data passing through the I/O interface as well as data stored in memory. In many embodiments, the memory 206 can include software including Reed-Solomon application 208 as well as data parameters 210, Reed-Solomon code parameters 212, and decoder parameters 214. Data parameters 210 can include (but are not limited to) any of a variety of information relating to input data including the data itself and/or a registry of nodes in the distributed storage system on which blocks of data are distributed. Reed-Solomon code parameters 212 will be discussed below and can include (but are not limited to) the generator matrix, matrix size, and/or codewords. In several embodiments, Reed-Solomon code parameters can be utilized to determine which node in a distributed storage system will store an individual block of data. Decoding data generally requires accessing blocks of data stored in multiple nodes. Decoder parameters 214 can include (but are not limited to) parameters relating to specific node locations of blocks of data and/or parameters relating to decoders themselves. In many embodiments, known decoders can be utilized including (but not limited to) Peterson-Gorenstein-Zierler, Berlekamp-Massey, Euclidean, and/or discrete Fourier transform decoders.
It should be readily apparent to one having ordinary skill that decoders can be implemented using (but are not limited to) software implementations, hardware implementations, and/or hybrid software and hardware implementations. The Reed-Solomon application 208 can (but is not limited to) control the distribution of blocks of data to one or more nodes and/or the decoding of requested blocks of data that can be retrieved from one or more nodes.


One or more nodes in the distributed storage system can (but is not limited to) encode, store and/or retrieve stored data in a manner controlled at each individual node by a node controller. A node controller in accordance with an embodiment of the invention is shown in FIG. 2B. In several embodiments, the node controller 250 can encode data blocks distributed to the individual node by a centralized Reed-Solomon controller. The node controller can include at least one processor 252, an I/O interface 254, and memory 256. The at least one processor, when configured by software stored in memory, can encode data blocks received through the I/O interface using balanced and sparsest RS codes. In many embodiments, the memory 256 includes a Reed-Solomon node application 258 as well as data parameters 260, Reed-Solomon code parameters 262, encoder parameters 264, and encoded data parameters 266. Data parameters 260 can include (but are not limited to) any of a variety of information relating to blocks of input data which are generally not the full data set. Reed-Solomon code parameters 262 will be discussed below and can include (but are not limited to) the generator matrix, matrix size, and/or codewords. Encoder parameters 264 can include (but are not limited to) parameters relating to RS code encoding. Encoded data parameters can include (but are not limited to) the encoded data itself and/or additional information a node can store about a block of encoded data. It should be readily apparent to one having ordinary skill that encoders can be implemented using (but are not limited to) software implementations, hardware implementations, and/or hybrid software and hardware implementations. The Reed-Solomon node application 258 will be discussed in greater detail below and can transform blocks of data parameters 260 using a balanced and sparsest generator matrix into encoded data parameters.


Although a number of different Reed-Solomon controllers and node controllers are described above with reference to FIGS. 2A and 2B, any of a variety of computing systems can be utilized to control the distribution of data blocks to nodes and the encoding of those data blocks within a distributed storage system as appropriate to the requirements of specific applications in accordance with various embodiments of the invention. Balanced Reed-Solomon processes in accordance with many embodiments of the invention are discussed below.


Balanced Reed-Solomon Processes


Balanced Reed-Solomon processes can construct generator matrices for use in balanced Reed-Solomon codes. Turning now to FIG. 3, a process for constructing a generator matrix satisfying balanced and sparsest constraints in accordance with an embodiment of the invention is illustrated. The process 300 includes generating (302) a matrix using only zeroes and ones that satisfies balanced and sparsest constraints. Mathematical details of balanced and sparsest constraints will be discussed in detail below. A set of Reed-Solomon codewords can be selected (304). A mask can be selected (306) using the rows of the matrix that satisfies the balanced and sparsest constraints. A Reed-Solomon generator matrix can be calculated (308) using the Reed-Solomon codewords and the mask. In many embodiments, a generator matrix calculated using this process will also satisfy the balanced and sparsest constraints. Generator matrices can be utilized to encode data in accordance with many embodiments of the invention. Several applications include encoding data for use in distributed storage systems. Although an overview process for constructing generator matrices for balanced Reed-Solomon codes is described above with reference to FIG. 3, any of a variety of processes for constructing generator matrices that satisfy balanced and sparsest constraints can be utilized as appropriate to the requirements of specific applications in accordance with various embodiments of the invention. Mathematical details regarding the construction of generator matrices in accordance with several embodiments of the invention are discussed further below.
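The steps of process 300 can be sketched end to end on a toy instance. The following Python sketch is a minimal illustration under stated assumptions, not the implementation of any embodiment: it assumes the prime field F_7 with primitive element 3 (so that field arithmetic is ordinary modular arithmetic), the code RS[6,4], and row weight w = n − k + 1 = 3. All names (`mask_row`, `generator_row`, `rank_mod_p`, the `shifts` list) are hypothetical. Each mask row is a cyclic shift of w consecutive ones, and each generator row is the evaluation of a polynomial whose roots sit at the zero positions of the corresponding mask row.

```python
# Hypothetical toy parameters: RS[6,4] over the prime field F_7, alpha = 3.
P, ALPHA, N, K = 7, 3, 6, 4
W = N - K + 1                                   # sparsest row weight d = 3
POINTS = [pow(ALPHA, i, P) for i in range(N)]   # defining set {1, a, ..., a^(n-1)}

def mask_row(shift):
    """W consecutive ones, cyclically shifted right by `shift` (steps 302/306)."""
    return [1 if (i - shift) % N < W else 0 for i in range(N)]

def generator_row(mask):
    """Evaluate t(x) = prod (x - a^i) over the zero positions i of the mask (step 308)."""
    roots = [POINTS[i] for i, bit in enumerate(mask) if bit == 0]
    row = []
    for x in POINTS:
        value = 1
        for r in roots:
            value = value * (x - r) % P
        row.append(value)
    return row

def rank_mod_p(matrix):
    """Rank over F_P by Gaussian elimination."""
    rows, rank = [r[:] for r in matrix], 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], P - 2, P)
        rows[rank] = [x * inv % P for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(a - f * b) % P for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

shifts = [0, 3, 1, 4]                  # hypothetical shifts giving distinct rows
mask = [mask_row(s) for s in shifts]
G = [generator_row(m) for m in mask]
for row in G:
    print(row)
```

In this instance the resulting matrix has full rank k = 4 and its support coincides with the mask, so every row has weight 3 and every column has weight 2.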


Constructing Balanced Reed-Solomon Generator Matrices


Consider a group of n storage nodes that jointly encode a message vector $m \in \mathbb{F}_q^k$ using an error-correcting code $\mathcal{C}$, with generator matrix $G \in \mathbb{F}_q^{k \times n}$. In particular, every storage node $S_i$ encodes the message symbols using $g_i$, the ith column of G, to produce a code symbol $c_i$. The time required to compute $c_i$ is a function of the weight (the number of nonzero entries) of $g_i$. If $\mathcal{C}$ is chosen as a Maximum Distance Separable (MDS) code, it can be argued that the average encoding time, over the $S_i$'s, is minimized by using a systematic G. If the maximal encoding time is considered instead, then systematic encoding is as slow as using a generator matrix that has no zeros.


In many embodiments of the invention, a solution lives between these two extremes. Balanced generator matrices can be considered in which the rows of the generator matrix G have fixed, but tunable, weight and the columns have essentially the same weight. The benefit of a balanced generator matrix G is that every code symbol $c_i$ is computed in roughly the same amount of time. This enables the computational load to be balanced across the storage system, i.e., there are no storage nodes that behave as bottlenecks. Furthermore, if each row of G is fixed to have weight w, then updating a single message symbol generally impacts exactly w storage nodes. When w=d, where d is the minimum distance of the code, a balanced and sparsest generator matrix can be obtained. For G to be balanced, the weight of each column has to be either

\[
\left\lceil \frac{wk}{n} \right\rceil \quad \text{or} \quad \left\lfloor \frac{wk}{n} \right\rfloor .
\]

This is seen from the fact that the total number of nonzeros in G is kw, which is to be distributed as equally as possible among the n columns.


In several embodiments, a balanced matrix can be defined: A matrix A of size k by n is called w-balanced if the following conditions hold: P(1) Every row of A has the same weight w. P(2) Every column is of weight

\[
\left\lceil \frac{wk}{n} \right\rceil \quad \text{or} \quad \left\lfloor \frac{wk}{n} \right\rfloor .
\]





It is clear that P(2) is equivalent to having the columns differ in weight by at most one. In many embodiments a w-balanced generator matrix for a given cyclic Reed-Solomon code can be shown. In particular, each row is a codeword of weight w, such that d≤w≤n−1.
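Conditions P(1) and P(2) translate directly into a check on a 0/1 matrix. The sketch below is illustrative only; the function name `is_w_balanced` and the two small example matrices are hypothetical.

```python
def is_w_balanced(A, w):
    """Check P(1) and P(2) for a 0/1 matrix A given as a list of rows."""
    k, n = len(A), len(A[0])
    rows_ok = all(sum(row) == w for row in A)                      # P(1)
    floor_w, ceil_w = (w * k) // n, -(-(w * k) // n)
    col_weights = [sum(row[j] for row in A) for j in range(n)]
    cols_ok = all(c in (floor_w, ceil_w) for c in col_weights)     # P(2)
    return rows_ok and cols_ok

# Hypothetical 3-by-4 examples with w = 2.
balanced = [[1, 1, 0, 0], [0, 0, 1, 1], [0, 1, 1, 0]]
unbalanced = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0]]
print(is_w_balanced(balanced, 2), is_w_balanced(unbalanced, 2))
```

Both examples satisfy P(1); only the first also keeps every column weight within one of the others.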


The allowed value of w is used to specify the sparsity of the target generator matrix. While the results for denser generator matrices are only of theoretical interest, the proof techniques could be of potential use when one is interested in enforcing different types of structure.


Reed-Solomon Codes


In many embodiments, a Reed-Solomon code can be defined as the k-dimensional subspace of $\mathbb{F}_q^n$ given by

\[
\mathrm{RS}[n,k]_q = \{ (m(\alpha_1), \ldots, m(\alpha_n)) : \deg(m(x)) < k \}, \tag{1}
\]

where m(x) is a polynomial over $\mathbb{F}_q$ of degree deg(m(x)), and the $\alpha_i \in \mathbb{F}_q$ are distinct (fixed) field elements. Each message vector $m = (m_0, \ldots, m_{k-1}) \in \mathbb{F}_q^k$ is mapped to a message polynomial $m(x) = \sum_{i=0}^{k-1} m_i x^i$, which is then evaluated at the n distinct elements $\{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ of $\mathbb{F}_q$, known as the defining set of the code. The codeword associated with m(x) is $c = (m(\alpha_1), \ldots, m(\alpha_n))$, which is frequently referred to as the evaluation of m(x) at $\{\alpha_1, \alpha_2, \ldots, \alpha_n\}$. Reed-Solomon codes are MDS codes; their minimum distance attains the Singleton bound, i.e., $d(\mathrm{RS}[n,k]_q) = n-k+1$.
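The evaluation view also explains erasure recovery: since deg(m(x)) < k, any k surviving code symbols determine m(x) by interpolation. The following Python sketch illustrates this under assumed toy parameters (the prime field F_7, evaluation points 1 through 6, k = 4); the function names and the sample message are hypothetical, and Lagrange interpolation is used here only as one convenient way to solve for the message.

```python
P = 7                                  # hypothetical prime field size
POINTS = [1, 2, 3, 4, 5, 6]            # n = 6 distinct evaluation points
K = 4

def poly_eval(coeffs, x):
    """Evaluate m(x) = sum m_i x^i modulo P (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def encode(message):
    return [poly_eval(message, a) for a in POINTS]

def poly_mul_linear(poly, root):
    """Multiply poly(x) by (x - root) modulo P; coefficients are low-to-high."""
    out = [0] * (len(poly) + 1)
    for i, c in enumerate(poly):
        out[i] = (out[i] - root * c) % P
        out[i + 1] = (out[i + 1] + c) % P
    return out

def recover(known):
    """Lagrange-interpolate m(x) from exactly K pairs (point, symbol)."""
    message = [0] * K
    for xi, yi in known:
        basis, denom = [1], 1
        for xj, _ in known:
            if xj != xi:
                basis = poly_mul_linear(basis, xj)
                denom = denom * (xi - xj) % P
        scale = yi * pow(denom, P - 2, P) % P
        message = [(m + scale * b) % P for m, b in zip(message, basis)]
    return message

msg = [2, 0, 5, 1]
cw = encode(msg)
survivors = [(POINTS[i], cw[i]) for i in (0, 2, 3, 5)]   # symbols 1 and 4 erased
print(cw, recover(survivors))
```

Up to n − k = 2 erasures can be tolerated in this toy code, consistent with the minimum distance n − k + 1 = 3.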


Unless otherwise stated, Reed-Solomon codes are considered to be cyclic, whose defining set is chosen as $\{1, \alpha, \ldots, \alpha^{n-1}\}$, where α is a primitive element in $\mathbb{F}_q$. A generator matrix for this code is given by

\[
G_{RS} = \begin{bmatrix}
1 & 1 & \cdots & 1 \\
1 & \alpha & \cdots & \alpha^{n-1} \\
\vdots & \vdots & \ddots & \vdots \\
1 & \alpha^{(k-1)} & \cdots & \alpha^{(n-1)(k-1)}
\end{bmatrix}. \tag{2}
\]
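Multiplying a message by this Vandermonde-type matrix is the same operation as evaluating the message polynomial on the defining set. A minimal Python sketch, assuming the toy field F_7 with primitive element 3 and RS[6,4] (all names and values are hypothetical), makes the equivalence concrete:

```python
P, ALPHA, N, K = 7, 3, 6, 4     # hypothetical: F_7 with primitive element 3

# Row i of G_RS holds the i-th powers of the defining set {1, a, ..., a^(n-1)}.
G_RS = [[pow(ALPHA, i * j, P) for j in range(N)] for i in range(K)]

def encode_matrix(message):
    """Compute m * G_RS modulo P."""
    return [sum(m * G_RS[i][j] for i, m in enumerate(message)) % P
            for j in range(N)]

def encode_eval(message):
    """Evaluate m(x) = sum m_i x^i at each point alpha^j."""
    return [sum(m * pow(pow(ALPHA, j, P), i, P) for i, m in enumerate(message)) % P
            for j in range(N)]

message = [2, 0, 5, 1]
print(encode_matrix(message) == encode_eval(message))
```

Note that this matrix has no zero entries, so every code symbol depends on all k message symbols; the balanced construction replaces it with a sparse generator of the same code.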








Viewing Reed-Solomon codes through the lens of polynomials allows codewords with a prescribed set of coordinates required to be equal to 0 to be easily characterized. It is known that if a degree k−1 polynomial t(x) that vanishes on a prescribed set of k−1 points is interpolated, then t(x) is unique up to multiplication by a scalar. Suppose a minimum weight codeword $c \in \mathrm{RS}[n,k]$ is specified for which $c_{i_0} = \cdots = c_{i_{k-2}} = 0$. Let $t(x) = \prod_{j=0}^{k-2} (x - \alpha^{i_j}) = \sum_{i=0}^{k-1} t_i x^i$, and form the vector of coefficients of t(x) as $t = (t_0, t_1, \ldots, t_{k-1})$. The codeword resulting from encoding t using $G_{RS}$ is a codeword c with zeros in the desired coordinates. Indeed, $tG_{RS}$ is the evaluation of the polynomial t(x) at $\{1, \alpha, \ldots, \alpha^{n-1}\}$. Since t(x) has $\{\alpha^{i_0}, \ldots, \alpha^{i_{k-2}}\}$ as roots, it follows that $[tG_{RS}]_{i_0} = \cdots = [tG_{RS}]_{i_{k-2}} = 0$. This correspondence between codewords and polynomials will allow the focus to be on the latter when constructing generator matrices with the prescribed structure.
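As a small worked instance of this correspondence, the sketch below builds the coefficient vector of a polynomial with k − 1 prescribed roots and encodes it with the Vandermonde-type generator matrix. It assumes the toy field F_7 with primitive element 3 and RS[6,4]; the function name `coeffs_with_roots` and the chosen zero positions are hypothetical.

```python
P, ALPHA, N, K = 7, 3, 6, 4                     # hypothetical toy parameters
G_RS = [[pow(ALPHA, i * j, P) for j in range(N)] for i in range(K)]

def coeffs_with_roots(zero_positions):
    """Coefficients (low to high) of t(x) = prod (x - alpha^i) over the given i."""
    t = [1]
    for i in zero_positions:
        root = pow(ALPHA, i, P)
        nxt = [0] * (len(t) + 1)
        for d, c in enumerate(t):
            nxt[d] = (nxt[d] - root * c) % P
            nxt[d + 1] = (nxt[d + 1] + c) % P
        t = nxt
    return t

zero_positions = [0, 2, 5]                      # k - 1 = 3 prescribed zeros
t = coeffs_with_roots(zero_positions)
codeword = [sum(t[i] * G_RS[i][j] for i in range(K)) % P for j in range(N)]
print(t, codeword)
```

The codeword vanishes exactly on the prescribed coordinates and therefore has the minimum weight n − k + 1 = 3.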


The particular form of t(x), in particular the number of nonzero coefficients, will be used frequently in the various parts of this work. The BCH bound can be used to provide this information. BCH bound: Let p(x) be a nonzero polynomial (not divisible by $x^{q-1}-1$) with coefficients in $\mathbb{F}_q$. Suppose p(x) has t (cyclically) consecutive roots, i.e. $p(\alpha^{j+1}) = \cdots = p(\alpha^{j+t}) = 0$, where α is primitive in $\mathbb{F}_q$. Then at least t+1 coefficients of p(x) are nonzero. The BCH bound ensures that all the coefficients of a degree t polynomial with exactly t consecutive roots are nonzero.
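The last consequence can be checked exhaustively on a small field. The following sketch assumes F_7 with primitive element 3 (a hypothetical choice made so that arithmetic is plain modular arithmetic) and, for every run of t cyclically consecutive roots with t < q − 1, confirms that the monic degree-t polynomial with those roots has no zero coefficient. The helper name `poly_from_roots` is hypothetical.

```python
P, ALPHA = 7, 3                                  # hypothetical: F_7, primitive element 3

def poly_from_roots(roots):
    """Coefficients (low to high) of prod (x - r) modulo P."""
    poly = [1]
    for r in roots:
        nxt = [0] * (len(poly) + 1)
        for d, c in enumerate(poly):
            nxt[d] = (nxt[d] - r * c) % P
            nxt[d + 1] = (nxt[d + 1] + c) % P
        poly = nxt
    return poly

# For every start j and every run length t < q - 1, take t cyclically
# consecutive roots alpha^(j+1), ..., alpha^(j+t) and inspect the coefficients.
results = []
for t in range(1, P - 1):
    for j in range(P - 1):
        roots = [pow(ALPHA, (j + s) % (P - 1), P) for s in range(1, t + 1)]
        poly = poly_from_roots(roots)
        results.append(all(c != 0 for c in poly))
print(all(results))
```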


Construction of w-Balanced Matrices


This section includes a method that produces a w-balanced matrix in accordance with many embodiments of the invention. Then, it will be shown how this scheme enables the construction of a w-balanced generator matrix for Reed-Solomon codes.


A w-balanced error-correcting code is one that is generated by a matrix obeying the properties P(1) and P(2) described above. It should be emphasized that not every w-balanced matrix can serve as a mask for a target generator matrix. Suppose that for a choice of parameters k, w and n, an embodiment has n | wk. In this case, one can take $A \in \{0,1\}^{k \times n}$ as the adjacency matrix of a $\left(w, \frac{wk}{n}\right)$ biregular bipartite graph on k left vertices and n right vertices. The following example demonstrates why this approach could lead to a bad choice. Let n=8, k=5 and w=4, where we are interested in finding a balanced generator matrix for RS[8,5]. One possible realization of a matrix A that obeys the conditions P(1) and P(2) is

\[
A = \begin{bmatrix}
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0
\end{bmatrix}. \tag{3}
\]








Note that the first and last rows are identical, and are of weight 4, which is the minimum distance of RS[8,5]. As alluded to earlier, any two codewords of minimum weight with the same support are scalar multiples of one another. This immediately rules out the possibility of A serving as a mask for a generator matrix of RS[8,5]. Indeed, the distinctness of the rows of A is necessary in this case.


Having shown the necessity of carefully constructing a mask matrix for the sought after generator matrix, the first contribution of this work is to provide a simple process that does this. When the code of interest is RS [n, k], a construction of a w-balanced generator matrix can be presented where n−k+1≤w≤n−1.


Let a be a vector of length n comprised of w consecutive ones followed by n−w consecutive zeros, i.e. a=(1, . . . , 1, 0, . . . , 0). In addition, let $a_j$ denote the right cyclic shift of a by j positions. In several embodiments, a shift by j≥n is equivalent to one where j is taken modulo n. To simplify notation, $(x)_n$ can be used to refer to x mod n. Furthermore, this notation can be extended to sets by letting $\{x_1, \ldots, x_l\}_n$ denote $\{(x_1)_n, \ldots, (x_l)_n\}$. For example, if n=8 and w=4, then $a_6$=(1,1,0,0,0,0,1,1). Roughly speaking, the desired matrix A is built by setting its first row to a and then choosing the next row by cyclically shifting a by w positions to the right. As mentioned earlier, duplicate rows in A are to be avoided, and the way to do so is formalized below. Let both k and w be strictly less than n. Define the quantities







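\[
g := \gcd(w, n), \qquad \eta := \frac{n}{g}, \qquad \varphi := \left\lfloor \frac{k}{\eta} \right\rfloor, \qquad \text{and} \qquad \rho := k - \eta\varphi.
\]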









Define the index sets

\[
\mathcal{I}_1 = \{\, jw + i : 0 \le j \le \eta - 1,\ 0 \le i \le \varphi - 1 \,\}, \qquad
\mathcal{I}_2 = \{\, jw + \varphi : 0 \le j \le \rho - 1 \,\},
\]

and $\mathcal{I} = \mathcal{I}_1 \cup \mathcal{I}_2$. The matrix A whose rows are given by

\[
\{\, a_l : l \in \mathcal{I} \,\}
\]

satisfies P(1) and P(2). Furthermore, the rows of A are pairwise distinct.
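The construction can be sketched in a few lines of code. The function name `balanced_mask` and the ordering of the index set (all of $\mathcal{I}_1$ with i as the outer index, then $\mathcal{I}_2$) are illustrative assumptions; the ordering is chosen to match the worked example for n=8, k=5, w=4. The checks at the end assume that P(1) means every row has weight w and P(2) means column weights differ by at most one, as the claims describe.

```python
from math import gcd

def balanced_mask(n, k, w):
    """Return (shifts, A): the ordered index set and the k x n 0/1 mask."""
    g = gcd(w, n)
    eta = n // g
    phi = k // eta              # floor(k / eta)
    rho = k - eta * phi
    # I_1 listed with i as the outer index, then I_2 (all taken modulo n).
    shifts = [(j * w + i) % n for i in range(phi) for j in range(eta)]
    shifts += [(j * w + phi) % n for j in range(rho)]
    base = [1] * w + [0] * (n - w)
    # Right cyclic shift by s: column c takes the entry from column c - s.
    A = [[base[(c - s) % n] for c in range(n)] for s in shifts]
    return shifts, A

shifts, A = balanced_mask(8, 5, 4)
row_weights = [sum(row) for row in A]
col_weights = [sum(col) for col in zip(*A)]
for row in A:
    print(row)
print("shifts:", shifts, "row weights:", row_weights, "column weights:", col_weights)
```

For n=8, k=5, w=4 every row has weight 4, the column weights are 2 or 3, and no two rows coincide.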


The nature of the construction allows us to identify the columns that are of weight $\lceil wk/n \rceil$. The columns of A as obtained above with weight $\lceil wk/n \rceil$ are those indexed by

\[
\mathcal{H} = \{\varphi, \varphi + 1, \ldots, \varphi + (wk)_n - 1\}_n,
\]

where

\[
\varphi = \left\lfloor \frac{k}{n} \gcd(w, n) \right\rfloor .
\]
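The statement about the heavy columns can be checked numerically. The sketch below counts column weights directly from the cyclic shifts and compares the result with the closed-form index set, read here as $\{\varphi, \ldots, \varphi + (wk)_n - 1\}_n$. The function names are hypothetical, and only parameter choices with n not dividing wk are tried, since otherwise all columns have the same weight.

```python
from math import gcd

def construction_shifts(n, k, w):
    """Ordered index set of the cyclic-shift construction (I_1 followed by I_2)."""
    g = gcd(w, n)
    eta = n // g
    phi = k // eta
    rho = k - eta * phi
    return ([(j * w + i) % n for i in range(phi) for j in range(eta)]
            + [(j * w + phi) % n for j in range(rho)])

def heavy_columns_direct(n, k, w):
    """Columns of weight ceil(wk/n), found by counting the ones in each column."""
    weights = [0] * n
    for s in construction_shifts(n, k, w):
        for t in range(w):
            weights[(s + t) % n] += 1
    top = -(-w * k // n)        # ceil(wk / n)
    return sorted(c for c in range(n) if weights[c] == top)

def heavy_columns_formula(n, k, w):
    """Columns phi, phi + 1, ..., phi + (wk mod n) - 1, taken modulo n."""
    phi = (k * gcd(w, n)) // n
    return sorted((phi + t) % n for t in range((w * k) % n))

for params in [(8, 5, 4), (15, 13, 3), (8, 5, 6)]:
    print(params, heavy_columns_direct(*params), heavy_columns_formula(*params))
```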





Additionally, the construction above provides a remedy to the matrix in (3). Let n=8, k=5 and w=4. The 4-balanced matrix A is









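\[
A = \begin{bmatrix}
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 1 & 1 & 0 & 0
\end{bmatrix}. \tag{4}
\]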







The parameters are g=4, η=2, φ=2 and ρ=1. The index sets are given by $\mathcal{I}_1 = \{0, 4, 1, 5\}$ and $\mathcal{I}_2 = \{2\}$. It turns out that this matrix can serve as a mask matrix for a 4-balanced generator matrix of RS[8,5] defined over $\mathbb{F}_9$. The rows are taken as the evaluations of the following polynomials on $\{1, \alpha, \ldots, \alpha^7\}$, where α generates $\mathbb{F}_9^{\times}$.









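\[
\begin{aligned}
p^{(0)}(x) &= \prod_{i=4}^{7} \left(x - \alpha^{i}\right), &
p^{(4)}(x) &= \prod_{i=4}^{7} \left(x - \alpha^{i+4}\right), \\
p^{(1)}(x) &= 2 \prod_{i=4}^{7} \left(x - \alpha^{i+1}\right), &
p^{(5)}(x) &= 2 \prod_{i=4}^{7} \left(x - \alpha^{i+5}\right), \\
p^{(2)}(x) &= \prod_{i=4}^{7} \left(x - \alpha^{i+2}\right).
\end{aligned}
\]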








The resulting 4-balanced generator matrix is given by






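\[
G = \begin{bmatrix}
\alpha^3 & \alpha^2 & \alpha^4 & \alpha & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \alpha^3 & \alpha^2 & \alpha^4 & \alpha \\
0 & \alpha^3 & \alpha^2 & \alpha^4 & \alpha & 0 & 0 & 0 \\
\alpha & 0 & 0 & 0 & 0 & \alpha^3 & \alpha^2 & \alpha^4 \\
0 & 0 & \alpha^3 & \alpha^2 & \alpha^4 & \alpha & 0 & 0
\end{bmatrix}.
\]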






One can check that G is full rank over $\mathbb{F}_9$ for α whose minimal polynomial over $\mathbb{F}_3$ is $x^2+2x+2$. The way the evaluation polynomials can be chosen is determined by the set $\mathcal{I}$ from the construction above.
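The full-rank claim can be verified with a short script. The sketch below is one possible way to do the check, not the method of the source: it represents an element a + bα of $\mathbb{F}_9$ as the pair (a, b), uses α² = α + 1 (which follows from the minimal polynomial x²+2x+2 over $\mathbb{F}_3$), evaluates $p(\alpha^{-j}x)$ with $p(x) = \prod_{i=4}^{7}(x-\alpha^i)$ for the shifts 0, 4, 1, 5, 2, and runs Gaussian elimination. All function and variable names are illustrative.

```python
P = 3  # characteristic; F_9 = F_3[alpha] / (alpha^2 + 2*alpha + 2)

def sub(u, v):
    return ((u[0] - v[0]) % P, (u[1] - v[1]) % P)

def mul(u, v):
    # (a + b*alpha)(c + d*alpha) with alpha^2 = alpha + 1
    a, b = u
    c, d = v
    return ((a * c + b * d) % P, (a * d + b * c + b * d) % P)

ZERO, ONE, ALPHA = (0, 0), (1, 0), (0, 1)
POW = [ONE]
for _ in range(7):
    POW.append(mul(POW[-1], ALPHA))        # POW[i] = alpha^i, i = 0..7
LOG = {e: i for i, e in enumerate(POW)}    # discrete logarithm table

def inv(u):
    return POW[(-LOG[u]) % 8]

def p(x):
    """p(x) = prod_{i=4}^{7} (x - alpha^i)."""
    out = ONE
    for i in range(4, 8):
        out = mul(out, sub(x, POW[i]))
    return out

SHIFTS = [0, 4, 1, 5, 2]
# Row for shift j holds p(alpha^{-j} * alpha^m) for m = 0..7.
G = [[p(POW[(m - j) % 8]) for m in range(8)] for j in SHIFTS]

def rank(M):
    """Rank over F_9 by Gauss-Jordan elimination."""
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != ZERO), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        s = inv(M[r][c])
        M[r] = [mul(s, x) for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != ZERO:
                f = M[i][c]
                M[i] = [sub(x, mul(f, y)) for x, y in zip(M[i], M[r])]
        r += 1
    return r

def show(e):
    return "0" if e == ZERO else f"a^{LOG[e]}"

for row in G:
    print(" ".join(show(e) for e in row))
print("rank:", rank(G))
```

Each printed row is a cyclic shift of the first one, and the zero pattern matches the mask in (4).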


Balanced Reed-Solomon Codes


In several embodiments, codewords can be selected for the Reed-Solomon generator matrix as follows. Fix $p(x)=\prod_{i=4}^{7}(x-\alpha^i)$ and form the set of polynomials

\[
\mathcal{P} = \{\, p(\alpha^{-j_l} x) : j_l \in \mathcal{I} \,\}. \tag{5}
\]

Now consider the polynomial corresponding to an arbitrary $j_l \in \mathcal{I}$. This polynomial can be expressed as







\[
p(\alpha^{-j_l} x) = \prod_{i=4}^{7} \left(\alpha^{-j_l} x - \alpha^{i}\right) = \alpha^{-4 j_l} \prod_{i=4}^{7} \left(x - \alpha^{i + j_l}\right).
\]








When evaluated on $\{1, \alpha, \ldots, \alpha^7\}$, this polynomial vanishes on and only on $\{\alpha^{4+j_l}, \ldots, \alpha^{7+j_l}\}$. More generally, the polynomial $p(\alpha^{-l}x)$ is the annihilator of $\alpha^{l+d}, \ldots, \alpha^{l+n-1}$ if and only if p(x) is the annihilator of $\alpha^{d}, \ldots, \alpha^{n-1}$.


Thus, the coordinates of the corresponding codeword that are equal to 0 are precisely those indexed by $\{4+j_l, \ldots, 7+j_l\}_n$, which is in agreement with $a_{j_l}$. Hence, the codewords corresponding to the polynomials in (5) form a 4-balanced generator matrix whose support is determined by A in (4). In various embodiments, these polynomials are linearly independent over the underlying field if and only if the elements of $\mathcal{I}$ are pairwise distinct and w=n−k+1. Let $p(x)=\sum_{i=0}^{z} p_i x^i \in \mathbb{F}_q[x]$ and define $\mathcal{P}=\{p(\alpha^{j_l}x)\}_{l=0}^{z}$. The polynomials in $\mathcal{P}$ are linearly independent over $\mathbb{F}_q$ if and only if the elements of $\{\alpha^{j_l}\}_{l=0}^{z}$ are distinct in $\mathbb{F}_q$, and $p_i \neq 0$ for i=0, 1, . . . , z.


In several embodiments, this can provide a tool for constructing d-balanced Reed-Solomon codes, where d is the minimum distance of RS[n,k]. In general, it can give conditions for which a set of z+1 codewords constructed from the same polynomial of degree z are linearly independent: For d=n−k+1, let A be a d-balanced matrix obtained above with index set $\mathcal{I}$. Fix $p(x)=\prod_{i=d}^{n-1}(x-\alpha^i)$ and let $\mathcal{P}=\{p(\alpha^{-j_l}x) : j_l \in \mathcal{I}\}$. Then, the matrix G whose lth row is the codeword corresponding to $p(\alpha^{-j_l}x)$ is a d-balanced generator matrix for RS[n,k]. In several embodiments, this can provide a process to construct what is known as sparsest and balanced Reed-Solomon codes. They are sparsest in the sense that each row of the generator matrix is a minimum distance codeword.


Now suppose that for the same code RS[8, 5], a 6-balanced generator matrix is of interest. In several embodiments, the case when the desired row weight need not be d is attainable with little effort.


Balanced RS codes: For n−k+1≤w≤n−1, let A be a w-balanced matrix obtained above with index set $\mathcal{I}=\{j_0, \ldots, j_{k-1}\}$. Fix $p(x)=\prod_{i=w}^{n-1}(x-\alpha^i)$ and let

\[
\mathcal{P}_1 = \{\, p(\alpha^{-j_l}x) : l = 0, 1, \ldots, n-w \,\}, \qquad
\mathcal{P}_2 = \{\, x^{l-n+w}\, p(\alpha^{-j_l}x) : l = n-w+1, \ldots, k-1 \,\}.
\]

Then, the matrix G whose lth row is the codeword corresponding to the lth polynomial in $\mathcal{P}_1 \cup \mathcal{P}_2$ is a w-balanced generator matrix for RS[n, k].
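The degree bookkeeping behind this statement can be spelled out as follows. This is a sketch that assumes RS[n, k] is the code obtained by evaluating polynomials of degree less than k on $\{1, \alpha, \ldots, \alpha^{n-1}\}$, which is the standard definition and may differ in notation from the one used elsewhere in this document.

```latex
\begin{align*}
\deg p(\alpha^{-j_l}x) &= n - w \le k - 1,
  && l = 0, 1, \ldots, n-w, \\
\deg\bigl(x^{\,l-n+w}\, p(\alpha^{-j_l}x)\bigr) &= (l - n + w) + (n - w) = l \le k - 1,
  && l = n-w+1, \ldots, k-1 .
\end{align*}
```

The first inequality uses w ≥ n−k+1. Every polynomial therefore has degree below k, so each row is a codeword of RS[n, k]. The monomial factor $x^{l-n+w}$ is nonzero at every evaluation point $\alpha^m$, so it leaves the zero pattern of the row, and hence its weight w, unchanged.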


It is possible to verify that the following matrix does indeed generate RS[8,5].






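\[
G = \begin{bmatrix}
\alpha^5 & \alpha^3 & \alpha^2 & \alpha^3 & \alpha^6 & \alpha^2 & 0 & 0 \\
\alpha^2 & \alpha^3 & \alpha^6 & \alpha^2 & 0 & 0 & \alpha^5 & \alpha^3 \\
\alpha^6 & \alpha^2 & 0 & 0 & \alpha^5 & \alpha^3 & \alpha^2 & \alpha^3 \\
0 & 0 & \alpha^7 & \alpha^6 & \alpha^6 & 1 & \alpha^4 & \alpha \\
0 & \alpha^7 & \alpha^7 & \alpha^8 & \alpha^3 & 1 & \alpha^6 & 0
\end{bmatrix}
\]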






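The matrix G corresponds to the index set $\mathcal{I} = \{0, 6, 4, 2, 1\}$. The polynomials corresponding to the rows of G are derived from $p(x)=(x-\alpha^6)(x-\alpha^7)$, and are given by

\[
\begin{aligned}
p^{(0)}(x) &= (x-\alpha^6)(x-\alpha^7), \\
p^{(6)}(x) &= \alpha^{-4}(x-\alpha^4)(x-\alpha^5), \\
p^{(4)}(x) &= (x-\alpha^2)(x-\alpha^3), \\
p^{(2)}(x) &= \alpha^{-4}\, x\, (x-1)(x-\alpha), \\
p^{(1)}(x) &= \alpha^{-2}\, x^2\, (x-1)(x-\alpha^7).
\end{aligned}
\]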

The fact that high weight codewords are of interest helped to ensure that G is full rank. The codewords chosen correspond to low degree polynomials, which allows one to use the extra degrees of freedom available in constructing the set $\mathcal{P}_2$. In fact, one can select $\mathcal{P}_2$ as any set of polynomials whose degrees are all different, and are between n−w+1 and k−1. This will generally guarantee that the resulting generator matrix is full rank, albeit not w-balanced. Nonetheless, one can potentially use this technique to enforce other patterns in the structure of G.


In some embodiments, Balanced Reed-Solomon processes can provide a framework for constructing balanced and sparsest generator matrices for error-correcting codes. The fact that each row of G is of minimal weight can imply that when a single message symbol is updated, the least number of code symbols possible need to be modified. This feature can be appealing in the context of distributed storage systems since the number of storage nodes contacted during an update is minimal. As discussed, the balanced property ensures that all storage nodes finish computing their respective code symbols in the same amount of time.


A natural question to ask is whether one can construct a generator matrix that is balanced in the row sense. More precisely, suppose 2≤w≤k is fixed as the desired column weight of a generator matrix $G \in \mathbb{F}_q^{k \times n}$, and the constraint that any two rows of G differ in weight by at most 1 is enforced. Can G be realized as the generator matrix of some k-dimensional subcode of a Reed-Solomon code? The two requirements imply that each row is of weight equal to $\lfloor wn/k \rfloor$ or $\lceil wn/k \rceil$. In some cases, the techniques in accordance with several embodiments of the invention can be used to construct such generator matrices. In particular, if $\frac{wn}{k} \in \mathbb{Z}$ and $k \ge n - \frac{wn}{k} + 1$, the processes described above can be used to produce the required mask matrix. Now if $k < n - \frac{wn}{k} + 1$, one needs to resort to a larger RS code, namely $\mathrm{RS}\left[n,\ n - \frac{wn}{k} + 1\right]$.
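The case distinction above can be captured in a small helper. The sketch below assumes the reading that wn/k must be an integer and that the ambient code must have minimum distance no larger than the common row weight; the function name `row_balanced_plan` and its return convention are hypothetical.

```python
def row_balanced_plan(n, k, w):
    """Return (row_weight, rs_dimension), or None when wn/k is not an integer.

    rs_dimension is the dimension of the Reed-Solomon code of length n inside
    which the k rows are chosen: k itself when k >= n - wn/k + 1, and the
    larger value n - wn/k + 1 otherwise.
    """
    if (w * n) % k != 0:
        return None
    r = (w * n) // k            # common row weight
    k_needed = n - r + 1        # RS[n, k_needed] has minimum distance r
    return r, max(k, k_needed)

print(row_balanced_plan(8, 6, 3))   # k is large enough: stay in RS[8, 6]
print(row_balanced_plan(8, 4, 2))   # k too small: move to the larger RS[8, 5]
print(row_balanced_plan(8, 5, 2))   # wn/k is not an integer
```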





Balanced Reed-Solomon Codes for Distributed Storage Processes


In many embodiments, a distributed storage process can encode blocks of data at one or more individual nodes in a distributed storage system. The process can include receiving a block of data at an individual node from a centralized coordinator. This block of data generally is smaller than the entire data set. In several embodiments, each node in the distributed storage system can receive blocks of data that are the same size. A node can encode a data block using a balanced Reed-Solomon encoder with a generator matrix that satisfies balanced and sparsest constraints. By distributing blocks of data of equal size to each node in the distributed storage system, the write commands generally can be load balanced.


The centralized controller can receive a signal requesting stored data. In some embodiments, the centralized controller can determine the location of the individual blocks of the stored data, which can be located at one or more nodes. The location of individual blocks of data can be stored in a variety of ways including (but not limited to) in an index accessible by the centralized controller. A signal can be sent to each node where a block of the data is stored. The centralized decoder can decode the blocks of data and transmit the requested data to the user. It should be readily appreciated by one having ordinary skill that the encoding and/or decoding process described above is merely illustrative and any of a variety of processes using balanced and sparsest Reed-Solomon generator matrices to store blocks of data in one or more nodes of a distributed storage system can be utilized as appropriate to various requirements of embodiments of the invention.


Balanced Reed-Solomon Code Illustrative Examples


Illustrative examples of generator matrices of balanced RS codes in accordance with many embodiments of the invention are discussed below.


In several embodiments, the following generator matrix is for a balanced RS code with parameters [6,4] defined over $\mathbb{F}_5$.

\[
G = \begin{bmatrix}
3 & 1 & 4 & 0 & 0 & 0 \\
0 & 0 & 0 & 3 & 1 & 4 \\
0 & 3 & 1 & 4 & 0 & 0 \\
4 & 0 & 0 & 0 & 3 & 1
\end{bmatrix}.
\]






The computational effort is distributed across the disks so that no bottlenecks are present in the system. The benefit of balanced RS codes is more apparent when the parameters grow. Unlike traditional RAID-6 configurations, a column of the generator matrix of a two erasure-correcting balanced RS code will always be of weight two or three. For the practically-appealing finite field $\mathbb{F}_{2^8}$, this implies that one can construct a balanced generator matrix for an RS code with parameters [255,253], without any degradation in computational performance.


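Additionally, in various embodiments, the following generator matrix is for a balanced RS code with parameters [15, 13] defined over $\mathbb{F}_{2^4}$. The underlying finite field allows one to operate on a granularity of half-a-byte.

\[
G = \left[\begin{array}{*{15}{c}}
\alpha^3 & \alpha^5 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \alpha^3 & \alpha^5 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \alpha^3 & \alpha^5 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^3 & \alpha^5 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^3 & \alpha^5 & 1 \\
0 & \alpha^3 & \alpha^5 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \alpha^3 & \alpha^5 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^3 & \alpha^5 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^3 & \alpha^5 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^3 & \alpha^5 \\
0 & 0 & \alpha^3 & \alpha^5 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \alpha^3 & \alpha^5 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^3 & \alpha^5 & 1 & 0 & 0 & 0 & 0
\end{array}\right]
\]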





A [14, 10] RS code can be used to protect data in a large distributed storage system. A balanced generator matrix of an RS code with parameters [15, 10] can be used as a surrogate to build one for a code with parameters [14, 10]. In the following two generator matrices, the element α is primitive in $\mathbb{F}_{2^4}$.







G

15
,
10


=


[




α
8




α
9




α
13




α
10



1



α
8



0


0


0


0


0


0


0


0


0




0


0


0


0


0


0



α
8




α
9




α
13




α
10



1



α
8



0


0


0





α
10



1



α
8



0


0


0


0


0


0


0


0


0



α
8




α
9




α
13





0


0


0



α
8




α
9




α
13




α
10



1



α
8



0


0


0


0


0


0




0


0


0


0


0


0


0


0


0



α
8




α
9




α
13




α
10



1



α
8





0



α
8




α
9




α
13




α
10



1



α
8



0


0


0


0


0


0


0


0




0


0


0


0


0


0


0



α
8




α
9




α
13




α
10



1



α
8



0


0





α
13




α
10



1



α
8



0


0


0


0


0


0


0


0


0



α
8




α
9





0


0


0


0



α
8




α
9




α
13




α
10



1



α
8



0


0


0


0


0





α
8



0


0


0


0


0


0


0


0


0



α
8




α
9




α
13




α
10



1



]

.





Through a puncturing argument, the previous generator matrix can be converted to one for a [14, 10] code with optimal error-correcting capabilities. For example, eliminating the last column results in such a generator matrix. It should be noted that puncturing results in rows which are of slightly varying weights, although this has no effect on the computational effort exerted by each disk.







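\[
G_{14,10} = \left[\begin{array}{*{14}{c}}
\alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 \\
\alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} \\
0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 \\
0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 \\
\alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} \\
0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10} & 1 & \alpha^{8} & 0 & 0 & 0 & 0 \\
\alpha^{8} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \alpha^{8} & \alpha^{9} & \alpha^{13} & \alpha^{10}
\end{array}\right]
\]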




It should be readily appreciated by one having ordinary skill that [6,4] and [14,10] RS codes are merely illustrative examples and any of a variety of RS codes can be constructed as appropriate to specific requirements of many embodiments of the invention.


Additional Balanced RS Code Processes


Balanced RS codes can be utilized by machine learning processes in accordance with many embodiments of the invention. A centralized controller can distribute big machine learning computations to machines. Balanced and sparsest RS codes can be utilized such that the distribution can generally be load balanced. In several embodiments, balanced RS codes can be used to compensate for machines that have not yet reported back with calculations by treating them as erasures. It should be readily apparent to one having ordinary skill how to adapt methods and processes described above to calculate the missing machine learning data using balanced RS codes in accordance with many embodiments of the invention.


Additionally, balanced RS codes in accordance with many embodiments of the invention can be used with MapReduce processes. Balanced RS codes can be used as an add-on to the parallel distribution to generate big data sets generally found in MapReduce processes. In several embodiments, if a machine is slow to return data, that machine can be treated as an erasure and a balanced RS code process can be used to construct the data.


Although the present invention has been described in certain specific aspects, many additional modifications and variations would be apparent to those skilled in the art. In particular, any of the various processes described above can be performed in alternative sequences and/or in parallel (on the same or on different computing devices) in order to achieve similar results in a manner that is more appropriate to the requirements of a specific application. It is therefore to be understood that the present invention can be practiced otherwise than specifically described without departing from the scope and spirit of the present invention. Thus, embodiments of the present invention should be considered in all respects as illustrative and not restrictive. It will be evident to the person skilled in the art to freely combine several or all of the embodiments discussed here as deemed suitable for a specific application of the invention. Throughout this disclosure, terms like “advantageous,” “exemplary,” or “preferred” indicate elements or dimensions which are particularly suitable (but not essential) to the invention or an embodiment thereof, and may be modified wherever deemed suitable by the skilled person, except where expressly required. Accordingly, the scope of the invention should be determined not by the embodiments illustrated, but by the appended claims and their equivalents.

Claims
  • 1. A distributed storage system comprising: a plurality of distributed storage nodes configured to store a data file recoverable from a subset of the plurality of distributed storage nodes, wherein each distributed storage node comprises: a network interface; a processor; a memory containing: a Reed-Solomon node application; wherein the processor is configured by the Reed-Solomon node application to: receive a block of data using the network interface; encode the block of data using a balanced and sparsest error-correcting code; and store the encoded block of data in the memory.
  • 2. The distributed storage system of claim 1, wherein the block of data is a portion of a data segment divided into a plurality of equally sized blocks of data.
  • 3. The distributed storage system of claim 1, wherein the error-correcting code is a balanced and sparsest Reed-Solomon code.
  • 4. The distributed storage system of claim 3, wherein the balanced and sparsest Reed-Solomon code further comprises transforming the block of data by a Reed-Solomon generator matrix.
  • 5. The distributed storage system of claim 4, wherein the Reed-Solomon generator matrix is calculated by transforming a set of Reed-Solomon codewords with a mask that satisfies balanced and sparsest constraints.
  • 6. The distributed storage system of claim 5, wherein the mask is selected from rows of a matrix, where the matrix comprises zeros and ones and satisfies balanced and sparsest constraints.
  • 7. The distributed storage system of claim 6, wherein every row in the matrix has the same weight.
  • 8. The distributed storage system of claim 6, wherein every column in the matrix has a weight that differs by at most one.
  • 9. The distributed storage system of claim 4, wherein the Reed-Solomon generator matrix with parameters [6,4] defined over $\mathbb{F}_5$ can be evaluated by the processor using the following expression:
  • 10. The distributed storage system of claim 4, wherein the Reed-Solomon generator matrix with parameters [15,10] defined over $\mathbb{F}_{2^4}$ can be evaluated by the processor using the following expression:
  • 11. The distributed storage system of claim 4, wherein the Reed-Solomon generator matrix with parameters [14,10] defined over $\mathbb{F}_{2^4}$ can be evaluated by the processor using the following expression:
  • 12. A distributed storage network, comprising: a communications network; a storage controller, comprising: a storage network interface; a storage processor; a storage memory containing a storage Reed-Solomon application; a plurality of node controllers configured to store a data file recoverable from a subset of the plurality of storage nodes, wherein the plurality of node controllers comprises: a node network interface; a node processor; a node memory containing a node Reed-Solomon application; wherein the storage processor is configured by the controller Reed-Solomon application to: receive a data segment; partition the data segment into at least a first block of data and a second block of data; transmit the first block of data to a first node controller in the plurality of node controllers; transmit the second block of data to a second node controller in the plurality of node controllers; wherein the node processor in the first node controller is configured by the node Reed-Solomon application to: receive the first block of data; encode the first block of data using a balanced and sparsest error-correcting code; and store the encoded first block of data in the node memory of the first node controller.
  • 13. The distributed storage network of claim 12, wherein the first block of data and the second block of data are equal sized.
  • 14. The distributed storage network of claim 12, wherein the error-correcting code is a balanced and sparsest Reed-Solomon code.
  • 15. The distributed storage network of claim 14, wherein the balanced and sparsest Reed-Solomon code further comprises transforming the first block of data by a Reed-Solomon generator matrix.
  • 16. The distributed storage network of claim 15, wherein the Reed-Solomon generator matrix is calculated by transforming a set of Reed-Solomon codewords with a mask that satisfies the balanced and sparsest constraints.
  • 17. The distributed storage network of claim 16, wherein the mask is selected from rows of a matrix, where the matrix comprises zeros and ones and satisfies further balanced and sparsest constraints.
  • 18. The distributed storage network of claim 17, wherein every row in the matrix has the same weight.
  • 19. The distributed storage network of claim 17, wherein every column in the matrix has a weight that differs by at most one.
  • 20. The distributed storage network of claim 12, wherein the storage processor is further configured by the controller Reed-Solomon application to: detect an erasure of the second block of data; retrieve blocks of data including at least the first block of data from the plurality of node controllers; and reconstruct the second block of data using a Reed-Solomon generator matrix and the retrieved blocks of data.
  • 21. A distributed storage method, comprising: running a storage Reed-Solomon application contained in a storage memory using a storage controller, wherein the storage controller has a storage network interface, a storage processor, and a storage memory connected to the processor; receiving a data segment using the storage controller; storing the data segment among a plurality of node controllers by: partitioning the data segment into a first block of data and a second block of data; transmitting the first block of data to a first node controller, wherein the first node controller has a first node network interface, a first node processor, a first node memory connected to the first node processor, and a first node Reed-Solomon application contained in the first node memory; transmitting the second block of data to a second node controller, wherein the second node controller has a second node network interface, a second node processor, a second node memory connected to the second node processor, and a second node Reed-Solomon application contained in the second node memory; receiving the first block of data using the first node controller; encoding the first block of data using a balanced and sparsest error-correcting code and the first node controller; and storing the encoded first block of data in the first node memory using the first node controller; detecting an erasure of the second block of data using the storage controller; retrieving blocks of data from a subset of the plurality of node controllers including at least the first block of data from the first node controller using the storage controller; and reconstructing the second block of data using a Reed-Solomon generator matrix and the retrieved blocks of data using the storage controller.
CROSS-REFERENCE TO RELATED APPLICATIONS

The present invention claims priority to U.S. Provisional Patent Application Ser. No. 62/426,821 entitled “Balanced Reed-Solomon Codes” to W. Halbawi et al., filed Nov. 28, 2016. The disclosure of U.S. Provisional Patent Application Ser. No. 62/426,821 is herein incorporated by reference in its entirety.

STATEMENT OF FEDERALLY SPONSORED RESEARCH

This invention was made with government support under Grant No(s). CNS0932428 & CCF1018927 & CCF1423663 & CCF1409204 awarded by the National Science Foundation. The government has certain rights in the invention.

Related Publications (1)
Number Date Country
20180152204 A1 May 2018 US
Provisional Applications (1)
Number Date Country
62426821 Nov 2016 US