Exact repair regenerating codes for distributed storage systems

Information

  • Patent Grant
  • Patent Number
    11,513,898
  • Date Filed
    Friday, June 19, 2020
  • Date Issued
    Tuesday, November 29, 2022
Abstract
A distributed storage system includes a plurality of nodes comprising a first node, wherein a total number of nodes in the distributed storage system is represented by n, wherein a file stored in the distributed storage system is recovered from a subset of a number of nodes represented by k upon a file failure on a node in the distributed storage system, and wherein a failed node in the plurality of nodes is recovered from a number of helper nodes of the plurality of nodes represented by d. Upon detecting a failure in the first node, each helper node of the number of helper nodes is configured to determine a repair-encoder matrix, multiply a content matrix by the repair-encoder matrix to obtain a repair matrix, extract each linearly independent column of the repair matrix, and send the linearly independent columns of the repair matrix to the first node.
Description
TECHNICAL FIELD

The disclosure relates to failure recovery for distributed storage systems.


BACKGROUND

The dynamic, large, and disparate volume of data garnered from social media, Internet-driven technologies, financial records, and clinical research has created an increasing demand for reliable and scalable storage technologies. Distributed storage systems are widely used in modern data centers. In distributed storage systems, individual storage nodes are often unreliable due to various hardware and software failures. Hence, redundancy is introduced to improve the system's reliability in the presence of node failures. The simplest form of redundancy is the replication of the data in multiple storage nodes. Even though it is the most common form of redundancy, replication is very inefficient in terms of the offered reliability gain per cost of the extra storage units required to store the redundancy. In this context, coding techniques have provably achieved orders of magnitude more reliability than replication for the same redundancy. Besides the reliability offered by storing redundant data, in order to be durable, a storage system must be able to repair failed nodes. The repair process consists of downloading (part of) the content of a number of surviving nodes to reconstruct the content of the failed nodes. Conventional erasure codes suffer from a high repair bandwidth due to the total size of the data that must be downloaded to repair each failed node. Regenerating codes are a class of erasure codes that have gained popularity in this context due to their low repair bandwidth while providing the same level of fault tolerance as erasure codes.





BRIEF DESCRIPTION OF DRAWINGS


FIG. 1 is a conceptual diagram illustrating an example achievable trade-off of the techniques described herein.



FIG. 2 is a conceptual diagram illustrating an example comparison of the maximum file size (F) of two codes, in accordance with the techniques described herein.



FIG. 3 is a conceptual diagram illustrating an example linear trade-off for a system with d=4, together with achievable corner points, in accordance with the techniques described herein.



FIG. 4 is a conceptual diagram illustrating an example normalized repair bandwidth (by per-node storage size) for multiple failures with e failed nodes, for an (n, 10, 10) determinant code operating at mode m=3, i.e., (α, β, F)=(120, 36, 990), in accordance with the techniques described herein.



FIG. 5 is a conceptual diagram illustrating an example cascading of determinant codes, in accordance with the techniques described herein.



FIG. 6 is a conceptual diagram illustrating an example symbol injection, in accordance with the techniques described herein.



FIG. 7 is a conceptual diagram illustrating an example hierarchical tree for an (n, k=4, d=6; μ=4) code, in accordance with the techniques described herein.



FIG. 8 is a conceptual diagram illustrating an example participation of the helper nodes in the multiple failure repair of the super-code, in accordance with the techniques described herein.



FIG. 9 is a block diagram of one example of a more detailed view of a node device that may be configured to perform one or more techniques in accordance with the current disclosure.



FIG. 10 is a flow diagram of one or more techniques of the current disclosure.





DETAILED DESCRIPTION

A novel coding scheme for exact-repair regenerating codes is presented in this application. The codes proposed in this application can trade between the repair bandwidth of nodes (the number of symbols downloaded from each surviving node in a repair process) and the required storage overhead of the system. These codes work for general system parameters (n, k, d): the total number of nodes, the number of nodes that suffice for data recovery, and the number of helper nodes in a repair process, respectively. The proposed construction offers a unified scheme to develop exact-repair regenerating codes for the entire trade-off, including the MBR and MSR points. The new storage-vs.-bandwidth trade-off achieved by the disclosed codes may be optimum. Some other key features of this code include: the construction is linear, the required field size is only Θ(n), and the (unnormalized) code parameters (and in particular the sub-packetization level) are at most (d−k+1)^k, which is independent of the number of parity nodes. Moreover, the proposed repair mechanism is helper-independent; that is, the data sent from each helper only depends on the identity of the helper and failed nodes, and is independent of the identity of the other helper nodes participating in the repair process.


While individual storage units in distributed storage systems (DSS) are subject to temporary or permanent failure, the entire system may be designed to avoid losing the stored data. Coding and storing redundant data is an approach to guarantee durability in such systems. Moreover, these systems are equipped with a repair mechanism that allows for the replacement of a failed node. Such replacement can be performed in the functional or exact sense. In functional repair, a failed node may be replaced by another one so that the resulting family of nodes maintains the data recovery and node-repair properties. In an exact repair process, the content of a failed node may be exactly replicated by the helpers.


Regenerating codes were introduced to manage the data recovery and node repair mechanisms in DSS. Formally, an (n, k, d) regenerating code with parameters (α, β, F) encodes a file comprised of F symbols (from a finite field 𝔽) into n segments (nodes) W1, W2, . . . , Wn, each of size α, such that two important properties are fulfilled: (1) the entire file can be recovered from every subset of k nodes, and (2) whenever a node fails (becomes inaccessible), it can be repaired by accessing d remaining nodes and downloading β symbols from each.


It turns out that there is a fundamental trade-off between the minimum required per-node storage α and the repair bandwidth β needed to store a given amount of data F in a DSS. This trade-off is fully characterized and is achievable by random network coding. However, for the exact-repair problem, which is notably important from the practical perspective, characterization of the trade-off and design of optimum codes are widely open, except for some special cases. Construction of exact-repair regenerating codes for a system with arbitrary parameters (n, k, d) is a complex task due to several combinatorial constraints to be satisfied. The number of such constraints dramatically increases with n, the total number of nodes in the system.


In an (n, k, d) regenerating code, a file comprised of F data symbols, each from a finite field 𝔽, is encoded into n pieces, and each piece may be stored on one storage node of capacity α symbols. The stored data in the nodes may maintain two main properties:


1. Data Recovery: By accessing any set of k nodes, the data collector is able to recover the original stored file.


2. Node Repair: In the event of a node failure, the content of the failed node can be regenerated by connecting to any subset ℋ of nodes of size |ℋ|=d, and downloading β symbols from each. The set ℋ is called the set of helper nodes.


There is tension between the two properties: while large parity groups are preferred for efficient data recovery, the repair process requires more downloads if the parity groups including each missing symbol are large. This is due to the fact that every symbol in a parity equation, except the missing one, may be downloaded before retrieving the missing one. More formally, it is shown that there is a trade-off between the per-node storage capacity α and the per-node repair bandwidth β in a storage system that can guarantee the main properties for a file of size F. While it is desired to minimize both α and β, one can be reduced only at the cost of increasing the other.


There are two types of node repair: (i) functional repair, where a failed node may be replaced by a new node such that the resulting system continues to satisfy the data collection and node-repair properties. An alternative to functional repair is (ii) exact repair, under which the replacement node stores precisely the same content as the failed node. Hence, exact repair is a more demanding criterion, and it is expected to require more repair bandwidth than functional repair for a given storage size. However, from a practical standpoint, exact repair is preferred, since it does not need the extra overhead of updating the system configuration.


The regenerating codes for (n, k, d) distributed storage systems are studied using information flow graphs. Moreover, using the cut-set bound, it is shown that the per-node storage capacity α, the per-node repair bandwidth β, and the file size F may satisfy

F ≤ Σ_{i=1}^{k} min{ α, (d − i + 1) β },    (1)
for a storage system that maintains the data recovery and node repair properties. This bound implies a trade-off between α and β for a given F. This trade-off is shown to be achievable for functional repair using random codes. An important follow-up question was whether the same trade-off (1) is achievable with the exact-repair property. First, it was shown that exact-repair regenerating codes can be constructed for the extreme points of the trade-off, namely, the minimum bandwidth regeneration (MBR) point, referring to the minimum β satisfying (1), and the minimum storage regeneration (MSR) point, referring to the minimum α for which (1) can be satisfied for a given F. Later, it was shown that some of the interior (between the two extremes) points of the trade-off are not achievable under the exact-repair criterion. While the proof does not rule out the possibility of approaching the non-achievable trade-off points with an arbitrarily small gap, the next question was whether there is a non-vanishing gap between the trade-offs of exact-repair and functional-repair codes. This question was first answered using a computer-aided approach based on information-theoretic inequalities, which completely characterized the trade-off for an (n, k, d)=(4,3,3) system. Note that (4,3,3) is the smallest system for which there is a non-vanishing gap between the functional and exact repair trade-offs.
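As a sanity check, the right-hand side of (1) is easy to evaluate numerically. The sketch below is a minimal illustration (the function name is ours, not from the disclosure): it computes the cut-set bound on the file size F for given α, β, k, and d.

```python
def cutset_bound(alpha, beta, k, d):
    """Right-hand side of bound (1): the maximum file size F that per-node
    storage alpha and per-node repair bandwidth beta can support."""
    return sum(min(alpha, (d - i + 1) * beta) for i in range(1, k + 1))

# Example for a (k, d) = (3, 3) system with alpha = 3, beta = 1:
# min(3, 3) + min(3, 2) + min(3, 1) = 6 retrievable symbols.
print(cutset_bound(3, 1, 3, 3))  # 6
```

For a fixed F, sweeping α and β over the pairs that meet the bound with equality traces out the storage-vs.-bandwidth trade-off curve discussed above.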


Thereafter, the attention of the data storage community shifted to characterizing the optimum storage-bandwidth trade-off for exact-repair regenerating codes. A trade-off characterization is two-fold: (i) designing code constructions that have the data recovery and exact node repair properties, achieving pairs of (α, β), and (ii) proving information-theoretic arguments that provide lower bounds for the achievable pairs of (α, β). The focus of this disclosure is on the code construction part, and hence, a brief review of the existing code constructions is provided. To this end, there are three main categories based on the achievable trade-offs, as follows.


1. The MBR point: This point was fully solved for general (n, k, d), where it is shown that the functional-repair trade-off is also achievable under the exact-repair criterion.


2. The MSR point: Most of the existing code constructions are dedicated to the MSR point. A computer search was carried out to find an (n=5, k=3, d=4) MSR code. The code constructions for general (n, k, d) parameters can be divided into (i) the regime of low-rate codes, where k/n≤½, and (ii) the practically more important regime of high-rate codes, where k/n>½. The proposed construction for parameters satisfying d>2k−2 includes two steps: first a code is developed with d′=2k′−2, and then converted to a code with the desired (k, d) parameters. The code construction was later unified for all parameters d≥2k−2. The code construction for the high-rate codes with d<2k−2 remained an open problem.


It is shown that the exact-repair MSR code (for both the low-rate and high-rate regimes) is achievable in the asymptotic sense, that is, when the file size grows unboundedly. However, the proof was existential and no explicit code construction was provided.


Explicit code construction for the high-rate regime is a more challenging problem, and several disclosures have appeared to improve the state-of-the-art. Some of these code constructions are limited to the repair of only systematic nodes. Another category of code constructions is dedicated to codes with a limited number of parity nodes. In particular, Hadamard matrices are used to construct the first explicit MDS storage code with only 2 parities (n=k+2) that offers an optimum repair bandwidth for the repair of any single node failure, including the parities.


The sub-packetization level, referring to the unnormalized value of α in terms of the number of symbols, is a practically important parameter of any code construction. While it is preferred to minimize the sub-packetization level, it cannot be lower than a certain lower bound. A class of MSR codes is introduced which requires only polynomial sub-packetization in k, but a very large field size. Nevertheless, the proposed construction is not fully explicit, and it is limited to parameters satisfying d=n−1. The latter restriction was later relaxed, where the same result is shown for an arbitrary number of helper nodes.


Several MSR code constructions for arbitrary parameters (n, k, d) have recently been proposed. They are all optimum in terms of the storage vs. repair-bandwidth trade-off, and all achieve the optimum sub-packetization matching the bound. The proposed codes offer dynamic repair, where the number of helpers can be varied flexibly between k and n−1.


The codes proposed in this work cover the entire trade-off, including the MSR point. In the MSR codes resulting from the proposed construction, the code parameters, and especially the sub-packetization level, do not depend on the total number of nodes n. Another advantage of this construction is its flexibility for system expansion by adding new parity nodes.


In contrast to other existing codes in the literature, where the entire code needs to be redesigned if the total number of nodes increases, the proposed construction allows for adding new parity nodes to the system without changing the content of the other nodes. A comparison between the MSR code obtained from this construction and the existing codes, in terms of the required field size and the sub-packetization level, is presented in Table 1.









TABLE 1

Comparison between code constructions proposed for the MSR point.

  Code                              Sub-packetization level α    Field size q                          Parameters
  Ex. 1                             (n − k)^n                    q ≥ (n − k)n                          n = d + 1
  Ex. 2                             (n − k)^(n−1)                q ≥ n                                 n = d + 1
  Ex. 3                             (d − k + 1)^n                q ≥ sn for s = lcm(1, 2, . . ., n − k)  d < n − 1
  Ex. 4                             (n − k)^⌈n/(n−k)⌉            q ≥ (n − k)^⌈n/(n−k)⌉                 n = d + 1
  Ex. 5                             (n − k)^⌈n/(n−k)⌉            q ≥ (n − k)^⌈n/(n−k)⌉                 n = d + 1
  Ex. 6                             (n − k)^⌈n/(n−k)⌉            q ≥ n                                 n = d + 1
  The techniques described herein   (d − k + 1)^k                q ≥ n                                 All

3. Interior points: The constructions for the interior points (trade-off points other than MBR and MSR) were restricted to specific system parameters. An optimum code construction for (n=d+1, k=d, d) was presented. The achievable trade-off is shown to be optimum under the assumption of the linearity of the code. However, it was not clear whether the same trade-off is achievable for n>d+1. Most of the follow-up efforts to increase the number of parity nodes resulted in compromising the system capacity to construct the code for larger values of n, and hence their trade-off diverges from the lower bound as n increases. The first n-independent achievable trade-off for interior points showed that, for any (n, k=d, d) system, the trade-off corner point next to MSR can be achieved. However, the proof is just an existence proof, where a random ensemble of codes is introduced, and it is shown that for any n, there exists at least one code in the ensemble that satisfies both the data recovery and node repair properties.


The above-mentioned restriction on n was lifted by an explicit code construction for the entire trade-off of an (n, k=d, d) storage system. The proposed determinant codes are optimal and achieve the lower bound, regardless of the total number of nodes. However, the repair process of the determinant codes requires heavy computation, and more importantly, the repair data sent from a helper node to a failed node depends on the identity of the other helpers participating in the repair process. This issue was resolved by introducing a new repair mechanism.


The next set of works focuses on breaking the last constraint, i.e., k=d. An explicit code construction was introduced for an (n, k≤d, d) system. The resulting trade-off was improved by a subsequent code construction. A class of improved layered codes was introduced. However, it turns out that the resulting trade-off is only optimum for the corner point next to the MBR, and only for an (n=d+1, k, d) system. The techniques described herein generalize the parameter d to be any free parameter. However, that construction is dedicated to a trade-off point next to the MBR, implying a low repair bandwidth. The result was extended to the entire trade-off, but only for an (n, k=d−1, d) system. The techniques described herein include a code construction for general (n, k, d) parameters for the entire trade-off with a better resulting trade-off.


Determinant codes are a family of exact-repair regenerating codes for a DSS with parameters (n, k=d, d). The main property of these codes is that they maintain a constant trade-off between α/F and β/F, regardless of the number of nodes. In particular, these codes achieve the lower bound, and hence they are optimum. The determinant codes have a linear structure and can be obtained from the inner product between an encoder matrix and the message matrix. In particular, the product-matrix codes for the MBR and MSR points can be subsumed by the general construction of the determinant codes.


The repair mechanism originally proposed for the determinant codes requires a rather heavy computation at the helper nodes in order to prepare the repair symbols to send to the failed node. More importantly, each helper node h ∈ ℋ needs to know the identity of all the other helper nodes participating in the repair process. The assumption of knowing the set of helpers in advance is a limitation of the determinant codes, and it is undesirable in real-world systems. In practice, it is preferable that once a request for the repair of a failed node is made, each node can independently decide whether or not to participate in the repair process and generate the repair data from its content, regardless of the other helper nodes.


On the other hand, besides the repair bandwidth, one of the crucial bottlenecks in the performance of storage systems is the I/O load, which refers to the amount of data to be read by a helper node to encode for a repair process. While the native constructions for exact-repair regenerating codes require a heavy I/O read, the repair-by-transfer (RBT) codes offer an optimum I/O. An elegant modification is proposed to improve the I/O cost of product-matrix MSR codes by pre-processing the content of the nodes and storing the repair data on non-systematic nodes instead of the original node content. This results in a semi-RBT code: whenever such modified nodes contribute to a repair process, they merely transfer some of their stored symbols without any computation. Such a modification could not be applied to the original determinant codes, since the repair symbols from a helper node h to a failed node f could be computed only when the set ℋ of other helper nodes is identified.


The techniques described herein propose a novel repair mechanism for the determinant codes. In the new repair procedure, the repair symbols sent from a helper node h to a failed node f solely depend on the content of the helper node and the identity of the failed node f. The failed node collects a total of dβ repair symbols from the helper nodes and can reconstruct all of its missing symbols by simple addition and subtraction of some of the received symbols. This simple repair scheme further allows for modifications to further improve the I/O overhead of the code.


The equations used throughout this disclosure use lowercase letters to denote numbers (e.g., integers k and d or Galois field numbers v and w), and bold symbols to denote matrices. For positive integers a and b, denote the set {a, a+1, . . . , b−1, b} by [a: b], and the set {1, 2, . . . , b} by [b]. Hence, for a>b, [a: b]=∅. Script-font letters (e.g., 𝒜 and ℬ) denote sets of integers. Hence, x ∈ 𝒜 implies that the integer x belongs to the set 𝒜. The largest entry of a set 𝒜 is denoted by max 𝒜, and the maximum of an empty set is defined as max ∅=−∞ for consistency. For two sets 𝒜 and ℬ with |𝒜|=|ℬ|, write 𝒜 ⪯ ℬ to indicate the lexicographical order between 𝒜 and ℬ, e.g., {1,2,3} ⪯ {1,2,4} ⪯ {1,2,5} ⪯ {1,3,4} ⪯ {1,3,5}. For an integer x and a set 𝒥, define











ind_𝒥(x) = |{ y ∈ 𝒥 : y < x }|.    (2)
For a matrix X and a set 𝒜, denote the sub-matrix of X obtained from the rows in 𝒜 by X[𝒜, :].
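Definition (2) translates directly into code. The following sketch (the function name mirrors the notation and is illustrative, not from the disclosure) counts the members of 𝒥 that are smaller than x.

```python
def ind(J, x):
    """ind_J(x) = |{y in J : y < x}|, as in equation (2)."""
    return sum(1 for y in J if y < x)

# Example: within J = {2, 5, 7}, the element x = 5 has exactly one smaller member.
print(ind({2, 5, 7}, 5))  # 1
```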


Note that k and d (with k≤d) are the main system parameters throughout the disclosure. For a fixed k and a subset 𝒜⊆[d], define 𝒜̄=𝒜∩[k] and 𝒜̂=𝒜∩[k+1: d]. Similarly, for a matrix P with d rows, define P̄ and P̂ to be the sub-matrices of P obtained from stacking the top k and the bottom (d−k) rows of P, respectively.


The second contribution of this work is the simultaneous repair of multiple failures. Although single failures are the dominant type of failure in distributed storage systems, multiple simultaneous failures occur rather frequently and need to be handled in order to maintain the system's reliability and fault tolerance. The naive approach to dealing with such failures is to repair each failed node individually and independently from the others. This requires a repair bandwidth from each helper node that scales linearly with the number of failures. Two types of repair for multiple failures have been studied: (i) centralized regenerating codes and (ii) cooperative regenerating codes. In centralized regenerating codes, a single data center is responsible for the repair of all failed nodes. More precisely, once a set of e nodes in the system fails, an arbitrary set of d≤n−e helper nodes is chosen, and βe repair symbols may be downloaded from each helper node. This leads to a total of d·βe symbols, which may be used to repair the content of all the failed nodes. The storage-bandwidth trade-off of these codes is studied for two extreme points, namely the minimum storage multi-node repair (MSMR) and the minimum bandwidth multi-node repair (MBMR) points. In particular, a class of MSMR codes is introduced that is capable of repairing any number of failed nodes e≤n−k from any number of helper nodes k≤d≤n−e, using an optimal repair bandwidth. In cooperative regenerating codes, upon the failure of a node, the replacement node downloads repair data from a subset of d helper nodes. In the case of multiple failures, the replacement nodes not only download repair data from the helper nodes, but also exchange information among themselves before regenerating the lost data, and this exchanged data is included in the repair bandwidth.
Similar to the centralized case, the trade-off for these codes is studied for the two extreme points, namely the minimum bandwidth cooperative regenerating (MBCR) codes and the minimum storage cooperative regenerating (MSCR) codes. In particular, explicit constructions of MDS codes with optimal cooperative repair for all possible parameters were introduced. It has also been shown that any MDS code with optimal repair bandwidth under the cooperative model also has optimal bandwidth under the centralized model.


The techniques described herein show that the repair bandwidth required for the repair of multiple failures in determinant codes can be reduced by exploiting two facts: (i) the overlap between the repair spaces (linear dependency between the repair symbols) that each helper node sends to the set of failed nodes, and (ii) in centralized repair, the data center (responsible for the repair process) can perform the repair of the nodes in a sequential manner and utilize already repaired nodes as helpers for the repair of the remaining failed nodes. Interestingly, using these properties, the techniques described herein can limit the maximum (normalized) repair bandwidth of the helper nodes to a certain fraction of α, regardless of the number of failures. The structure of the code allows for analyzing this overlap and obtaining a closed-form expression for the repair bandwidth. The codes are not restricted to the extreme points of the trade-off and can operate at any intermediate point on the optimum trade-off. In a similar problem, a class of codes is introduced to operate at the intermediate points of the trade-off with an improved repair bandwidth for multiple failures. However, this improvement is obtained at the price of a degradation of the system's storage capacity as n (the total number of nodes) increases. Consequently, the resulting codes designed for two or more simultaneous failures are sub-optimum and cannot achieve the optimum trade-off between the per-node capacity, the repair bandwidth, and the overall storage capacity. One of the main advantages of the proposed code and repair mechanism is to offer a universal code, which provides a significant reduction in the repair bandwidth for multiple failures without compromising the system performance.


Throughout the analysis, the techniques described herein frequently need to concatenate several matrices, that is, merging a number of matrices with the same number of rows side-by-side, to form a fat matrix with the same number of rows.


The main contribution of this disclosure is a novel construction for exact-repair regenerating codes, with arbitrary parameters (n, k, d). The following theorem characterizes the achievable storage vs. repair-bandwidth trade-off of the proposed code construction.


The techniques described herein use [k+1: d] to denote the set of integers {k+1, . . . , d}, and [k]=[1: k] to represent the set {1, 2, . . . , k}. For a set 𝒜 and a member x ∈ 𝒜, define ind_𝒜(x)=|{y ∈ 𝒜 : y≤x}|. Boldface symbols refer to matrices, and for a matrix X, its i-th row is denoted by Xi. The techniques described herein also use the notation X:,j to refer to the j-th column of X. Moreover, X[𝒜, ℬ] denotes the sub-matrix of X obtained from rows i ∈ 𝒜 and columns j ∈ ℬ. Accordingly, X[𝒜, :] denotes the sub-matrix of X obtained by stacking the rows i ∈ 𝒜. Moreover, the techniques described herein may use sets to label the rows and/or columns of a matrix, and hence X𝒜,ℬ refers to the entry of matrix X at the row indexed by 𝒜 and the column labeled by ℬ. Finally, for a set 𝒜, the maximum entry of 𝒜 is denoted by max 𝒜.


The optimum storage repair-bandwidth trade-off of the exact-repair regenerating codes for an (n, k=d, d) system is a piece-wise linear function, which is fully characterized by its corner (intersection) points. The determinant codes provide a universal construction for all corner points on the optimum trade-off curve. The techniques described herein assign a mode (denoted by m) to each corner point, which is an integer in {1, 2, . . . , d} (from 1 for the MBR point to d for the MSR point). The main distinction between the result of the techniques described herein and previous results is the fact that the repair data sent by one helper node does not depend on the identity of the other helper nodes participating in the repair process. The following definition formalizes this distinction.


Theorem 1: For a distributed storage system with arbitrary parameters (n, k, d), the triple (α, β, F) defined as











α(d, k; μ) = Σ_{m=0}^{μ} (d − k)^{μ−m} C(k, m)

β(d, k; μ) = Σ_{m=0}^{μ} (d − k)^{μ−m} C(k − 1, m − 1)

F(d, k; μ) = Σ_{m=0}^{μ} k (d − k)^{μ−m} C(k, m) − C(k, μ + 1),    (3A)

with C(a, b) denoting the binomial coefficient "a choose b" (taken to be 0 for b < 0 or b > a),
is achievable for μ∈{1, . . . , k}.
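The triples in (3A) can be checked numerically. The sketch below (the function name is ours) evaluates (3A) with math.comb supplying the binomial coefficients; for (k, d)=(4, 6) at μ=2 it reproduces the (α, β, F)=(18, 5, 68) operating point that the discussion of FIG. 2 quotes for the cascade code.

```python
from math import comb  # comb(a, b) = 0 when b > a, matching the convention in (3A)

def cascade_point(d, k, mu):
    """(alpha, beta, F) of the corner point mu in Theorem 1, equation (3A)."""
    alpha = sum((d - k) ** (mu - m) * comb(k, m) for m in range(mu + 1))
    # C(k-1, m-1) vanishes at m = 0, so the beta sum starts from m = 1.
    beta = sum((d - k) ** (mu - m) * comb(k - 1, m - 1) for m in range(1, mu + 1))
    F = sum(k * (d - k) ** (mu - m) * comb(k, m) for m in range(mu + 1)) - comb(k, mu + 1)
    return alpha, beta, F

print(cascade_point(6, 4, 2))  # (18, 5, 68): the FIG. 2 operating point
print(cascade_point(6, 4, 4))  # (81, 27, 324): mu = k gives the MSR point, alpha = (d-k+1)**k
```

Setting k = d makes every (d − k) factor vanish except at m = μ, so the sums collapse to single terms and the formulas reduce to the determinant-code corner points of equation (3B).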


Definition 1: Consider the repair process of a failed node f using a set ℋ of helper nodes. The repair process is called helper-independent if the repair data sent by each helper node h ∈ ℋ to the failed node f only depends on f and the content of node h (but not on the other helpers participating in the repair process).


The following theorem formally states the trade-off achievable by determinant codes.


Theorem 1: For an (n, k=d, d) distributed storage system and any mode m=1, 2, . . . , d, the triple (α, β, F) with










(α(m), β(m), F(m)) = ( C(d, m), C(d − 1, m − 1), m C(d + 1, m + 1) )    (3B)
can be achieved under helper-independent exact repair by the code construction proposed in this disclosure. The proposed trade-off is shown in FIG. 1, and is compared against that of other existing codes.
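Equation (3B) is likewise easy to evaluate. This short sketch (illustrative naming) computes the mode-m corner point of a determinant code and reproduces the (α, β, F)=(120, 36, 990) triple that the description of FIG. 4 quotes for a (k, d)=(10, 10) system at mode m=3.

```python
from math import comb

def determinant_point(d, m):
    """(alpha, beta, F) of mode m for an (n, k=d, d) determinant code, eq. (3B)."""
    return comb(d, m), comb(d - 1, m - 1), m * comb(d + 1, m + 1)

print(determinant_point(10, 3))  # (120, 36, 990), as quoted for FIG. 4
print(determinant_point(6, 1))   # mode 1 is the MBR point for d = 6
```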



FIG. 1 is a conceptual diagram illustrating the achievable trade-off of the proposed code for (n=8, k=4, d=6), and its comparison to other known codes with these parameters. The proposed trade-off has k=4 corner points, enumerated with μ=1, 2, 3, 4.


The novel repair process presented here has the advantage that the repair data sent by a helper node does not depend on the identity of other helpers participating in the repair process. Moreover, the techniques described herein present a repair mechanism for multiple simultaneous failures. The proposed scheme exploits the overlap between the repair data sent for different failed nodes and offers a reduced repair-bandwidth compared to naively repairing the failed nodes independent of each other.


The code construction is reviewed in this disclosure for completeness. In order to prove Theorem 1, it suffices to show that the proposed code satisfies the two fundamental properties, namely data recovery and exact node repair. The proof of the data recovery property is similar to that of Proposition 1, and is hence omitted here. The exact-repair property is formally stated in Proposition 2. Moreover, Proposition 1 shows that the repair bandwidth of the proposed code does not exceed β^(m). This is also proved in this disclosure.



FIG. 2 is a conceptual diagram illustrating a comparison of the maximum file size (F) of two codes, in accordance with the techniques described herein. For example, FIG. 2 illustrates a comparison of the maximum file size (F) of two (n≥7, k=4, d=6) exact-regenerating codes with code parameters (α, β)=(18, 5). When the distributed storage system has only n=d+1=7 nodes, both codes can store F=68 units of data. However, for a sufficiently large field size, the storage capacity of the existing code decays as a function of n, while the storage capacity is preserved for the cascade code described herein. In the cascade code construction, the pair (α, β)=(18, 5) corresponds to a code construction with μ=2.



FIG. 2 compares the scalability aspect of the code construction described herein with an existing construction. The term scalability refers to the property that the number of nodes in a distributed storage system can be increased (for a sufficiently large field size) without compromising the system performance, and its overall capacity.



FIG. 3 is a conceptual diagram in which the linear trade-off for a system with d=4 is depicted together with the achievable corner points of this disclosure.


Remark 1 Theorem 1 offers an achievable trade-off for the normalized parameters (α/F, β/F) given by










(α^(m)/F^(m), β^(m)/F^(m)) = ( C(d, m) / (m C(d+1, m+1)), C(d−1, m−1) / (m C(d+1, m+1)) ) = ( (m+1)/(m(d+1)), (m+1)/(d(d+1)) ).  (3C)







It is shown that for any linear exact-repair regenerating code with parameters (n, k=d, d) that is capable of storing F symbols, (α, β) may satisfy










F ≤ (d+1)/(ℓ+2) · ( ℓ α + d/(ℓ+1) β ),  (3D)







where ℓ=⌊dβ/α⌋ takes values in {0, 1, . . . , d}. This establishes a piece-wise linear lower bound curve, with d (normalized) corner points obtained at integer values of ℓ=dβ/α. For these corner points, the (normalized) operating points (α/F, β/F) are given by







(α/F, β/F) = ( (ℓ+1)/(ℓ(d+1)), (ℓ+1)/(d(d+1)) ).





These operating points match the achievable (normalized) pairs given in (2). Therefore, determinant codes are optimal, and together with the lower bound they fully characterize the optimum trade-off for exact-repair regenerating codes with parameters (n, k=d, d). The next result of this disclosure provides an achievable bandwidth for multiple repairs.
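The match between the achievable corner points and the converse can be checked numerically. The sketch below (ours, with illustrative names) verifies the closed forms in (3C) with exact rational arithmetic:

```python
from fractions import Fraction
from math import comb

def normalized_point(d: int, m: int):
    """Normalized (alpha/F, beta/F) of the mode-m determinant code."""
    F = m * comb(d + 1, m + 1)
    return Fraction(comb(d, m), F), Fraction(comb(d - 1, m - 1), F)

# Closed forms from (3C): ((m+1)/(m(d+1)), (m+1)/(d(d+1))).
for d in range(2, 9):
    for m in range(1, d + 1):
        a, b = normalized_point(d, m)
        assert a == Fraction(m + 1, m * (d + 1))
        assert b == Fraction(m + 1, d * (d + 1))
print("normalized corner points match (3C)")
```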


Theorem 2: In an (n, k=d, d) determinant code operating at mode m, the content of any set of e simultaneously failed nodes can be exactly repaired by accessing an arbitrary set of d nodes and downloading







β_e^(m) = C(d, m) − C(d−e, m)






repair symbols from each helper node.


The repair mechanism for multiple failures is similar to that of single failure presented in Proposition 2. In order to prove Theorem 2, it suffices to show that the repair bandwidth required for multiple failures does not exceed βe(m). This is formally stated in Proposition 3 and proved in this disclosure.
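A one-line sketch (not from the patent) of the Theorem 2 bandwidth, including the e=1 sanity check of Remark 2:

```python
from math import comb

def multi_failure_bandwidth(d: int, m: int, e: int) -> int:
    """Per-helper repair bandwidth for e simultaneous failures (Theorem 2).
    math.comb returns 0 when d - e < m, matching the convention C(a, b) = 0."""
    return comb(d, m) - comb(d - e, m)

d, m = 10, 3  # the (n, 10, 10) mode-3 code of FIG. 4, with beta = 36
assert multi_failure_bandwidth(d, m, 1) == comb(d - 1, m - 1)  # Remark 2
print([multi_failure_bandwidth(d, m, e) for e in range(1, 5)])  # → [36, 64, 85, 100]
```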



FIG. 4 is a conceptual diagram illustrating a repair bandwidth (normalized by per-node storage size) for multiple failures with e failed nodes, for an (n, 10, 10) determinant code operating at mode m=3, i.e., (α, β, F)=(120, 36, 990).


Remark 2: Note that the repair bandwidth proposed for multiple repairs in Theorem 2 subsumes the one in Theorem 1 for single failure for setting e=1:







β_1^(m) = C(d, m) − C(d−1, m) = C(d−1, m−1) = β^(m).







Remark 3: It is worth mentioning that the repair-bandwidth proposed in Theorem 2 is universally and simultaneously achievable. That is, the same determinant code can simultaneously achieve βe(m) for every e∈{1, 2, . . . , n−d}.


The next theorem shows that the repair bandwidth for multiple failures can be further reduced in the centralized repair setting, by a sequential repair mechanism, and exploiting the repair symbols contributed by already repaired failed nodes which can act as helpers.


Theorem 3: In an (n, k=d, d) determinant code with (up to a scalar factor) parameters








α^(m) = C(d, m) and F^(m) = m C(d+1, m+1),





any set of e simultaneously failed nodes can be centrally repaired by accessing an arbitrary set of d helper nodes and downloading a total of











β̄_e^(m) = (1/d) [ m C(d+1, m+1) − m C(d−e+1, m+1) ]  (3E)







repair symbols from each helper node.


Remark 4: It is worth noting that for







e > d−m,

β̄_e^(m) = (m/d) C(d+1, m+1) = F^(m)/d (independent of e),

and

β̄_e^(m) / α^(m) = m(d+1) / (d(m+1)),




which is strictly less than 1 as shown in FIG. 4 (for all corner points except the MSR point, m=d). The fact that β̄_e^(m)=F^(m)/d implies that the helper nodes contribute just enough repair symbols to recover the entire file, without sending any redundant data. It is clear that this repair-bandwidth is optimum for e≥d, since such a set of e failed nodes may be able to recover the entire file after being repaired.


This theorem is built on the result of Theorem 2, by exploiting the fact that repair data can be exchanged among the failed nodes. Note that in the centralized repair setting, the information exchanged among the failed nodes at the repair center is not counted against the repair bandwidth.
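The centralized bound (3E) and its saturation at F/d (Remark 4) can be sketched as follows (ours; names illustrative):

```python
from fractions import Fraction
from math import comb

def centralized_bandwidth(d: int, m: int, e: int) -> Fraction:
    """Per-helper bandwidth for centralized repair of e failures, eq. (3E)."""
    total = m * comb(d + 1, m + 1) - m * comb(d - e + 1, m + 1)
    return Fraction(total, d)

d, m = 10, 3
F = m * comb(d + 1, m + 1)
# Once e > d - m the second binomial vanishes, and the bound saturates at
# F/d: the helpers send exactly enough to reconstruct the whole file.
for e in range(d - m + 1, d + 1):
    assert centralized_bandwidth(d, m, e) == Fraction(F, d)
print("centralized bandwidth saturates at F/d for e > d - m")
```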


Note that for a given system parameter (n, k, d), a total of k triples of (α, β, F) can be obtained from (3), by changing the parameter μ. The following corollary shows that MBR and MSR points are subsumed as special points of the proposed trade-off.


Corollary 1: The code construction of this disclosure achieves the MBR point for μ=1, and the MSR point for μ=k.


Proof: For the MBR point, corresponding to μ=1,











α_MBR ≜ α(d, k; 1) = Σ_{m=0}^{1} (d−k)^{1−m} C(k, m) = (d−k) + k = d,

β_MBR ≜ β(d, k; 1) = Σ_{m=0}^{1} (d−k)^{1−m} C(k−1, m−1) = 1,

F_MBR ≜ F(d, k; 1) = Σ_{m=0}^{1} k (d−k)^{1−m} C(k, m) − C(k, 2) = kd − C(k, 2) = k(2d−k+1)/2.  (4)







This triple satisfies








(α_MBR/F_MBR, β_MBR/F_MBR) = ( 2d/(k(2d−k+1)), 2/(k(2d−k+1)) ),





which is the characteristic of the MBR point. Similarly, for μ=k
















α_MSR ≜ α(d, k; k) = Σ_{m=0}^{k} (d−k)^{k−m} C(k, m) = (d−k+1)^k,

β_MSR ≜ β(d, k; k) = Σ_{m=0}^{k} (d−k)^{k−m} C(k−1, m−1) = (d−k+1)^{k−1},

F_MSR ≜ F(d, k; k) = Σ_{m=0}^{k} k (d−k)^{k−m} C(k, m) − C(k, k+1) = k(d−k+1)^k,  (5)







which satisfy








(α_MSR/F_MSR, β_MSR/F_MSR) = ( 1/k, 1/(k(d−k+1)) ),





which is the identity of the MSR point.
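Corollary 1 can be checked numerically from the general expressions in (3A). The sketch below (ours) evaluates the triple for μ=1 and μ=k and compares against the MBR and MSR closed forms; it also recovers the (18, 5, 68) point of FIG. 2:

```python
from math import comb

def abf(d: int, k: int, mu: int):
    """General (alpha, beta, F) of eq. (3A) at mode mu.
    The beta sum starts at m = 1 since C(k-1, -1) = 0 by convention."""
    alpha = sum((d - k) ** (mu - m) * comb(k, m) for m in range(mu + 1))
    beta = sum((d - k) ** (mu - m) * comb(k - 1, m - 1)
               for m in range(1, mu + 1))
    F = (sum(k * (d - k) ** (mu - m) * comb(k, m) for m in range(mu + 1))
         - comb(k, mu + 1))
    return alpha, beta, F

d, k = 6, 4
assert abf(d, k, 1) == (d, 1, k * (2 * d - k + 1) // 2)            # MBR point
assert abf(d, k, k) == ((d - k + 1) ** k, (d - k + 1) ** (k - 1),
                        k * (d - k + 1) ** k)                       # MSR point
assert abf(d, k, 2) == (18, 5, 68)  # the mu = 2 point used in FIG. 2
```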


Corollary 2: The proposed code at mode μ=k−1 achieves the cut-set bound, and hence it is optimum.


Proof: The cut-set bound given by






F ≤ Σ_{i=1}^{k} min( α, (d−i+1) β )







reduces to F=(k−1)α+(d−k+1)β in the neighborhood of the MSR point. The latter equation is satisfied by the parameters of proposed code:









(k−1) α(d, k; k−1) + (d−k+1) β(d, k; k−1)

= k α(d, k; k−1) + (d−k) β(d, k; k−1) − [ α(d, k; k−1) − β(d, k; k−1) ]

= Σ_{m=0}^{k−1} k (d−k)^{k−1−m} C(k, m) + Σ_{m=0}^{k−1} (d−k)^{k−m} C(k−1, m−1) − Σ_{m=0}^{k−1} (d−k)^{k−1−m} [ C(k, m) − C(k−1, m−1) ]

= Σ_{m=0}^{k−1} k (d−k)^{k−1−m} C(k, m) + Σ_{m=0}^{k−1} (d−k)^{k−m} C(k−1, m−1) − Σ_{m=0}^{k−1} (d−k)^{k−1−m} C(k−1, m)

= Σ_{m=0}^{k−1} k (d−k)^{k−1−m} C(k, m) + Σ_{m=0}^{k−1} (d−k)^{k−m} C(k−1, m−1) − Σ_{m=1}^{k} (d−k)^{k−m} C(k−1, m−1)

= Σ_{m=0}^{k−1} k (d−k)^{k−1−m} C(k, m) + (d−k)^{k−0} C(k−1, 0−1) − (d−k)^{k−k} C(k−1, k−1)

= Σ_{m=0}^{k−1} k (d−k)^{k−1−m} C(k, m) − 1 = F(d, k; k−1).











Hence, this point satisfies the cut-set bound and it is optimum.


The code construction presented in this disclosure uses signed determinant codes as the main building blocks. The family of determinant codes is a class of codes that achieves the optimum (linear) storage-bandwidth trade-off of regenerating codes when the number of nodes participating in data recovery equals the number of helpers contributing to a repair process, i.e., k=d. They are linear codes, which can be constructed by multiplying an encoder matrix by a message matrix. The modification here (which converts a determinant code to a signed determinant code) is the arbitrary assignment of (+/−) signs to the rows of the message matrix, where each sign affects all the entries in the corresponding row. This modification preserves all properties of determinant codes, while it is helpful towards the next step, which is the construction of (n, k, d) codes.


The following reviews the construction of signed determinant codes and their properties. The family of signed determinant codes for an (n, k=d, d) system consists of d distinct codes, enumerated by a parameter m∈{1, 2, . . . , d}, which is called the mode of the code. For any mode m∈[d], the parameters of the determinant code corresponding to the m-th corner point on the trade-off are given by







(α_m, β_m, F_m) = ( C(d, m), C(d−1, m−1), m C(d+1, m+1) ).





Here m=1 corresponds to an MBR code, while m=d results in the parameters of an MSR code. For a distributed storage system with parameters (n, k=d, d) and corresponding to a mode m∈{1, 2, . . . , d}, the construction provides an exact-repair regenerating code with per-node storage capacity







α^(m) = C(d, m)






and per-node repair-bandwidth







β^(m) = C(d−1, m−1).






This code can store up to







F^(m) = m C(d+1, m+1)







symbols.


The coded symbols are represented by a matrix C_{n×α}, in which the i-th row corresponds to the encoded data to be stored in the i-th node of the DSS. The proposed code is linear, i.e., the encoded matrix C is obtained by multiplying an encoder matrix Ψ_{n×d} and a message matrix. All entries of the encoder matrix and the message matrix are assumed to be from a finite field F, which has at least n distinct elements. Moreover, all the arithmetic operations are performed with respect to the underlying finite field. The structures of the encoder and message matrices are given below.


A signed determinant code with parameters (n, k, d) and mode m is represented by a matrix C_{n×α_m} whose i-th row includes the coded content of the i-th node. In general, C_{n×α_m} is obtained by multiplying an encoder matrix Ψ_{n×d}, whose entries are from a finite field F_q, and a message matrix D_{d×α_m}. The encoder matrix Ψ is chosen such that any collection of d of its rows is linearly independent. Examples of such matrices include Vandermonde and Cauchy matrices.


Encoder Matrix: The matrix Ψn×d is a fixed matrix which is shared among all the nodes in the system. The main property required for matrix Ψ is being Maximum-Distance-Separable (MDS), that is, any d×d sub-matrix of Ψn×d is full-rank. Examples of MDS matrices include Vandermonde or Cauchy matrices. An MDS matrix may be converted to a systematic MDS matrix, by multiplying it by the inverse of its top d×d sub-matrix. The first k nodes may be referred to by systematic nodes if a systematic MDS matrix is used for encoding.
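The conversion of an MDS matrix into a systematic one can be sketched over a prime field (this code is ours, not the patent's; it uses the n=8, d=4, GF(13) parameters of the worked example later in this disclosure):

```python
p = 13  # prime field size, as in the worked example

def inv(a: int) -> int:
    """Modular inverse via Fermat's little theorem."""
    return pow(a, p - 2, p)

def mat_inv(M):
    """Invert a square matrix over GF(p) by Gauss-Jordan elimination."""
    n = len(M)
    A = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    for c in range(n):
        r = next(r for r in range(c, n) if A[r][c] % p)  # pivot search
        A[c], A[r] = A[r], A[c]
        s = inv(A[c][c])
        A[c] = [x * s % p for x in A[c]]
        for r in range(n):
            if r != c and A[r][c]:
                f = A[r][c]
                A[r] = [(x - f * y) % p for x, y in zip(A[r], A[c])]
    return [row[n:] for row in A]

# 8x4 Vandermonde matrix on points i = 1..8, made systematic by multiplying
# from the right with the inverse of its top 4x4 sub-matrix.
vand = [[pow(i, j, p) for j in range(4)] for i in range(1, 9)]
top_inv = mat_inv([row[:] for row in vand[:4]])
psi = [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*top_inv)]
       for row in vand]
print(psi[4])  # → [12, 4, 7, 4]
```

The top four rows come out as the identity, so nodes 1 through 4 are systematic.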


Message Matrix: The message matrix D is filled with raw (source) symbols and parity symbols. Recall that D is a d×α matrix, that has







dα = d C(d, m)







entries, while storing only






F = m C(d+1, m+1)







source symbols. Hence, there are








dα − F = C(d, m+1)






redundant entries in D, which are filled with parity symbols. More precisely, the techniques described herein divide the set of F data symbols into two groups, namely V and W, whose elements are indexed by sets as follows










V = { v_{x,𝒳} : 𝒳 ⊆ [d], |𝒳| = m, x ∈ 𝒳 },

W = { w_{x,𝒴} : 𝒴 ⊆ [d], |𝒴| = m+1, x ∈ 𝒴, ind_𝒴(x) ≤ m }.  (6)







Note that each element of V is indexed by a set 𝒳⊆[d] of size |𝒳|=m and an integer x∈𝒳. Hence,








|V| = m C(d, m).






Similarly, symbols in W are indexed by a pair (x, 𝒴), where 𝒴 is a subset of [d] with m+1 entries, and x can take any value in 𝒴 except the largest one. So, there are








|W| = m C(d, m+1)







symbols in set W. Note that F=|V|+|W|:









|V| + |W| = m C(d, m) + m C(d, m+1) = m C(d+1, m+1) = F_m.







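The indexing of V and W can be enumerated directly; the sketch below (ours) instantiates the (d; m) = (4; 2) case developed in the example later in this disclosure:

```python
from itertools import combinations
from math import comb

d, m = 4, 2  # the (d; m) = (4; 2) example used later in the text

# V: pairs (x, X) with |X| = m and x in X.
V = [(x, X) for X in combinations(range(1, d + 1), m) for x in X]
# W: pairs (x, Y) with |Y| = m + 1 and x any element of Y except its largest
# (combinations yields sorted tuples, so Y[:-1] drops the maximum).
W = [(x, Y) for Y in combinations(range(1, d + 1), m + 1) for x in Y[:-1]]

assert len(V) == m * comb(d, m) == 12
assert len(W) == m * comb(d, m + 1) == 8
assert len(V) + len(W) == m * comb(d + 1, m + 1) == 20  # = F
```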
Moreover, for (x, 𝒴) with ind_𝒴(x)=m+1, compute a parity symbol w_{x,𝒴}, so that the parity equation













Σ_{y∈𝒴} (−1)^{ind_𝒴(y)} w_{y,𝒴} = 0  (7)







holds for any 𝒴⊆[d] with |𝒴|=m+1. In other words, such missing symbols are given by












(−1)^{m+1} w_{max 𝒥, 𝒥} = − Σ_{y ∈ 𝒥\{max 𝒥}} (−1)^{ind_𝒥(y)} w_{y,𝒥}.














Finally, for an arbitrary given signature vector σ_D: [d]→{0, 1}, a plus or minus sign may be assigned to each integer x∈[d], that is, (−1)^{σ_D(x)}. To fill the data matrix D, label its rows by integer numbers from [d] and its columns by subsets 𝒥⊆[d] with |𝒥|=m, according to the lexicographical order. Then the entry at row x and column 𝒥 is given by










D_{x,𝒥} = (−1)^{σ_D(x)} v_{x,𝒥}, if x ∈ 𝒥;

D_{x,𝒥} = (−1)^{σ_D(x)} w_{x,𝒥∪{x}}, if x ∉ 𝒥.  (8A)







For the sake of completeness, define an (n, k=d, d; m=0) determinant code at mode m=0 to be a trivial code with (α=1, β=0, F=0), whose message matrix is a d×1 all-zero matrix.
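The construction of D from (6)-(8A) can be sketched numerically (this code is ours; it fixes the all-plus signature σ_D = 0 and the (d; m) = (4; 2), GF(13) parameters of the later example):

```python
import random
from itertools import combinations

p, d, m = 13, 4, 2
rnd = random.Random(1)
subsets_m = list(combinations(range(1, d + 1), m))
subsets_m1 = list(combinations(range(1, d + 1), m + 1))

def ind(S, x):
    """1-based position of x inside the sorted tuple S."""
    return S.index(x) + 1

v = {(x, X): rnd.randrange(p) for X in subsets_m for x in X}
# Free w-symbols: every (x, Y) with x not the largest element of Y.
w = {(x, Y): rnd.randrange(p) for Y in subsets_m1 for x in Y[:-1]}
# Parity symbol: the largest-indexed w makes the signed sum in (7) vanish.
for Y in subsets_m1:
    w[(Y[-1], Y)] = (-(-1) ** (m + 1) *
                     sum((-1) ** ind(Y, y) * w[(y, Y)] for y in Y[:-1])) % p

# Message matrix per (8A): a v-symbol when x is in the column label, else the
# w-symbol with x adjoined to the label.
D = [[v[(x, J)] if x in J else w[(x, tuple(sorted(J + (x,))))]
      for J in subsets_m] for x in range(1, d + 1)]

for Y in subsets_m1:  # every parity equation (7) must hold
    assert sum((-1) ** ind(Y, y) * w[(y, Y)] for y in Y) % p == 0
print("message matrix built; all parity equations (7) hold")
```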


The second important property of the proposed code is its ability to exactly repair the content of a failed node using the repair data sent by the helper nodes. Let node f∈[n] fail, and let a set of helper nodes ℋ⊆{1, 2, . . . , n}\{f} with |ℋ|=d wish to repair node f. The techniques described herein first determine the repair data sent from each helper node in order to repair node f.


Repair Encoder Matrix at the Helper Nodes: For a determinant code operating in mode m and a failed node f, the repair-encoder matrix Ξf,(m) is defined as a







C(d, m) × C(d, m−1)






matrix, whose rows are labeled by m-element subsets of [d] and whose columns are labeled by (m−1)-element subsets of [d]. The entry in row 𝒳 and column 𝒴 is given by









Ξ^{f,(m)}_{𝒳,𝒴} = (−1)^{ind_𝒳(x)} ψ_{f,x}, if 𝒳\𝒴 = {x}; 0, otherwise,  (8B)







where ψf,x is the entry of the encoder matrix Ψ at position (f, x). An example of the matrix Ξ is given in (29).


In order to repair node f, each helper node h∈ℋ multiplies its content Ψ_h·D by the repair-encoder matrix of node f to obtain Ψ_h·D·Ξ^{f,(m)}, and sends it to node f. Note that matrix Ξ^{f,(m)} has








C(d, m−1)






columns, and hence the length of the repair data Ψh·D·Ξf,(m) is









C(d, m−1),






which is greater than






β = C(d−1, m−1).






However, the following proposition states that out of








C(d, m−1)






columns of matrix Ξf,(m) at most







β^(m) = C(d−1, m−1)






are linearly independent. Thus, the entire vector Ψh·D·Ξf,(m) can be sent by communicating at most β symbols (corresponding to the linearly independent columns of Ξf,(m)) to the failed node, and other symbols can be reconstructed using the linear dependencies among the repair symbols. This is formally stated in the following proposition, which is proved in this disclosure.


Proposition 1: The rank of matrix Ξf,(m) is at most







β^(m) = C(d−1, m−1).





Decoding at the Failed Node: Upon receiving d repair-data vectors {Ψ_h·D·Ξ^{f,(m)}: h∈ℋ}, the failed node stacks them to form a matrix Ψ[ℋ, :]·D·Ξ^{f,(m)}, where Ψ[ℋ, :] is the sub-matrix of Ψ formed by the rows corresponding to nodes h∈ℋ. This matrix is full-rank by the definition of the Ψ matrix. Multiplying by Ψ[ℋ, :]^{−1}, the failed node retrieves

Rf,(m)=D·Ξf,(m)  (8C)


This is a






d × C(d, m−1)






matrix. These






d C(d, m−1)






linear combinations of the data symbols span a linear subspace, which is referred to as the repair space of node f. The following proposition shows that all of the missing symbols of node f can be recovered from its repair space.


Proposition 2: In the (n, k=d, d) proposed codes with parameters








(α^(m), β^(m), F^(m)) = ( C(d, m), C(d−1, m−1), m C(d+1, m+1) ),





for every failed node f∈[n] and set of helpers ℋ⊆[n]\{f} with |ℋ|=d, the content of node f can be exactly regenerated by downloading β symbols from each of the nodes in ℋ. More precisely, the 𝒥-th entry of node f can be recovered using











[Ψ_f · D]_𝒥 = Σ_{x∈𝒥} (−1)^{ind_𝒥(x)} [R^{f,(m)}]_{x, 𝒥\{x}}.  (8D)







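The whole single-failure repair chain, building D per (6)-(8A), forming Ξ^{f,(m)} per (8B), and rebuilding Ψ_f·D via (8D), can be checked end-to-end. This sketch is ours: it assumes the all-plus signature σ_D = 0, the sign (−1)^{ind_𝒳(x)} in (8B), and the (d; m) = (4; 2), GF(13) parameters of the later example.

```python
import random
from itertools import combinations

p, d, m = 13, 4, 2
rng = random.Random(7)
cols_D = list(combinations(range(1, d + 1), m))       # column labels of D
cols_Xi = list(combinations(range(1, d + 1), m - 1))  # column labels of Xi

def ind(S, x):
    """1-based position of x in the sorted tuple S."""
    return S.index(x) + 1

# Fill the free symbols and solve each parity equation (7) for the w-symbol
# whose first index is the largest element of Y.
v = {(x, X): rng.randrange(p) for X in cols_D for x in X}
w = {}
for Y in combinations(range(1, d + 1), m + 1):
    for y in Y[:-1]:
        w[(y, Y)] = rng.randrange(p)
    w[(Y[-1], Y)] = (-(-1) ** (m + 1) *
                     sum((-1) ** ind(Y, y) * w[(y, Y)] for y in Y[:-1])) % p

# Message matrix D per (8A).
D = [[v[(x, J)] if x in J else w[(x, tuple(sorted(J + (x,))))]
      for J in cols_D] for x in range(1, d + 1)]

psi = [rng.randrange(1, p) for _ in range(d)]  # stands in for the row Psi_f

# Repair-encoder matrix per (8B): nonzero only when the row label X is the
# column label Y plus one extra element x.
Xi = [[0] * len(cols_Xi) for _ in cols_D]
for i, X in enumerate(cols_D):
    for j, Y in enumerate(cols_Xi):
        diff = set(X) - set(Y)
        if set(Y) <= set(X) and len(diff) == 1:
            x = diff.pop()
            Xi[i][j] = (-1) ** ind(X, x) * psi[x - 1] % p

# R = D . Xi, then every entry of Psi_f . D is rebuilt via (8D).
R = [[sum(Drow[i] * Xi[i][j] for i in range(len(cols_D))) % p
      for j in range(len(cols_Xi))] for Drow in D]
for jd, J in enumerate(cols_D):
    direct = sum(psi[x - 1] * D[x - 1][jd] for x in range(1, d + 1)) % p
    rebuilt = sum((-1) ** ind(J, x) *
                  R[x - 1][cols_Xi.index(tuple(sorted(set(J) - {x})))]
                  for x in J) % p
    assert direct == rebuilt
print("repair identity (8D) verified for all entries")
```

The check exercises exactly the mechanism of Proposition 2: v-symbols arrive directly, while w-symbols are recovered through the parity relations.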
Remark 1: Note that for a code defined on the Galois field GF(2^s) with characteristic 2, −1=+1, and hence all the positive and negative signs disappear. In particular, the parity equation in (7) may simply reduce to










Σ_{y∈𝒥} w_{y,𝒥} = 0.




Similarly, the non-zero entries of the repair-encoder matrix in (8B) reduce to ψ_{f,x}, and the repair equation in (8D) may be replaced by








[Ψ_f · D]_𝒥 = Σ_{x∈𝒥} [R^{f,(m)}]_{x, 𝒥\{x}}.






The repair mechanism proposed for the multiple failure scenario is similar to that of the single failure case. The techniques described herein consider a set of failed nodes ℰ with e=|ℰ| failures. Each helper node h∈ℋ sends its repair data to all failed nodes simultaneously. Each failed node f∈ℰ can recover the repair data {Ψ_h·D·Ξ^{f,(m)}: h∈ℋ}, and the repair mechanism is similar to that explained in Proposition 2.


A naive approach is to simply concatenate all the required repair data {Ψ_h·D·Ξ^{f,(m)}: f∈ℰ} at the helper node h∈ℋ and send it to the failed nodes. More precisely, for a set of failed nodes ℰ={f_1, f_2, . . . , f_e} and a helper node h∈ℋ, define its repair data as








Ψ_h · D · Ξ^{ℰ,(m)}, where Ξ^{ℰ,(m)} = [ Ξ^{f_1,(m)} Ξ^{f_2,(m)} ⋯ Ξ^{f_e,(m)} ].  (8E)







This is simply a concatenation of the repair data for the individual repair of each f∈ℰ, and the content of each failed node can be exactly reconstructed according to Proposition 2. The repair bandwidth required for the naive concatenation scheme is







e × β_1^(m) = e C(d−1, m−1).






Instead, the techniques described herein show that the bandwidth can be opportunistically utilized by exploiting the intersection between the repair spaces of the different failed nodes. The following proposition shows that the repair data







Ψ_h · D · Ξ^{ℰ,(m)}




can be delivered to the failed nodes by communicating only βe(m) repair symbols.


Proposition 3: Assume that a family of e nodes ℰ={f_1, f_2, . . . , f_e} has failed. Then the rank of the matrix Ξ^{ℰ,(m)} defined in (8E) is at most







C(d, m) − C(d−e, m).





Before presenting the formal proof of the main properties of the proposed code, the techniques described herein illustrate the code construction and the repair mechanism through an example in this section. This example is similar to a previously given one, and may be helpful to understand the notation and the details of the code construction, as well as to provide an intuitive justification for its underlying properties.


Consider a distributed storage system with parameters (n, k, d)=(8, 4, 4) and an operating mode m=2. The parameters of the proposed regenerating code for this point of the trade-off are given by







(α^(2), β^(2), F^(2)) = ( C(4, 2), C(3, 1), 2 C(5, 3) ) = (6, 3, 20).






The techniques described herein first label and partition the information symbols into two groups, V and W, with








|V| = m C(d, m) = 2 C(4, 2) = 12 and |W| = m C(d, m+1) = 2 C(4, 3) = 8.







Note that |V|+|W|=20=F.







V = { v_{1,{1,2}}, v_{2,{1,2}}, v_{1,{1,3}}, v_{3,{1,3}}, v_{1,{1,4}}, v_{4,{1,4}}, v_{2,{2,3}}, v_{3,{2,3}}, v_{2,{2,4}}, v_{4,{2,4}}, v_{3,{3,4}}, v_{4,{3,4}} },

W = { w_{1,{1,2,3}}, w_{2,{1,2,3}}, w_{1,{1,2,4}}, w_{2,{1,2,4}}, w_{1,{1,3,4}}, w_{3,{1,3,4}}, w_{2,{2,3,4}}, w_{3,{2,3,4}} }.






Moreover, for each subset 𝒴⊆[4] with |𝒴|=m+1=3, define parity symbols as









𝒴 = {1, 2, 3}:  w_{3,{1,2,3}} = w_{2,{1,2,3}} − w_{1,{1,2,3}},
𝒴 = {1, 2, 4}:  w_{4,{1,2,4}} = w_{2,{1,2,4}} − w_{1,{1,2,4}},
𝒴 = {1, 3, 4}:  w_{4,{1,3,4}} = w_{3,{1,3,4}} − w_{1,{1,3,4}},
𝒴 = {2, 3, 4}:  w_{4,{2,3,4}} = w_{3,{2,3,4}} − w_{2,{2,3,4}}.  (8F)







Next, the message matrix D may be formed by placing v and w symbols as specified in (6). The resulting message matrix is given by




embedded image


The next step for encoding the data is multiplying D by an encoder matrix Ψ. To this end, choose the finite field F_13 (which has at least n=8 distinct non-zero elements), and pick an 8×4 Vandermonde matrix generated by i=1, 2, 3, 4, 5, 6, 7, 8. The techniques described herein convert this matrix to a systematic MDS matrix by multiplying it from the right by the inverse of its top 4×4 sub-matrix. That is,










Ψ_{8×4} = [Ψ_1; Ψ_2; . . . ; Ψ_8] = [ i^0 i^1 i^2 i^3 ]_{i=1,...,8} · ( [ i^0 i^1 i^2 i^3 ]_{i=1,...,4} )^{−1}

=
[  1   0   0   0
   0   1   0   0
   0   0   1   0
   0   0   0   1
  12   4   7   4
   9   2   6  10
   3  10   7   7
   6   5   7   9 ]  (mod 13).









Note that every k=4 rows of matrix Ψ are linearly independent and form an invertible matrix. Then the content of node i is formed by row i of the matrix product Ψ·D, which is denoted by Ψ_i·D.


Data recovery from the content of any k=4 nodes is immediately implied by the MDS property of the encoder matrix. Next, the repair process for single and multiple failures is described.
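The MDS property can be verified directly. The sketch below (ours) hard-codes the systematic encoder Ψ computed above (mod 13) and checks that every 4×4 sub-matrix has a non-zero determinant mod 13, so any k=4 nodes suffice for data recovery:

```python
from itertools import combinations

p = 13
psi = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
       [12, 4, 7, 4], [9, 2, 6, 10], [3, 10, 7, 7], [6, 5, 7, 9]]

def det_mod(M):
    """Determinant mod p by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0] % p
    return sum((-1) ** j * M[0][j] *
               det_mod([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M))) % p

# Every choice of 4 nodes yields an invertible sub-matrix over GF(13).
for rows in combinations(range(8), 4):
    assert det_mod([psi[r] for r in rows]) != 0
print("all 70 sub-matrices are invertible: any 4 nodes recover the file")
```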


Remark 1: Note that the construction of signed determinant codes decouples the parameters of the code. The encoder matrix only depends on n and d and remains the same for all modes. On the other hand, the message matrix D is fully determined by parameters (d, m), and does not depend on n, the total number of nodes in the system. Thus, the techniques described herein refer to the code defined above as a (d; m) signed determinant code and to matrix D as a (d; m) message matrix.










R^{f,(2)} is a 4×4 matrix whose rows are labeled by x∈{1, 2, 3, 4} and whose columns are labeled by the 1-element subsets {1}, {2}, {3}, {4}:

Column {1}:
  row 1:  v_{1,{1,2}} ψ_{f,2} + v_{1,{1,3}} ψ_{f,3} + v_{1,{1,4}} ψ_{f,4}
  row 2:  v_{2,{1,2}} ψ_{f,2} + w_{2,{1,2,3}} ψ_{f,3} + w_{2,{1,2,4}} ψ_{f,4}
  row 3:  w_{3,{1,2,3}} ψ_{f,2} + v_{3,{1,3}} ψ_{f,3} + w_{3,{1,3,4}} ψ_{f,4}
  row 4:  w_{4,{1,2,4}} ψ_{f,2} + w_{4,{1,3,4}} ψ_{f,3} + v_{4,{1,4}} ψ_{f,4}

Column {2}:
  row 1:  −v_{1,{1,2}} ψ_{f,1} + w_{1,{1,2,3}} ψ_{f,3} + w_{1,{1,2,4}} ψ_{f,4}
  row 2:  −v_{2,{1,2}} ψ_{f,1} + v_{2,{2,3}} ψ_{f,3} + v_{2,{2,4}} ψ_{f,4}
  row 3:  −w_{3,{1,2,3}} ψ_{f,1} + v_{3,{2,3}} ψ_{f,3} + w_{3,{2,3,4}} ψ_{f,4}
  row 4:  −w_{4,{1,2,4}} ψ_{f,1} + w_{4,{2,3,4}} ψ_{f,3} + v_{4,{2,4}} ψ_{f,4}

Column {3}:
  row 1:  −v_{1,{1,3}} ψ_{f,1} − w_{1,{1,2,3}} ψ_{f,2} + w_{1,{1,3,4}} ψ_{f,4}
  row 2:  −w_{2,{1,2,3}} ψ_{f,1} − v_{2,{2,3}} ψ_{f,2} + w_{2,{2,3,4}} ψ_{f,4}
  row 3:  −v_{3,{1,3}} ψ_{f,1} − v_{3,{2,3}} ψ_{f,2} + v_{3,{3,4}} ψ_{f,4}
  row 4:  −w_{4,{1,3,4}} ψ_{f,1} − w_{4,{2,3,4}} ψ_{f,2} + v_{4,{3,4}} ψ_{f,4}

Column {4}:
  row 1:  −v_{1,{1,4}} ψ_{f,1} − w_{1,{1,2,4}} ψ_{f,2} − w_{1,{1,3,4}} ψ_{f,3}
  row 2:  −w_{2,{1,2,4}} ψ_{f,1} − v_{2,{2,4}} ψ_{f,2} − w_{2,{2,3,4}} ψ_{f,3}
  row 3:  −w_{3,{1,3,4}} ψ_{f,1} − w_{3,{2,3,4}} ψ_{f,2} − v_{3,{3,4}} ψ_{f,3}
  row 4:  −v_{4,{1,4}} ψ_{f,1} − v_{4,{2,4}} ψ_{f,2} − v_{4,{3,4}} ψ_{f,3}









First, suppose that a non-systematic node f fails, and the techniques described herein wish to repair it with the help of the systematic nodes ℋ={1, 2, 3, 4}, by downloading β=3 symbols from each. The content of node f is given by Ψ_f·D, which includes α=6 symbols. Note that the content of this node is a row vector whose elements have the same labeling as the columns of D, i.e., all m=2-element subsets of [d]={1, 2, 3, 4}. The symbols of this node are given by:









𝒥 = {1, 2}:  ψ_{f,1} v_{1,{1,2}} + ψ_{f,2} v_{2,{1,2}} + ψ_{f,3} w_{3,{1,2,3}} + ψ_{f,4} w_{4,{1,2,4}},
𝒥 = {1, 3}:  ψ_{f,1} v_{1,{1,3}} + ψ_{f,2} w_{2,{1,2,3}} + ψ_{f,3} v_{3,{1,3}} + ψ_{f,4} w_{4,{1,3,4}},
𝒥 = {1, 4}:  ψ_{f,1} v_{1,{1,4}} + ψ_{f,2} w_{2,{1,2,4}} + ψ_{f,3} w_{3,{1,3,4}} + ψ_{f,4} v_{4,{1,4}},
𝒥 = {2, 3}:  ψ_{f,1} w_{1,{1,2,3}} + ψ_{f,2} v_{2,{2,3}} + ψ_{f,3} v_{3,{2,3}} + ψ_{f,4} w_{4,{2,3,4}},
𝒥 = {2, 4}:  ψ_{f,1} w_{1,{1,2,4}} + ψ_{f,2} v_{2,{2,4}} + ψ_{f,3} w_{3,{2,3,4}} + ψ_{f,4} v_{4,{2,4}},
𝒥 = {3, 4}:  ψ_{f,1} w_{1,{1,3,4}} + ψ_{f,2} w_{2,{2,3,4}} + ψ_{f,3} v_{3,{3,4}} + ψ_{f,4} v_{4,{3,4}}.  (8G)







In the repair procedure using the systematic nodes as helpers, every symbol may be repaired by m nodes. Recall that d helper nodes contribute in the repair process by sending






β = C(d−1, m−1)






symbols each, in order to repair






α = C(d, m)






missing symbols. Hence, the number of repair equations per missing symbol is dβ/α=m, which matches with the proposed repair mechanism.


The m=2 helpers for each missing encoded symbol are those who store a copy of the corresponding v-symbols; e.g., for the symbol indexed by 𝒥={1,2}, which involves v_{1,{1,2}} and v_{2,{1,2}}, the contributing helpers are node 1 (which has a copy of v_{1,{1,2}}) and node 2 (which stores a copy of v_{2,{1,2}}). To this end, node 1 can send ψ_{f,1}v_{1,{1,2}} while node 2 sends ψ_{f,2}v_{2,{1,2}} to perform the repair.


It can be seen that the {1,2}-th missing symbol also has two other terms, depending on w3,{1,2,3} and w4,{1,2,4}, which are stored at nodes 3 and 4, respectively. A naive repair mechanism would require these two nodes to also contribute to this repair procedure, which amounts to a full data recovery in order to repair a single failed node. Alternatively, the techniques described herein can reconstruct these w-symbols using the parity equations and the content of the first two helper nodes. Recall from (5) that

w3,{1,2,3}=−w1,{1,2,3}+w2,{1,2,3},
w4,{1,2,4}=−w1,{1,2,4}+w2,{1,2,4},


where w1,{1,2,3} and w1,{1,2,4} are stored in node 1, and w2,{1,2,3} and w2,{1,2,4} are stored in node 2. Hence, the contents of nodes 1 and 2 are sufficient to reconstruct the {1,2}-th symbol at the failed node f. To this end, node 1 computes A=ψf,1v1,{1,2}−ψf,3w1,{1,2,3}−ψf,4w1,{1,2,4} (a linear combination of its first, fourth, and fifth entries), and sends A to f. Similarly, node 2 sends B=ψf,2v2,{1,2}+ψf,3w2,{1,2,3}+ψf,4w2,{1,2,4} (a linear combination of its first, second, and third coded symbols). Upon receiving these symbols, the {1,2}-th missing symbol of node f can be recovered from
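The repair step above can be simulated numerically. The following Python sketch is an illustration only; the field size p=10007 and the random symbol values are assumptions. It checks that A+B equals the missing {1,2}-th symbol once the parity equations from (5) are imposed:

```python
# Repair of the {1,2}-th symbol of failed node f: node 1 sends A,
# node 2 sends B, and A + B equals the missing symbol thanks to the
# parity equations w3,{1,2,3} = -w1 + w2 and w4,{1,2,4} = -w1 + w2.
import random

random.seed(0)
p = 10007
rnd = lambda: random.randrange(p)

v1_12, v2_12 = rnd(), rnd()            # v1,{1,2}, v2,{1,2}
w1_123, w2_123 = rnd(), rnd()          # w1,{1,2,3}, w2,{1,2,3}
w1_124, w2_124 = rnd(), rnd()          # w1,{1,2,4}, w2,{1,2,4}
psi = [rnd() for _ in range(5)]        # psi[x] stands for psi_{f,x}

# parity equations (5) determine the remaining w-symbols
w3_123 = (-w1_123 + w2_123) % p
w4_124 = (-w1_124 + w2_124) % p

# repair symbols computed locally by helpers 1 and 2
A = (psi[1] * v1_12 - psi[3] * w1_123 - psi[4] * w1_124) % p
B = (psi[2] * v2_12 + psi[3] * w2_123 + psi[4] * w2_124) % p

# the missing {1,2}-th symbol of node f
target = (psi[1] * v1_12 + psi[2] * v2_12
          + psi[3] * w3_123 + psi[4] * w4_124) % p
print((A + B) % p == target)  # True
```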













A+B=(ψf,1v1,{1,2}−ψf,3w1,{1,2,3}−ψf,4w1,{1,2,4})+(ψf,2v2,{1,2}+ψf,3w2,{1,2,3}+ψf,4w2,{1,2,4})
=ψf,1v1,{1,2}+ψf,2v2,{1,2}+ψf,3(w2,{1,2,3}−w1,{1,2,3})+ψf,4(w2,{1,2,4}−w1,{1,2,4})
=ψf,1v1,{1,2}+ψf,2v2,{1,2}+ψf,3w3,{1,2,3}+ψf,4w4,{1,2,4}.  (8H)







In general, v symbols are repaired directly by communicating an identical copy of them, while w symbols are repaired indirectly, using their parity equations. This is the general rule that the techniques described herein use for the repair of all other missing symbols of node f. It can be seen that each helper node participates in the repair of β=3 missing symbols, by sending one repair symbol for each. For instance, node 1 contributes to the repair of the symbols indexed by {1,2}, {1,3}, and {1,4}. The repair equations sent by node 1 for each of these repair scenarios are listed below:


Repair Symbols Sent by Node 1:








𝒥={1,2}: ψf,1v1,{1,2}−ψf,3w1,{1,2,3}−ψf,4w1,{1,2,4},
𝒥={1,3}: ψf,1v1,{1,3}+ψf,2w1,{1,2,3}−ψf,4w1,{1,3,4},
𝒥={1,4}: ψf,1v1,{1,4}+ψf,2w1,{1,2,4}+ψf,3w1,{1,3,4}.  (8I)







Similarly, the repair symbols sent from helper nodes 2, 3, and 4 are given by


Repair Symbols Sent by Node 2:








𝒥={1,2}: ψf,2v2,{1,2}+ψf,3w2,{1,2,3}+ψf,4w2,{1,2,4},
𝒥={2,3}: ψf,1w2,{1,2,3}+ψf,2v2,{2,3}−ψf,4w2,{2,3,4},
𝒥={2,4}: ψf,1w2,{1,2,4}+ψf,2v2,{2,4}+ψf,3w2,{2,3,4},  (8J)







Repair Symbols Sent by Node 3:








𝒥={1,3}: ψf,2w3,{1,2,3}+ψf,3v3,{1,3}+ψf,4w3,{1,3,4},
𝒥={2,3}: −ψf,1w3,{1,2,3}+ψf,3v3,{2,3}+ψf,4w3,{2,3,4},
𝒥={3,4}: ψf,1w3,{1,3,4}+ψf,2w3,{2,3,4}+ψf,3v3,{3,4},  (8K)







Repair Symbols Sent by Node 4:








𝒥={1,4}: ψf,2w4,{1,2,4}+ψf,3w4,{1,3,4}+ψf,4v4,{1,4},
𝒥={2,4}: −ψf,1w4,{1,2,4}+ψf,3w4,{2,3,4}+ψf,4v4,{2,4},
𝒥={3,4}: −ψf,1w4,{1,3,4}−ψf,2w4,{2,3,4}+ψf,4v4,{3,4}.  (8L)







The repair symbols of helper node h∈{1,2,3,4} listed above can be obtained from Ψh·D·Ξf,(2), which is the content of the helper node (i.e., Ψh·D) times the repair-encoder matrix for m=2 (i.e., Ξf,(2)) defined as:







Ξf,(2) =

         {1}      {2}      {3}      {4}
{1,2} [ ψf,2   −ψf,1      0        0    ]
{1,3} [ ψf,3      0     −ψf,1      0    ]
{1,4} [ ψf,4      0       0      −ψf,1  ]
{2,3} [   0     ψf,3    −ψf,2      0    ]
{2,4} [   0     ψf,4      0      −ψf,2  ]
{3,4} [   0       0      ψf,4    −ψf,3  ]

where the rows are labeled by the 2-element subsets {1,2}, {1,3}, {1,4}, {2,3}, {2,4}, {3,4} of [4] and the columns by the 1-element subsets {1}, {2}, {3}, {4}.









Note that, even though this matrix has (4 choose 2−1)=4 columns, and hence Ψh·D·Ξf,(2) is a vector of length 4, it suffices to communicate only β=3 symbols from the helper node to the failed node; the fourth symbol can be reconstructed from the other 3 symbols at the failed node. This is due to the fact that the rank of matrix Ξf,(2) equals β=3. More precisely, a non-zero linear combination of the columns of Ξf,(2) is zero, that is,

Ξf,(2)·[ψf,1ψf,2ψf,3ψf,4]T=0.  (8M)


Therefore, (if ψf,4≠0) the helper node h only sends the first β=3 symbols of the vector Ψh·D·Ξf,(2), namely, [Ψh·D·Ξf,(2)]1, [Ψh·D·Ξf,(2)]2, and [Ψh·D·Ξf,(2)]3, and the fourth symbol [Ψh·D·Ξf,(2)]4 can be appended to them at node f from








[Ψh·D·Ξf,(2)]4=−ψf,4−1·Σi=1,…,3 ψf,i·[Ψh·D·Ξf,(2)]i.
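Relation (8M) can be checked programmatically. The sketch below (assumptions: a prime field GF(p) with p=10007 and random non-zero coefficients ψf,x) builds Ξf,(2) exactly as displayed above and verifies that each of its rows is orthogonal to (ψf,1, …, ψf,4):

```python
# Verify (8M): Xi_{f,(2)} . [psi_{f,1},...,psi_{f,4}]^T = 0 over GF(p),
# so one of the 4 repair symbols is redundant and only beta = 3 are sent.
import random
from itertools import combinations

random.seed(0)
p = 10007
d, m = 4, 2
psi = {x: random.randrange(1, p) for x in range(1, d + 1)}  # psi_{f,x}

rows = list(combinations(range(1, d + 1), m))       # 2-subsets of [d]
cols = list(combinations(range(1, d + 1), m - 1))   # 1-subsets of [d]

def xi_entry(I, J):
    # nonzero only when I = J + {x}; sign is (-1)^{ind_I(x)}
    diff = set(I) - set(J)
    if set(J) <= set(I) and len(diff) == 1:
        x = diff.pop()
        ind = sorted(I).index(x) + 1
        return (-1) ** ind * psi[x] % p
    return 0

Xi = [[xi_entry(I, J) for J in cols] for I in rows]

# every row of Xi is orthogonal to (psi_{f,1},...,psi_{f,4}) mod p
check = [sum(Xi[r][c] * psi[c + 1] for c in range(d)) % p
         for r in range(len(rows))]
print(check)  # [0, 0, 0, 0, 0, 0]
```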








Upon receiving the repair data from the d=4 helper nodes {1,2,3,4}, namely {Ψ1·D·Ξf,(2), Ψ2·D·Ξf,(2), Ψ3·D·Ξf,(2), Ψ4·D·Ξf,(2)}, the failed node can stack them to obtain a matrix











[Ψ1·D·Ξf,(2); Ψ2·D·Ξf,(2); Ψ3·D·Ξf,(2); Ψ4·D·Ξf,(2)] = Ψ[{1,2,3,4},:]·D·Ξf,(2) = D·Ξf,(2),  (8N)







where the last identity is due to the fact that Ψ[{1,2,3,4},:]=I is the identity matrix. The techniques described herein refer to this matrix as the repair space matrix of node f, and denote it by Rf,(2)=D·Ξf,(2).


Using the entries of matrix Rf,(2), the techniques described herein can reconstruct the missing coded symbols of the failed node, as formulated in (9). For the sake of illustration, let us focus on the symbol at position 𝒥={2,4} of node f. Recall that the rows of matrix Rf,(2) are indexed by numbers in [d]={1,2,3,4} and its columns are indexed by subsets of size m−1=1 of [d]. The 𝒥-th symbol of node f can be found from a linear combination (with +1 and −1 coefficients) of the entries of Rf,(2) positioned at row x and column 𝒥\{x} for all x∈𝒥. The coefficient used in this linear combination is determined by the order of the number x in the set 𝒥, e.g., x=2 is the first (smallest) element of 𝒥={2,4}, hence ind{2,4}(2)=1, and the corresponding coefficient may be (−1)ind{2,4}(2)=(−1)1=−1. Putting things together, the techniques described herein obtain
















Σx∈{2,4}(−1)ind{2,4}(x)[Rf,(2)]x,{2,4}\{x} = −[Rf,(2)]2,{4}+[Rf,(2)]4,{2}
=−(−ψf,2v2,{2,4}−ψf,1w2,{1,2,4}−ψf,3w2,{2,3,4})+(ψf,4v4,{2,4}−ψf,1w4,{1,2,4}+ψf,3w4,{2,3,4})
=ψf,2v2,{2,4}+ψf,4v4,{2,4}+ψf,1(w2,{1,2,4}−w4,{1,2,4})+ψf,3(w2,{2,3,4}+w4,{2,3,4})
=ψf,2v2,{2,4}+ψf,4v4,{2,4}+ψf,1w1,{1,2,4}+ψf,3w3,{2,3,4}
=[Ψf·D]{2,4},  (8O)







where in (8O) the techniques described herein used the parity equations defined in (5). A general repair scenario with an arbitrary (not necessarily systematic) set of helper nodes ℋ with |ℋ|=d=4 is very similar to the repair from the systematic nodes, explained above.


Each helper node h∈ℋ computes its repair data by multiplying its content by the repair-encoder matrix of the failed node f, and sends it to the failed node. The failed node collects {Ψh·D·Ξf,(2): h∈ℋ} and stacks them to form the matrix Ψ[ℋ,:]·D·Ξf,(2), where Ψ[ℋ,:] is the sub-matrix of Ψ obtained by stacking the rows indexed by {h: h∈ℋ}. The main difference compared to the systematic-helpers case is that, unlike in (8N), Ψ[ℋ,:] is not an identity matrix in general. However, since Ψ[ℋ,:] is an invertible matrix, the techniques described herein can compute Rf,(2)=D·Ξf,(2) as

Rf,(2)=Ψ[ℋ,:]−1·(Ψ[ℋ,:]·D·Ξf,(2))=D·Ξf,(2).


Once Rf,(2) is computed at node f, the rest of the process is identical to the repair from systematic helper nodes.
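The single-failure repair pipeline can be illustrated end to end. The sketch below is an assumption-laden illustration, not the patent's implementation: it draws a random encoder Ψ and a random message matrix D with the v/w structure and parity equations of the (d=4, m=2) code over an assumed prime field GF(p), has four arbitrary helpers send Ψh·D·Ξf,(2), inverts Ψ[ℋ,:] to recover the repair space, and rebuilds the failed row via the sign rule of (11):

```python
# End-to-end exact repair of one node in a (d=4, m=2) determinant code.
import random
from itertools import combinations

random.seed(0)
p = 10007
d, n = 4, 6
subsets2 = list(combinations(range(1, d + 1), 2))   # column labels of D
subsets1 = list(combinations(range(1, d + 1), 1))

# --- message matrix D (d x C(d,2)) with v/w structure and parities ---
v = {(x, J): random.randrange(p) for J in subsets2 for x in J}
w = {}
for S in combinations(range(1, d + 1), 3):          # 3-subsets, a < b < c
    a, b, c = S
    w[(a, S)], w[(b, S)] = random.randrange(p), random.randrange(p)
    w[(c, S)] = (w[(b, S)] - w[(a, S)]) % p         # parity: -wa + wb - wc = 0
D = [[v[(x, J)] if x in J else w[(x, tuple(sorted(J + (x,))))]
      for J in subsets2] for x in range(1, d + 1)]

Psi = [[random.randrange(p) for _ in range(d)] for _ in range(n)]
f, helpers = 4, [0, 1, 2, 5]                        # failed node, d helpers

def xi(I, J):                                       # repair-encoder entry
    diff = set(I) - set(J)
    if set(J) <= set(I) and len(diff) == 1:
        x = diff.pop()
        return (-1) ** (sorted(I).index(x) + 1) * Psi[f][x - 1] % p
    return 0

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)]
            for row in A]

def inverse(A):                                     # Gauss-Jordan over GF(p)
    k = len(A)
    M = [row[:] + [int(i == j) for j in range(k)] for i, row in enumerate(A)]
    for i in range(k):
        piv = next(r for r in range(i, k) if M[r][i])
        M[i], M[piv] = M[piv], M[i]
        s = pow(M[i][i], p - 2, p)
        M[i] = [x * s % p for x in M[i]]
        for r in range(k):
            if r != i and M[r][i]:
                M[r] = [(u - M[r][i] * t) % p for u, t in zip(M[r], M[i])]
    return [row[k:] for row in M]

Xi = [[xi(I, J) for J in subsets1] for I in subsets2]
PsiH = [Psi[h] for h in helpers]
received = matmul(PsiH, matmul(D, Xi))              # what the helpers send
R = matmul(inverse(PsiH), received)                 # R_f = D . Xi_{f,(2)}

# rebuild each missing symbol via the (+1/-1) sign rule
rebuilt = []
for J in subsets2:
    s = 0
    for x in J:
        col = subsets1.index((next(y for y in J if y != x),))
        s += (-1) ** (sorted(J).index(x) + 1) * R[x - 1][col]
    rebuilt.append(s % p)

original = [sum(Psi[f][i] * D[i][c] for i in range(d)) % p
            for c in range(len(subsets2))]
print(rebuilt == original)  # True
```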










Symbol 1 = −(ψf1,2/ψf1,1)×Symbol 2 − (ψf1,3/ψf1,1)×Symbol 3 − (ψf1,4/ψf1,1)×Symbol 4,  (8P)

Symbol 5 = −(ψf2,2/ψf2,1)×Symbol 6 − (ψf2,3/ψf2,1)×Symbol 7 − (ψf2,4/ψf2,1)×Symbol 8,  (8Q)

Symbol 8 = [ψf2,1(ψf1,1ψf2,2−ψf1,2ψf2,1)/(ψf1,1(ψf1,1ψf2,4−ψf1,4ψf2,1))]×Symbol 2 + [ψf2,1(ψf1,1ψf2,3−ψf1,3ψf2,1)/(ψf1,1(ψf1,1ψf2,4−ψf1,4ψf2,1))]×Symbol 3 + (ψf2,1/ψf1,1)×Symbol 4 + [(ψf1,1ψf2,2−ψf1,2ψf2,1)/(ψf1,4ψf2,1−ψf1,1ψf2,4)]×Symbol 6 + [(ψf1,1ψf2,3−ψf1,3ψf2,1)/(ψf1,4ψf2,1−ψf1,1ψf2,4)]×Symbol 7.  (8R)







Now, assume two non-systematic nodes ℰ={f1, f2} have simultaneously failed, and the goal is to reconstruct the missing data on f1 and f2 using the systematic nodes, i.e., the helper set is ℋ={1,2,3,4}. A naive approach is to repeat the repair scenario for f1 and f2. Such a separation-based scheme requires downloading 2β=6 (coded) symbols from each helper node. Alternatively, the techniques described herein show that the repair of nodes f1 and f2 can be performed by downloading only β2=5 symbols from each helper.


The techniques described herein start with Ξℰ,(2), which is basically the concatenation of Ξf1,(2) and Ξf2,(2). This matrix is given in (22) below; it is a 6×8 matrix. However, the claim of Proposition 3 implies that the rank of this matrix is at most 5. To show this claim, the techniques described herein define the non-zero vector Yℰ,(2) in (23) below and show that this vector is in the left null-space of Ξℰ,(2). The general construction of the null-space is presented in the proof of Proposition 3.











Ξℰ,(2) = [Ξf1,(2) | Ξf2,(2)] =

         {1}       {2}       {3}       {4}     |   {1}       {2}       {3}       {4}
{1,2} [ ψf1,2   −ψf1,1      0         0        |  ψf2,2   −ψf2,1      0         0      ]
{1,3} [ ψf1,3      0      −ψf1,1      0        |  ψf2,3      0      −ψf2,1      0      ]
{1,4} [ ψf1,4      0         0      −ψf1,1     |  ψf2,4      0         0      −ψf2,1   ]
{2,3} [   0      ψf1,3    −ψf1,2      0        |    0      ψf2,3    −ψf2,2      0      ]
{2,4} [   0      ψf1,4       0      −ψf1,2     |    0      ψf2,4       0      −ψf2,2   ]
{3,4} [   0         0      ψf1,4    −ψf1,3     |    0         0      ψf2,4    −ψf2,3   ]   (22)















Yℰ,(2) = [ −det[ψf1,3, ψf1,4; ψf2,3, ψf2,4], det[ψf1,2, ψf1,4; ψf2,2, ψf2,4], −det[ψf1,2, ψf1,3; ψf2,2, ψf2,3], −det[ψf1,1, ψf1,4; ψf2,1, ψf2,4], det[ψf1,1, ψf1,3; ψf2,1, ψf2,3], −det[ψf1,1, ψf1,2; ψf2,1, ψf2,2] ]
= [ ψf1,4ψf2,3−ψf1,3ψf2,4, ψf1,2ψf2,4−ψf1,4ψf2,2, ψf1,3ψf2,2−ψf1,2ψf2,3, ψf1,4ψf2,1−ψf1,1ψf2,4, ψf1,1ψf2,3−ψf1,3ψf2,1, ψf1,2ψf2,1−ψf1,1ψf2,2 ],  (23)

where the entries are indexed by the row labels {1,2}, {1,3}, {1,4}, {2,3}, {2,4}, {3,4} of Ξℰ,(2).







First note that Yℰ,(2) is not an all-zero vector, since otherwise









ψf1,1/ψf2,1 = ψf1,2/ψf2,2 = ψf1,3/ψf2,3 = ψf1,4/ψf2,4,




which implies rows Ψf1 and Ψf2 of the encoder matrix are linearly dependent. This is in contradiction with the fact that every d=4 rows of Ψ are linearly independent. Hence, without loss of generality,












ψf1,1ψf2,4−ψf1,4ψf2,1≠0.




In order to prove Yℰ,(2)·Ξℰ,(2)=0, the techniques described herein show that the vector Yℰ,(2) is orthogonal to each column of Ξℰ,(2). For instance, consider the seventh column of Ξℰ,(2), labeled by {3} in segment Ξf2,(2). The inner product of Yℰ,(2) and this column is given by










−ψf2,1·det[ψf1,2, ψf1,4; ψf2,2, ψf2,4] + ψf2,2·det[ψf1,1, ψf1,4; ψf2,1, ψf2,4] − ψf2,4·det[ψf1,1, ψf1,2; ψf2,1, ψf2,2]
= −det[ψf2,1, ψf2,2, ψf2,4; ψf1,1, ψf1,2, ψf1,4; ψf2,1, ψf2,2, ψf2,4] = 0,




where the first equality follows from the Laplace expansion of the determinant with respect to the first row, and the second equality is due to the fact that the first and third rows of the matrix are identical, and hence it is rank-deficient. The existence of a non-zero vector in the left null-space of Ξℰ,(2) implies that its rank is at most β2(2)=5.
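The left null-space claim can be confirmed numerically. In the sketch below (random rows Ψf1, Ψf2 over an assumed prime field GF(p)), the vector Yℰ,(2) of signed 2×2 minors from (23) is checked to be orthogonal to all eight columns of Ξℰ,(2):

```python
# Verify Y_{E,(2)} . Xi_{E,(2)} = 0 over GF(p): the vector of signed 2x2
# minors of [Psi_f1; Psi_f2] lies in the left null-space of the stacked
# repair-encoder matrix, so its rank is at most 5.
import random
from itertools import combinations

random.seed(1)
p = 10007
d = 4
psi1 = [random.randrange(1, p) for _ in range(d)]   # row Psi_{f1}
psi2 = [random.randrange(1, p) for _ in range(d)]   # row Psi_{f2}

rows = list(combinations(range(d), 2))              # 0-indexed 2-subsets

def xi(psi):
    # row {a,b}: +psi_b in column {a} (ind=2), -psi_a in column {b} (ind=1)
    M = []
    for (a, b) in rows:
        r = [0] * d
        r[a] = psi[b] % p
        r[b] = -psi[a] % p
        M.append(r)
    return M

Xi = [r1 + r2 for r1, r2 in zip(xi(psi1), xi(psi2))]   # 6 x 8 matrix

def minor(i, j):            # 2x2 determinant on columns i, j
    return (psi1[i] * psi2[j] - psi1[j] * psi2[i]) % p

# sign pattern of (23), indexed by the row labels of Xi
sign = {(0, 1): -1, (0, 2): 1, (0, 3): -1, (1, 2): -1, (1, 3): 1, (2, 3): -1}
Y = []
for I in rows:
    comp = tuple(x for x in range(d) if x not in I)    # complement subset
    Y.append(sign[I] * minor(*comp) % p)

check = [sum(Y[r] * Xi[r][c] for r in range(len(rows))) % p
         for c in range(2 * d)]
print(check)  # [0, 0, 0, 0, 0, 0, 0, 0]
```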


Now, assume h=1 is one of the helper nodes. Without loss of generality, the techniques described herein assume ψf1,1≠0 and ψf2,1≠0. The repair data sent by node h=1 consists of the following 8 symbols:









Symbol 1: ψf1,2v1,{1,2}+ψf1,3v1,{1,3}+ψf1,4v1,{1,4},
Symbol 2: −ψf1,1v1,{1,2}+ψf1,3w1,{1,2,3}+ψf1,4w1,{1,2,4},
Symbol 3: −ψf1,1v1,{1,3}−ψf1,2w1,{1,2,3}+ψf1,4w1,{1,3,4},
Symbol 4: −ψf1,1v1,{1,4}−ψf1,2w1,{1,2,4}−ψf1,3w1,{1,3,4},
Symbol 5: ψf2,2v1,{1,2}+ψf2,3v1,{1,3}+ψf2,4v1,{1,4},
Symbol 6: −ψf2,1v1,{1,2}+ψf2,3w1,{1,2,3}+ψf2,4w1,{1,2,4},
Symbol 7: −ψf2,1v1,{1,3}−ψf2,2w1,{1,2,3}+ψf2,4w1,{1,3,4},
Symbol 8: −ψf2,1v1,{1,4}−ψf2,2w1,{1,2,4}−ψf2,3w1,{1,3,4}.  (8S)







However, Symbol 1, Symbol 5, and Symbol 8 are redundant, and can be reconstructed as linear combinations of the remaining five symbols. It is worth noting that the first and second equations above, (8P) and (8Q), indicate the dependencies among the symbols sent for the repair of f1 and f2, respectively. The third equation, (8R), however, shows an additional dependency across the repair symbols of f1 and those of f2. This implies that it suffices for helper node 1 to send Symbols 2, 3, 4, 6, and 7 in order to repair the two nodes f1 and f2 simultaneously.
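The dependency (8R) can also be verified numerically. The following sketch (random coefficients over an assumed prime field GF(p); the short names a1..a4 and b1..b4 stand for ψf1,x and ψf2,x and are not from the patent) reconstructs Symbol 8 from Symbols 2, 3, 4, 6, and 7:

```python
# Check (8R): Symbol 8 is a linear combination of Symbols 2, 3, 4, 6, 7.
import random

random.seed(2)
p = 10007
inv = lambda a: pow(a % p, p - 2, p)   # modular inverse in GF(p)

a1, a2, a3, a4 = (random.randrange(1, p) for _ in range(4))  # psi_{f1,x}
b1, b2, b3, b4 = (random.randrange(1, p) for _ in range(4))  # psi_{f2,x}
v12, v13, v14 = (random.randrange(p) for _ in range(3))      # v1,{1,x}
w123, w124, w134 = (random.randrange(p) for _ in range(3))   # w1,{...}

S2 = (-a1 * v12 + a3 * w123 + a4 * w124) % p
S3 = (-a1 * v13 - a2 * w123 + a4 * w134) % p
S4 = (-a1 * v14 - a2 * w124 - a3 * w134) % p
S6 = (-b1 * v12 + b3 * w123 + b4 * w124) % p
S7 = (-b1 * v13 - b2 * w123 + b4 * w134) % p
S8 = (-b1 * v14 - b2 * w124 - b3 * w134) % p

den = (a1 * b4 - a4 * b1) % p      # psi_{f1,1}psi_{f2,4} - psi_{f1,4}psi_{f2,1}
c2 = b1 * (a1 * b2 - a2 * b1) % p * inv(a1 * den) % p
c3 = b1 * (a1 * b3 - a3 * b1) % p * inv(a1 * den) % p
c4 = b1 * inv(a1) % p
c6 = (a1 * b2 - a2 * b1) % p * inv(-den) % p
c7 = (a1 * b3 - a3 * b1) % p * inv(-den) % p

combo = (c2 * S2 + c3 * S3 + c4 * S4 + c6 * S6 + c7 * S7) % p
print(combo == S8)  # True
```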


Next, the techniques described herein review data recovery and exact repair properties for signed determinant codes.


Proposition 1: In a (d; m) signed determinant code, all the data symbols can be recovered from the content of any k=d nodes.


The proof of this proposition is similar to that of Proposition 1, and hence omitted for the sake of brevity.


Consider the repair process of a failed node f∈[n] from an arbitrary set ℋ of |ℋ|=d helpers. The repair-encoder matrix for a (d; m) signed determinant code is defined below.


Definition 1: For a (d; m) signed determinant code with signature σ, and a failed node f∈[n], the repair-encoder matrix Ξf,(m) is defined as








a (d choose m) × (d choose m−1) matrix, whose rows are labeled by m-element subsets of [d] and whose columns are labeled by (m−1)-element subsets of [d]. The entry in row ℐ and column 𝒥 of this matrix is given by









Ξℐ,𝒥f,(m) = (−1)σ(x)+indℐ(x)·ψf,x if ℐ\𝒥={x}, and 0 otherwise.  (9)







Here σ (x)'s are the same signature used in (8).


In order to repair node f, each helper node h∈ℋ multiplies its content Ψh·D by the repair-encoder matrix of node f to obtain Ψh·D·Ξf,(m), and sends the result to node f. Upon receiving the d vectors {Ψh·D·Ξf,(m): h∈ℋ} and stacking them into a matrix, the failed node obtains Ψ[ℋ,:]·D·Ξf,(m), where Ψ[ℋ,:] is a full-rank sub-matrix of Ψ, obtained from stacking the rows indexed by h∈ℋ. Multiplying by Ψ[ℋ,:]−1, the failed node retrieves

Rf(D)=D·Ξf,(m),  (10)


which is called the repair space of node f. All the coded symbols in node f can be recovered from its repair space as described in the following proposition.


Proposition 2: The coded symbol in column 𝒥 of node f can be recovered from











[Ψf·D]𝒥 = Σx∈𝒥 (−1)σD(x)+ind𝒥(x)·[Rf(D)]x,𝒥\{x}.  (11)







This proposition is very similar to Proposition 2. However, due to the modifications introduced here, the techniques described herein present the proof of the current proposition below.


The required repair-bandwidth of this repair scheme is given below.


Proposition 3: The matrix Ξf,(m) defined in (9) has rank βm=(d−1 choose m−1).

Therefore, even though the number of entries in the vector Ψh·D·Ξf,(m) is (d choose m−1), it can be fully sent to the failed node by communicating only βm=(d−1 choose m−1) symbols of the underlying field.
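Proposition 3 can be spot-checked for small parameters. The sketch below (all-zero signature assumed, random ψf,x over an assumed prime field GF(p), and a small Gaussian-elimination rank routine written only for this illustration) builds Ξf,(m) from definition (9) and compares its rank with (d−1 choose m−1):

```python
# Spot-check Proposition 3: rank(Xi_{f,(m)}) = C(d-1, m-1) over GF(p).
import random
from itertools import combinations
from math import comb

random.seed(3)
p = 10007

def xi_matrix(d, m, psi):
    rows = list(combinations(range(1, d + 1), m))
    cols = list(combinations(range(1, d + 1), m - 1))
    M = []
    for I in rows:
        r = []
        for J in cols:
            diff = set(I) - set(J)
            if set(J) <= set(I) and len(diff) == 1:
                x = diff.pop()
                r.append((-1) ** (sorted(I).index(x) + 1) * psi[x] % p)
            else:
                r.append(0)
        M.append(r)
    return M

def rank_mod_p(M):
    # plain Gauss-Jordan elimination over GF(p)
    M = [row[:] for row in M]
    rank = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(rank, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        s = pow(M[rank][c], p - 2, p)
        M[rank] = [x * s % p for x in M[rank]]
        for i in range(len(M)):
            if i != rank and M[i][c]:
                M[i] = [(u - M[i][c] * t) % p for u, t in zip(M[i], M[rank])]
        rank += 1
    return rank

ranks = {}
for d, m in [(4, 2), (5, 2), (4, 3)]:
    psi = {x: random.randrange(1, p) for x in range(1, d + 1)}
    ranks[(d, m)] = (rank_mod_p(xi_matrix(d, m, psi)), comb(d - 1, m - 1))
print(ranks)
```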


Remark 2: Note that a signed determinant code can be defined over any Galois field. In particular, for a code designed over GF(2S) with characteristic 2, the techniques described herein have −1=+1, and hence all the positive and negative signs disappear. Especially, the signs in (8) can be removed, and the parity equation in (7) may simply reduce to the sum of the corresponding w-symbols being zero. Also, the non-zero entries of the repair-encoder matrix in (9) may be ψf,x, and the repair equation in (11) may be replaced by








[Ψf·D]𝒥 = Σx∈𝒥 [Rf(D)]x,𝒥\{x}.






In this section, the techniques described herein present a simple example, through which the techniques described herein explain the core idea of this code construction. More precisely, the techniques described herein show how cascading several signed determinant codes allows us to break the constraint k=d of the signed determinant codes.


The techniques described herein consider a code with parameters (n=6, k=3, d=4) corresponding to mode μ=3. Therefore, from (3) the parameters of this code are given by (α, β, F)=(8,4,24). To construct such a code, the techniques described herein start with an (n′=n=6, k′=d=4, d′=d=4) signed determinant code of mode m=μ=3 with an all-zero signature. The message matrix for this code is given by














P =

      {1,2,3}            {1,2,4}            {1,3,4}            {2,3,4}
1 [ v1,{1,2,3}<0>     v1,{1,2,4}<0>     v1,{1,3,4}<0>     w1,{1,2,3,4}<0> ]
2 [ v2,{1,2,3}<0>     v2,{1,2,4}<0>     w2,{1,2,3,4}<0>   v2,{2,3,4}<0>   ]
3 [ v3,{1,2,3}<0>     w3,{1,2,3,4}<0>   v3,{1,3,4}<0>     v3,{2,3,4}<0>   ]
4 [ w4,{1,2,3,4}<0>   v4,{1,2,4}<0>     v4,{1,3,4}<0>     v4,{2,3,4}<0>   ]







Here, the superscript <0> is used to distinguish between the entries of this matrix and another message matrix which may be introduced in the following.


Let us consider a semi-systematic encoder matrix Ψ with n=n′=6 rows. Existence of such semi-systematic encoders with the desired properties is shown.







Ψ6×4 = [ 1, 0, 0, 0;
         0, 1, 0, 0;
         0, 0, 1, 0;
         ψ4,1, ψ4,2, ψ4,3, ψ4,4;
         ψ5,1, ψ5,2, ψ5,3, ψ5,4;
         ψ6,1, ψ6,2, ψ6,3, ψ6,4 ].





The repair space of this code is Rf(P)=P·Ξf,(3), as defined in (10).





It is easy to see that the content of the failed node can be recovered as linear combinations of the entries of Rf(P), since d=d′=4. For instance, the entry 𝒥={1,3,4} of the P-segment of a failed node f is repaired using equation (11), that is,











[Ψf·P]{1,3,4} = Σx∈{1,3,4}(−1)ind{1,3,4}(x)[Rf,(3)(P)]x,{1,3,4}\{x}
= −(−v1,{1,3,4}<0>ψf,1−w1,{1,2,3,4}<0>ψf,2)+(v3,{1,3,4}<0>ψf,3+w3,{1,2,3,4}<0>ψf,2)−(w4,{1,2,3,4}<0>ψf,2−v4,{1,3,4}<0>ψf,4)
= v1,{1,3,4}<0>ψf,1+v3,{1,3,4}<0>ψf,3+v4,{1,3,4}<0>ψf,4+(w1,{1,2,3,4}<0>+w3,{1,2,3,4}<0>−w4,{1,2,3,4}<0>)ψf,2
= v1,{1,3,4}<0>ψf,1+v3,{1,3,4}<0>ψf,3+v4,{1,3,4}<0>ψf,4+w2,{1,2,3,4}<0>ψf,2
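This derivation can be checked with concrete numbers. The sketch below (random symbols over an assumed prime field GF(p); the parity w4=w1−w2+w3 is imposed as in (7)) reproduces the lines above:

```python
# Numerical check of the repair of the {1,3,4}-th entry of P via (11).
import random

random.seed(5)
p = 10007
psi = [random.randrange(p) for _ in range(5)]        # psi[x] ~ psi_{f,x}

v1, v3, v4 = (random.randrange(p) for _ in range(3)) # v_{x,{1,3,4}}
w1, w2, w3 = (random.randrange(p) for _ in range(3)) # w_{x,{1,2,3,4}}
w4 = (w1 - w2 + w3) % p                              # parity equation (7)

# repair-space entries appearing in the derivation
R1_34 = (-v1 * psi[1] - w1 * psi[2]) % p
R3_14 = (v3 * psi[3] + w3 * psi[2]) % p
R4_13 = (w4 * psi[2] - v4 * psi[4]) % p

recovered = (-R1_34 + R3_14 - R4_13) % p
target = (v1 * psi[1] + v3 * psi[3] + v4 * psi[4] + w2 * psi[2]) % p
print(recovered == target)  # True
```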











Moreover, the code generated by P maintains data recovery from any k′=4 nodes. The question is what happens if the data collector can only access k=3 nodes.


Let us consider the data recovery from the first k=3 systematic nodes (instead of k′=4 nodes). The content of systematic node i is identical to the corresponding row i of P, for i=1,2,3. Therefore, a data collector accessing the nodes in {1,2,3} cannot recover the four symbols in the last row of P. Note that symbol w4,{1,2,3,4}<0> is a parity symbol, and can still be recovered from the observed symbols w1,{1,2,3,4}<0>, w2,{1,2,3,4}<0>, w3,{1,2,3,4}<0>, through the parity equation in (7):

w4,{1,2,3,4}<0>=w1,{1,2,3,4}<0>−w2,{1,2,3,4}<0>+w3,{1,2,3,4}<0>.


However, the other three symbols v4,{1,2,4}<0>, v4,{1,3,4}<0>, v4,{2,3,4}<0> cannot be recovered by the data collector.


In order to recover these missing symbols, the techniques described herein use another signed determinant code (with a lower mode), and use its parity entries to keep a backup copy of each of the missing symbols. The process of copying the missing symbols into the parity entries of another message matrix is called injection throughout this disclosure. Injection simply means adding a symbol to an entry of the message matrix that contains a parity symbol. The techniques described herein refer to the message matrix to which the injection happens as the child code of the injection. Similarly, the message matrix whose symbols are injected is called the parent code of the injection. In this example, the child code is of mode m=1 with an all-zero signature σQ=(0,0,0,0), and its message matrix is given by














Q =

      {1}             {2}             {3}             {4}
1 [ v1,{1}<1>      w1,{1,2}<1>     w1,{1,3}<1>     w1,{1,4}<1> ]
2 [ w2,{1,2}<1>    v2,{2}<1>       w2,{2,3}<1>     w2,{2,4}<1> ]
3 [ w3,{1,3}<1>    w3,{2,3}<1>     v3,{3}<1>       w3,{3,4}<1> ]
4 [ w4,{1,4}<1>    w4,{2,4}<1>     w4,{3,4}<1>     v4,{4}<1>   ]







Note that there are three redundant symbols in the top three rows of Q, since the parity equation in (7) implies

w2,{1,2}<1>=w1,{1,2}<1>,
w3,{1,3}<1>=w1,{1,3}<1>,
w3,{2,3}<1>=w2,{2,3}<1>.


Hence, the techniques described herein can use these redundant symbols to back up the missing symbols of P. For simplicity, the modified version of the message matrix is still denoted by Q. Concatenating P and Q results in a super-message matrix:














M = [P | Q] =

row 1: v1,{1,2,3}<0>  v1,{1,2,4}<0>  v1,{1,3,4}<0>  w1,{1,2,3,4}<0>  |  v1,{1}<1>  w1,{1,2}<1>  w1,{1,3}<1>  w1,{1,4}<1>
row 2: v2,{1,2,3}<0>  v2,{1,2,4}<0>  w2,{1,2,3,4}<0>  v2,{2,3,4}<0>  |  w2,{1,2}<1>+A  v2,{2}<1>  w2,{2,3}<1>  w2,{2,4}<1>
row 3: v3,{1,2,3}<0>  w3,{1,2,3,4}<0>  v3,{1,3,4}<0>  v3,{2,3,4}<0>  |  w3,{1,3}<1>+B  w3,{2,3}<1>+C  v3,{3}<1>  w3,{3,4}<1>
row 4: w4,{1,2,3,4}<0>  v4,{1,2,4}<0>  v4,{1,3,4}<0>  v4,{2,3,4}<0>  |  w4,{1,4}<1>  w4,{2,4}<1>  w4,{3,4}<1>  v4,{4}<1>

with columns labeled {1,2,3}, {1,2,4}, {1,3,4}, {2,3,4}, {1}, {2}, {3}, {4},








where A, B, and C may be determined such that the missing symbols v4,{1,2,4}<0>, v4,{1,3,4}<0>, v4,{2,3,4}<0> can be fully retrieved from (A, B, C). Note that with this modification, the recovery problem (from the first k=3 nodes) may be resolved.


Now, consider a failed node f. Note that before the injection, each entry of Ψf·Q could be recovered from the repair space Rf(Q)=Q·Ξf,(1), given by

Rf(Q) =
row 1: −v1,{1}<1>ψf,1 − w1,{1,2}<1>ψf,2 − w1,{1,3}<1>ψf,3 − w1,{1,4}<1>ψf,4,
row 2: −w2,{1,2}<1>ψf,1 − v2,{2}<1>ψf,2 − w2,{2,3}<1>ψf,3 − w2,{2,4}<1>ψf,4,
row 3: −w3,{1,3}<1>ψf,1 − w3,{2,3}<1>ψf,2 − v3,{3}<1>ψf,3 − w3,{3,4}<1>ψf,4,
row 4: −w4,{1,4}<1>ψf,1 − w4,{2,4}<1>ψf,2 − w4,{3,4}<1>ψf,3 − v4,{4}<1>ψf,4.

It is easy to verify that, using Equation (11), the repair of entry 𝒥={x} of the child code is given by

[Ψf·Q]𝒥 = Σx∈𝒥 (−1)ind𝒥(x)[Rf(Q)]x,𝒥\{x} = −[Rf(Q)]x,∅.  (12)
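Equation (12) hinges on the mode-1 message matrix being symmetric once the parities w x,{x,y}=w y,{x,y} are imposed. The sketch below (random values over an assumed prime field GF(p)) verifies it for the unmodified child code:

```python
# Check (12) for the mode-1 child code: with parities
# w_{x,{x,y}} = w_{y,{x,y}}, Q is symmetric and each entry of the failed
# row Psi_f . Q equals minus the matching entry of R_f(Q) = Q . Xi_{f,(1)}.
import random

random.seed(4)
p = 10007
d = 4
psi = [random.randrange(p) for _ in range(d)]       # row Psi_f

Q = [[0] * d for _ in range(d)]
for x in range(d):
    Q[x][x] = random.randrange(p)                   # v_{x,{x}}
    for y in range(x + 1, d):
        Q[x][y] = Q[y][x] = random.randrange(p)     # w_{x,{x,y}} = w_{y,{x,y}}

# R_f(Q) = Q . Xi_{f,(1)}, where Xi_{f,(1)} is the column (-psi_{f,x})
R = [-sum(Q[x][y] * psi[y] for y in range(d)) % p for x in range(d)]
failed_row = [sum(psi[y] * Q[y][x] for y in range(d)) % p for x in range(d)]
print(all(failed_row[x] == -R[x] % p for x in range(d)))  # True
```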







However, after this modification, the child code Q is not a signed determinant code anymore, and the exact repair property may fail. The four coded symbols corresponding to the Q segment of a failed node f are given by

{1}:  ψf,1 v1,{1}<1> + ψf,2 (w2,{1,2}<1> + A) + ψf,3 (w3,{1,3}<1> + B) + ψf,4 w4,{1,4}<1>
{2}:  ψf,2 v2,{2}<1> + ψf,1 w1,{1,2}<1> + ψf,3 (w3,{2,3}<1> + C) + ψf,4 w4,{2,4}<1>
{3}:  ψf,3 v3,{3}<1> + ψf,1 w1,{1,3}<1> + ψf,2 w2,{2,3}<1> + ψf,4 w4,{3,4}<1>
{4}:  ψf,4 v4,{4}<1> + ψf,1 w1,{1,4}<1> + ψf,2 w2,{2,4}<1> + ψf,3 w3,{3,4}<1>.

On the other hand, the repair space of the failed node for the modified code Q, i.e., Rf(Q)=Q·Ξf,(1), is given by

          [ −v1,{1}<1>ψf,1 − w1,{1,2}<1>ψf,2 − w1,{1,3}<1>ψf,3 − w1,{1,4}<1>ψf,4 ]
  Rf(Q) = [ −(w2,{1,2}<1>+A)ψf,1 − v2,{2}<1>ψf,2 − w2,{2,3}<1>ψf,3 − w2,{2,4}<1>ψf,4 ]
          [ −(w3,{1,3}<1>+B)ψf,1 − (w3,{2,3}<1>+C)ψf,2 − v3,{3}<1>ψf,3 − w3,{3,4}<1>ψf,4 ]
          [ −w4,{1,4}<1>ψf,1 − w4,{2,4}<1>ψf,2 − w4,{3,4}<1>ψf,3 − v4,{4}<1>ψf,4 ]

Comparing the symbols of the failed node against the entries of the repair space, it turns out that only [Ψf·Q]{4} can be found from −[Rf(Q)]4,ø, while there is a mismatch for the other three symbols, due to injections of A, B, and C. Let us compare [Ψf·Q]{1} and −[Rf(Q)]1,ø. Their difference is given by

[Ψf·Q]{1}+[Rf(Q)]1,ø=ψf,2A+ψf,3B.  (14)


Note that by setting

A=v4,{1,2,4}<0>, B=v4,{1,3,4}<0>, C=v4,{2,3,4}<0>


this difference reduces to ψf,2v4,{1,2,4}<0>+ψf,3v4,{1,3,4}<0>, which is exactly the entry at position (4, {1,4}) of Rf(P), the repair space of the parent code. That is, the missing symbol of the failed node in its Q segment can be retrieved using the repair spaces of Q and P. It is easy to check that, with the same assignment of A, B, and C given above, all four symbols of the failed node in its Q segment can be recovered through

[Ψf·Q]{1}=−[Rf(Q)]1,ø+[Rf(P)]4,{1,4},
[Ψf·Q]{2}=−[Rf(Q)]2,ø+[Rf(P)]4,{2,4},
[Ψf·Q]{3}=−[Rf(Q)]3,ø+[Rf(P)]4,{3,4},
[Ψf·Q]{4}=−[Rf(Q)]4,ø.

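The cancellation behind these repair identities can be spot-checked numerically. The sketch below is a minimal illustration, not the actual construction: it uses plain floating-point values in place of finite-field symbols, absorbs the signed parity convention by storing a single value per two-element set (so that, e.g., w1,{1,2} and w2,{1,2} coincide), and the names v, w, A, B, C, psi are hypothetical stand-ins for the example's quantities.

```python
from random import random

# Hypothetical numeric stand-ins for the example's symbols (k=3, d=4, mode-1
# child code Q).  A real deployment works over a finite field; plain floats
# suffice to check the linear identity (14).
v = {x: random() for x in range(1, 5)}                        # v_x,{x}
w = {s: random() for s in [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]}
A, B, C = random(), random(), random()                        # injected backups
psi = {i: random() for i in range(1, 5)}                      # psi_f,1 .. psi_f,4

# Modified child code Q (rows 1..4, columns {1}..{4}); the injections A, B, C
# sit in the parity symbols of columns {1}, {1}, and {2}, respectively.
Q = {(1, 1): v[1],          (1, 2): w[(1, 2)],     (1, 3): w[(1, 3)], (1, 4): w[(1, 4)],
     (2, 1): w[(1, 2)] + A, (2, 2): v[2],          (2, 3): w[(2, 3)], (2, 4): w[(2, 4)],
     (3, 1): w[(1, 3)] + B, (3, 2): w[(2, 3)] + C, (3, 3): v[3],      (3, 4): w[(3, 4)],
     (4, 1): w[(1, 4)],     (4, 2): w[(2, 4)],     (4, 3): w[(3, 4)], (4, 4): v[4]}

coded = {c: sum(psi[x] * Q[(x, c)] for x in range(1, 5)) for c in range(1, 5)}  # Psi_f . Q
R = {x: -sum(Q[(x, c)] * psi[c] for c in range(1, 5)) for x in range(1, 5)}     # R_f(Q)

# Equation (14): the mismatch in column {1} is exactly psi_f,2*A + psi_f,3*B ...
assert abs((coded[1] + R[1]) - (psi[2] * A + psi[3] * B)) < 1e-9
# ... while column {4} is repaired without any help from the parent code.
assert abs(coded[4] + R[4]) < 1e-9
```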

In summary, this example showed that (1) for a system with k&lt;d, the data recovery of the (signed) determinant codes fails; (2) the shortcoming in data recovery can be fixed by concatenating a new determinant code (called the child code) with the original one (called the parent code), and providing a backup copy for the missing symbols; this fixes data recovery not only from the systematic nodes, but also from any set of k nodes accessed by the data collector; (3) the usual repair of the child code may fail due to the injected symbols; and finally (4) this repair can also be fixed with the help of the repair space of the parent code.


In this section, the techniques described herein describe the construction of an (n, k, d) exact-regenerating code. There are k different regenerating codes for a distributed storage system with parameters (n, k, d), operating at different points of the storage vs. repair-bandwidth trade-off. The techniques described herein enumerate them by a parameter μ∈{1, 2, . . . , k}, which is called the mode of the code. The general framework to construct an (n, k, d; μ) exact-regenerating code is to start with a determinant code with parameters (n′=n, k′=d, d′=d; m=μ) (or simply a (d; μ) determinant code), and modify it to an (n, k, d; μ) code for a given k&lt;d.


Consider a (d; μ) determinant code with encoder matrix Ψ and message matrix D. Without loss of generality, the techniques described herein can assume the code is semi-systematic, i.e., Ψ has an identity matrix at its top k rows (see Definition 4), and thus the content of node i∈[k] is the same as the i-th row of the matrix D. From Proposition 1, data recovery can be performed from any k′=d nodes. Now, let k&lt;d be a given parameter, and consider an attempt for data recovery from the first k nodes. It is clear that some symbols in rows [k+1:d] of matrix D cannot be recovered, since they do not appear in the content of the first k nodes. The techniques described herein refer to such symbols as missing symbols. The main idea to overcome this limitation is to protect such missing symbols in data recovery by providing a backup copy of them in the top k rows. The main challenge is to introduce such backup copies in a way that they do not demolish the repair process.



FIG. 5 is a conceptual diagram illustrating cascading of determinant codes, in accordance with the techniques described herein.


In this figure, every rectangle represents the message matrix of one determinant code, and tm denotes the number of message matrices of mode m, for m∈{0,1, . . . , μ}. The message matrix at mode m is of size d×αm, where

αm = C(d, m),

where C(·,·) denotes the binomial coefficient.

These message matrices are placed from the highest mode to the lowest mode. The leftmost message matrix corresponds to the root of the cascade code with mode m=μ. The rightmost message matrices have either a mode of m=1 or m=0. In the message matrices, some of the symbols in the bottom d−k rows are missing. These missing symbols are backed up by being injected (added with a sign) into the parity symbols located at the top k rows of other determinant codes with lower modes. Injections are demonstrated by arrows from a symbol in the parent matrix to a parity symbol in the child matrix.


To this end, the techniques described herein use multiple instances of (d; m) determinant codes with different modes m∈{0, 1, . . . , μ}, and concatenate their message matrices to obtain a super-message matrix, as shown in FIG. 5. This super-message matrix then may be multiplied (from the left) by an encoder matrix Ψ to generate the node contents. Therefore, the codewords (the contents of the nodes) may also be a concatenation of the codewords of the determinant codes used as building blocks. Recall that there are redundant symbols (see the parity equation in (7)) in each message matrix used in the concatenation. While such redundant symbols play a critical role in the repair process, they have no contribution to the data recovery. The main purpose of concatenating codes is to utilize such redundant symbols to store a backup copy of the missing symbols in the lower (d−k) rows. More precisely, the techniques described herein inject (add to the existing symbol) a missing symbol from the bottom (d−k) rows of a code matrix at mode m1 (an outer code or parent code) into a parity symbol in the top k rows of another code matrix at mode m2 (an inner code or child code), where m2&lt;m1. However, the missing symbols of the inner code with mode m2 may also be backed up by injection into another code with lower mode m3&lt;m2. This leads to a cascade of determinant codes. The details of cascading are discussed in the following sections.


Definition 2: The encoder matrix Ψ for a code with parameters (n, k, d) is defined as an n×d matrix

Ψn×d=[Γn×k Δn×(d−k)],


such that

    • any k rows of Γ are linearly independent; and
    • any d rows of Ψ are linearly independent.


Note that Vandermonde matrices satisfy both properties. Similar to determinant codes, the super-message matrix of the cascade codes may be multiplied (from left) by an encoder matrix to generate the encoded content of the nodes.
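Definition 2 can be illustrated with a quick check over the rationals. The following sketch uses illustrative parameters n=6, k=2, d=4 (a real system would work over a finite field), builds a Vandermonde Ψ, and verifies both rank properties by brute force:

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion (fine for the tiny matrices here)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))

n, k, d = 6, 2, 4                                # illustrative parameters
xs = [Fraction(i + 1) for i in range(n)]         # distinct evaluation points
Psi = [[x ** e for e in range(d)] for x in xs]   # Vandermonde rows (1, x, x^2, x^3)
Gamma = [row[:k] for row in Psi]                 # the Gamma block: first k columns

# Any k rows of Gamma, and any d rows of Psi, are linearly independent.
assert all(det([Gamma[i] for i in rows]) != 0 for rows in combinations(range(n), k))
assert all(det([Psi[i] for i in rows]) != 0 for rows in combinations(range(n), d))
```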


The techniques described herein describe the construction of the super-message matrix in this section. As mentioned before, M can be obtained in two stages, namely,

    • Concatenating several determinant codes, including tm copies of the codes of mode m, for m∈{0, 1, . . . , μ}. The values of the tm's are determined later, as given in (33);

    • Injecting the missing symbols at the bottom (d−k) rows of a code at mode j (the parent code) into the parity symbols at the top k rows of a code at mode m (a child code), where m&lt;j.


The techniques described herein refer to the determinant code message matrices used in a super-message matrix as code segments. Similarly, the codewords comprise multiple codeword segments, each corresponding to one code segment. Each code segment at mode m is originally a determinant code at mode m, namely D(m). The techniques described herein start with a root code, which is a determinant code with parameters (d; μ), and in each stage introduce other determinant codes with a lower mode, into which injection is performed. The ultimate super-message matrix may be obtained by concatenating all such injected message matrices of determinant codes.


The details of symbol injection are discussed in this section. The missing symbols at the lower (d−k) rows of a determinant code segment may be treated differently.


Definition 3: Consider an (n, d, d; j) determinant code at mode j with message matrix P. Recall from (6) that symbols in P are of the form either vx,𝒳 with |𝒳|=j or wx,𝒮 with |𝒮|=j+1. Therefore, the symbols in the lower (d−k) rows of P can be divided into four groups:

𝒢1(P) = {±vx,𝒳 : x∈𝒳, 𝒳∩[k]=ø} ∪ {±wx,𝒮 : x∈𝒮, 𝒮∩[k]=ø} = {Px,𝒳 : 𝒳⊆[k+1:d]},

𝒢2(P) = {±wx,𝒮 : x∈𝒮, 𝒮∩[k]≠ø, ind𝒮(x)=j+1} = {Px,𝒳 : 𝒳⊄[k+1:d], x>max 𝒳},

𝒢3(P) = {±vx,𝒳 : x∈𝒳∩[k+1:d], 𝒳∩[k]≠ø} = {Px,𝒳 : 𝒳⊄[k+1:d], x∈𝒳},

𝒢4(P) = {±wx,𝒮 : x∈𝒮∩[k+1:d], 𝒮∩[k]≠ø, ind𝒮(x)&lt;j+1} = {Px,𝒳 : 𝒳⊄[k+1:d], x∉𝒳, x&lt;max 𝒳}.

The techniques described herein treat symbols in the above-mentioned four groups as follows:


Symbols in 𝒢1(P) may be set to zero (nulled). This yields a reduction in Fj, the number of data symbols stored in the code segment.


Symbols in 𝒢2(P) are essentially parity symbols and can be recovered from the symbols in P and the parity equation (7). Therefore, the techniques described herein do not need to protect them by injection.


Each symbol in 𝒢3(P)∪𝒢4(P) may be injected into a redundant parity symbol in the top k rows of the message matrix of a determinant code with a lower mode. Those codes used for injection are called child codes of P.


Remark 3: For a signed determinant code P with mode(P)=j, the number of symbols in 𝒢3(P) to be protected can be found from

|𝒢3(P)| = |{(x, 𝒳) : 𝒳∩[k]≠ø, |𝒳|=j, x∈𝒳∩[k+1:d]}|
 = Σ_{u=1}^{j−1} |{(x, 𝒳) : |𝒳∩[k]|=u, |𝒳∩[k+1:d]|=j−u, x∈𝒳∩[k+1:d]}|
 = Σ_{u=1}^{j−1} C(k, u) C(d−k, j−u) (j−u).

Similarly, for 𝒢4(P),

|𝒢4(P)| = |{(x, 𝒮) : 𝒮∩[k]≠ø, |𝒮|=j+1, x∈(𝒮∩[k+1:d])\{max 𝒮}}|
 = Σ_{u=1}^{j} |{(x, 𝒮) : |𝒮∩[k]|=u, |𝒮∩[k+1:d]|=j+1−u, x∈(𝒮∩[k+1:d])\{max 𝒮}}|
 = Σ_{u=1}^{j} C(k, u) C(d−k, j+1−u) (j−u),  (15)

where in (15), 𝒮=𝒳∪{x}, with |𝒮|=j+1. This implies there is a total of

|𝒢3(P)| + |𝒢4(P)| = Σ_{u=1}^{j−1} C(k, u) C(d−k+1, j−u+1) (j−u)  (16)

symbols in a signed determinant code of mode j to be injected.
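Assuming the position-based characterization of the four groups given above, the counts in Remark 3, (15), and (16) can be confirmed by brute-force enumeration. The parameters k=3, d=6, j=4 below are illustrative:

```python
from itertools import combinations
from math import comb

k, d, j = 3, 6, 4    # illustrative parameters with k < d and mode j

# Classify every lower-row position (x, X) of a mode-j message matrix
# (x in [k+1:d], X a j-subset of [d]) into the four groups of Definition 3.
g = {1: 0, 2: 0, 3: 0, 4: 0}
for X in combinations(range(1, d + 1), j):
    S = set(X)
    for x in range(k + 1, d + 1):
        if S <= set(range(k + 1, d + 1)):
            g[1] += 1                   # nulled symbols
        elif x in S:
            g[3] += 1                   # v-symbols needing protection
        elif x > max(S):
            g[2] += 1                   # parity w-symbols, recovered via (7)
        else:
            g[4] += 1                   # w-symbols needing protection

# Remark 3, Equation (15), and Equation (16):
assert g[3] == sum(comb(k, u) * comb(d - k, j - u) * (j - u) for u in range(1, j))
assert g[4] == sum(comb(k, u) * comb(d - k, j + 1 - u) * (j - u) for u in range(1, j + 1))
assert g[3] + g[4] == sum(comb(k, u) * comb(d - k + 1, j - u + 1) * (j - u)
                          for u in range(1, j))
```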


For each injection, the techniques described herein refer to the code whose symbols in the lower part need protection as the parent code. Furthermore, the code into which symbols are injected is called the child code.


Remark 4: Note that the techniques described herein may need multiple child codes for a parent code to protect all the symbols in 𝒢3(P)∪𝒢4(P). However, for each child code, there is only one parent code that injects symbols into it. This leads to a tree structure for the injection hierarchy.


Consider a parent code with message matrix P, which is a (d; j) signed determinant code with mode j. The techniques described herein introduce several child codes for P to inject its symbols into. The techniques described herein distinguish these child codes by a pair (b, ℰ), satisfying

b∈[k+1:d],
ℰ⊆[k+1:d],
b≤max ℰ.


The techniques described herein refer to (b, ℰ) as the injection pair. Furthermore, the techniques described herein denote the child code of P associated with an injection pair (b, ℰ) by Q(b,ℰ)P (or simply Q, whenever it does not cause confusion), which is a signed determinant code with parameters (d; m), where its mode satisfies

mode(Q)=mode(P)−|ℰ|−1.

The sign parameters of the child code are given by

σQ(i) = 1 + σP(i) + indℰ∪{i}(i),  ∀ i∈[d],  (17)

where σP(i) are the sign parameters of the parent code.


Symbols of code P may be injected into the message matrices of its child codes, i.e., added to (some of) the entries of the message matrix Q(b,ℰ)P.

The techniques described herein denote the modified version (after injection) of child code Q(b,ℰ)P by Q̃(b,ℰ)P.
This modification is modeled by adding the matrix of injection symbols Δ(b,ℰ)P to the original Q(b,ℰ)P matrix, i.e.,

Q̃(b,ℰ)P = Q(b,ℰ)P + Δ(b,ℰ)P.  (18)

Here Δ(b,ℰ)P is a matrix with the same size as Q that specifies the difference between the before- and after-injection versions of the child code. For a code with injection pair (b, ℰ), the symbol injected into row i and column 𝒥 is given by

[Δ(b,ℰ)P]i,𝒥 = { (−1)^(…) P…,  if i>…, i∈…,
               { 0,  otherwise.  (19)

Here, the coordinates of injection satisfy i∈[d] and 𝒥⊆[d] with |𝒥|=m=mode(Q).


Note that symbols of the code Q may also need to be injected into the message matrix of another code, which is a child code of Q (and hence a grandchild code of P).


Hence, injections may be performed in a cascade manner. The ultimate super-message matrix may be obtained by concatenating all modified (after injection) message matrices.


The following remarks are provided for a comprehensive description of the injection process. However, they do not play a critical role in the proofs, and can be skipped.


Remark 5: The injection equation in (19) specifies the injection from the child code's perspective, i.e., for a given entry (row i and column 𝒥) of a child code message matrix with injection pair (b, ℰ), the equation determines whether an injection into this entry is performed, and specifies the to-be-injected entry of the parent code. This representation is the most compact and useful form to prove the fundamental properties of the code. However, it is more intuitive to describe the injection from the parent code's point of view, that is, for every symbol in groups 𝒢3(P)∪𝒢4(P) of a parent code P, to specify the position(s) and parameters of the child code(s) into which that symbol may be injected. In the following, the techniques described herein specify injection equations from the parent code's perspective. It is worth mentioning that a single symbol might be injected into several child codes. One of such injections is called primary and the rest are called secondary.



FIG. 6 is a conceptual diagram illustrating symbol injection, in accordance with the techniques of this disclosure. The shaded cells in the column indicate the set 𝒳̄=𝒳∩[k] for an element (x, 𝒳). The primary injection for a v-symbol in 𝒢3, where x∈𝒳, is shown in (I). The coordinates of the injection are given by (y, 𝒥)=(max 𝒳̄, 𝒳̄\{max 𝒳̄}). Therefore, the child code is a determinant code of mode |𝒥|=|𝒳̄|−1, designated by an injection pair (b, ℰ)=(x, 𝒳∩[k+1:d]). The same v-symbol may be injected (perhaps multiple times) as secondary injections. As indicated in (II), the coordinates of a secondary injection are given by (y, 𝒥), where y can be any element of 𝒳∩[k+1:d], and 𝒥 includes the entire 𝒳̄ as well as an arbitrary subset 𝒴 of 𝒳∩[k+1:y−1] (one injection for each pair (y, 𝒴)). The child code used for each such injection is a determinant code of mode |𝒥|=|𝒳̄∪𝒴|, which is identified by an injection pair (b, ℰ)=(x, (𝒳∩[k+1:d])\(𝒴∪{y})).

Primary injection of v-symbols: for any (x, 𝒳) such that 𝒳̄≠ø and x∈𝒳∩[k+1:d], where 𝒳̄ denotes 𝒳∩[k] (i.e., vx,𝒳∈𝒢3(P)),

Px,𝒳 = (−1)^σP(x) vx,𝒳  →(±1)→  [Q(x, 𝒳∩[k+1:d])P]max 𝒳̄, 𝒳̄\{max 𝒳̄} = (−1)^σQ(max 𝒳̄) w(x, 𝒳∩[k+1:d], P)max 𝒳̄, 𝒳̄.  (20)

Here Q is a signed determinant code of mode

mode(Q)=|𝒳̄|−1,

and the superscript of the w-symbol in the child code's message matrix is a tuple of the form (b, ℰ, P), where (b, ℰ) is the injection pair associated with the child code, and P is the parent for which code Q is introduced. Moreover, ±1 on the arrow indicates that the injection takes place after a proper change of sign. Here, this sign is determined by

(−1)^(1+σP(max 𝒳̄)+ind𝒳(max 𝒳̄)).

More precisely, the symbol at row max 𝒳̄ and column 𝒳̄\{max 𝒳̄} of the child code may be modified from

[Q(x, 𝒳∩[k+1:d])P]max 𝒳̄, 𝒳̄\{max 𝒳̄} = (−1)^σQ(max 𝒳̄) w(x, 𝒳∩[k+1:d], P)max 𝒳̄, 𝒳̄

to

[Q̃(x, 𝒳∩[k+1:d])P]max 𝒳̄, 𝒳̄\{max 𝒳̄} = (−1)^σQ(max 𝒳̄) w(x, 𝒳∩[k+1:d], P)max 𝒳̄, 𝒳̄ + (−1)^(1+σQ(max 𝒳̄)+ind𝒳(max 𝒳̄)) (−1)^σP(x) vx,𝒳.

Secondary injection of v-symbols: Besides the injections described in (I), a symbol vx,𝒳 with 𝒳̄≠ø (where 𝒳̄ denotes 𝒳∩[k]) and x∈𝒳∩[k+1:d] may also be injected as

Px,𝒳 = (−1)^σP(x) vx,𝒳  →(±1)→  [Q(x, 𝒳[y+1:d]∪𝒴′)P]y, 𝒳̄∪𝒴 = (−1)^σQ(y) w(x, 𝒳[y+1:d]∪𝒴′, P)y, 𝒳̄∪𝒴∪{y},  (21)

for every y∈𝒳∩[k+1:d] and (𝒴, 𝒴′) that partition 𝒳[k+1:y−1], that is, 𝒴∩𝒴′=ø and 𝒴∪𝒴′=𝒳[k+1:y−1]. Moreover, such a secondary injection may be performed only if x≤max(𝒳[y+1:d]∪𝒴′). The techniques described herein use 𝒳[k+1:y−1] to denote the set 𝒳∩[k+1:y−1], and 𝒳[y+1:d] to specify the set 𝒳∩[y+1:d]. The sign of injection is given by

(−1)^(1+σP(y)+ind𝒳(y)).

Note that the mode of the child code is mode(Q)=|𝒳̄∪𝒴|.


Primary injection of w-symbols: for every pair (x, 𝒳) such that 𝒳̄≠ø, x∈[k+1:d]\𝒳, and x&lt;max 𝒳 (i.e., wx,𝒳∪{x}∈𝒢4(P)), where 𝒳̄ denotes 𝒳∩[k],

Px,𝒳 = (−1)^σP(x) wx,𝒳∪{x}  →(±1)→  [Q(x, 𝒳∩[k+1:d])P]max 𝒳̄, 𝒳̄\{max 𝒳̄} = (−1)^σQ(max 𝒳̄) w(x, 𝒳∩[k+1:d], P)max 𝒳̄, 𝒳̄.  (22)

Here, the sign of the injected symbol is determined by

(−1)^(1+σP(max 𝒳̄)+ind𝒳(max 𝒳̄)),

and the mode of the child code is mode(Q)=|𝒳̄|−1.


Secondary injection of w-symbols: Similar to the v-symbols, w-symbols may also be re-injected. Each wx,𝒳∪{x} with 𝒳̄≠ø, x∈[k+1:d]\𝒳, and x&lt;max 𝒳 may be injected as

Px,𝒳 = (−1)^σP(x) wx,𝒳∪{x}  →(±1)→  [Q(x, 𝒳[y+1:d]∪𝒴′)P]y, 𝒳̄∪𝒴 = (−1)^σQ(y) w(x, 𝒳[y+1:d]∪𝒴′, P)y, 𝒳̄∪𝒴∪{y},  (23)

for every y∈𝒳∩[k+1:d] and (𝒴, 𝒴′) that partition 𝒳[k+1:y−1]. Moreover, such a secondary injection may be performed only if x≤max(𝒳[y+1:d]∪𝒴′). The sign of injection is given by

(−1)^(1+σP(y)+ind𝒳(y)),

and the mode of the child code is given by mode(Q)=|𝒳̄∪𝒴|.


Remark 6: Note that each symbol vx,𝒳∈𝒢3(P) may be injected multiple times, including one primary injection as specified in (20) and several secondary injections as given in (21). However, each secondary injection is performed into a different child code, as the injection pairs

(b, ℰ)=(x, 𝒴′∪(𝒳∩[y+1:d]))

are different for distinct pairs of (y, 𝒴′), where 𝒴′⊆𝒳∩[k+1:y−1].

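The distinctness claim of Remark 6 can be sanity-checked by enumeration. In the sketch below, X is an arbitrary illustrative column set, and E plays the role of the injection-pair set ℰ=𝒴′∪(𝒳∩[y+1:d]):

```python
from itertools import combinations

k, d = 2, 6
X = {1, 3, 4, 6}                        # illustrative column set with X ∩ [k] nonempty
Xlow = sorted(e for e in X if e > k)    # X ∩ [k+1:d]

pairs, count = set(), 0
for y in Xlow:
    between = [e for e in Xlow if e <= y - 1]        # X ∩ [k+1:y-1]
    for r in range(len(between) + 1):
        for Yp in combinations(between, r):          # the set Y' of the partition
            E = frozenset(Yp) | frozenset(e for e in Xlow if e > y)
            pairs.add(E)
            count += 1

assert len(pairs) == count   # distinct (y, Y') always give distinct sets E
```

The injectivity is no accident: y can be recovered from E as the largest element of (X ∩ [k+1:d]) \ E, and then 𝒴′ = E ∩ [k+1:y−1].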

The primary injection governed by (20) is an injection into the top k rows of the child code (since the row index of the host symbol w is y=max 𝒳̄∈[k]), and may be responsible for data recovery. Once such symbols are injected into a child code, the modified child code may not be a determinant code anymore, and its repair process may fail. The secondary injections of this symbol are proposed to maintain the repair structure. Note that in the injections specified in (21) the row index of the host symbol is y∈𝒳∩[k+1:d], i.e., the injections are performed into the lower (d−k) rows of the child codes.


A similar argument holds for the injection of w-symbols: the primary injection in (22) is into the top k rows of the host code's message matrix, in order to provide a backup for data recovery, and the secondary injections in (23) specify all injections needed for the repair, into the lower (d−k) rows of various child codes.


It is straightforward to evaluate the number of child codes of mode m introduced for a parent code of mode j. First note that for a parent code P of mode j, all the child codes are determinant codes of the form Q(b,ℰ)P, into which missing symbols of P may be injected. The mode of such a child code is given by

m=mode(Q)=mode(P)−|ℰ|−1=j−|ℰ|−1.


In order to count the number of possible injection pairs (b, ℰ), the techniques described herein can distinguish two cases. First, if b∈ℰ, then there are

C(d−k, j−m−1)

choices of ℰ, and there are |ℰ|=j−m−1 choices for b∈ℰ. Second, if b∉ℰ, then ℰ∪{b} is a subset of [k+1:d] of size |ℰ|+1. Then, there are

C(d−k, j−m)

choices for ℰ∪{b}, and for each choice, b can be any entry except the largest one.

(j−m−1) C(d−k, j−m−1) + (j−m−1) C(d−k, j−m) = (j−m−1) C(d−k+1, j−m).  (24)


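Equation (24) can be verified by enumerating injection pairs directly. The sketch below assumes the constraints on (b, ℰ) stated earlier (ℰ⊆[k+1:d] with |ℰ|=j−m−1, and b≤max ℰ, with b forbidden from being the largest element of ℰ∪{b} when b∉ℰ); the parameters are illustrative:

```python
from itertools import combinations
from math import comb

k, d, j = 2, 6, 4                        # parent mode j; illustrative parameters
lower = range(k + 1, d + 1)

for m in range(j):                       # child modes m = 0 .. j-1
    s = j - m - 1                        # required size |E| = j - m - 1
    num = 0
    for E in combinations(lower, s):
        for b in lower:
            # b <= max E; when b lies outside E, it must additionally not be
            # the largest element of E ∪ {b}, i.e. b < max E.
            if E and b <= max(E) and (b in E or b < max(E)):
                num += 1
    assert num == (j - m - 1) * comb(d - k + 1, j - m)   # Equation (24)
```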



Remark 7: Recall from (20) and (22) that each primary injection may be into a w-symbol in the child code at the position (max 𝒳̄, 𝒳̄\{max 𝒳̄}). The number of such w-symbols in a determinant code of mode m is only

C(k, m+1),

since 𝒳̄⊆[k] and |𝒳̄|=m+1. On the other hand, (24) gives the number of child codes of mode m introduced for the primary injections of a parent code P of mode j. Therefore, the total number of w-symbols hosting primary injections of a code of mode j is given by

Σ_{m=0}^{j−1} (j−m−1) C(d−k+1, j−m) C(k, m+1),  (25)

which is exactly the same as |𝒢3(P)∪𝒢4(P)|, i.e., the number of symbols to be injected, as evaluated in (16).


Remark 8: Consider a w-symbol in a child code hosting a symbol from the parent code. In primary injections of types I and III, such a w-symbol is in the top k rows of the message matrix, and does not need protection. In secondary injections of types II and IV, the w-symbol is positioned in the bottom (d−k) rows. However, for such symbols

ind𝒳̄∪𝒴∪{y}(y) = m+1,

since 𝒳̄⊆[k], 𝒴⊆[k+1:y−1], and y∈[k+1:d]. Hence, such a symbol belongs to group 𝒢2(Q) of the child code, and does not need to be injected into a grandchild code. The implication is that a w-symbol in a child code hosting a symbol from the parent code is never injected into a grandchild code, and therefore injected symbols may not be carried for multiple hops of injections. Moreover, the specification of injections from a parent code P into a child code Q does not depend on whether P is already modified by an earlier injection or not, that is,

Δ(b,ℰ)P̃ = Δ(b,ℰ)P.

The main property of the code proposed in this section is its reliability in spite of up to n−k node failures. The following proposition is a formal statement of this property.


Proposition 4: By accessing any k nodes, the content of the stored data file can be recovered.


In the following, the techniques described herein discuss the exact repair property by introducing the repair data that a set of helper nodes ℋ with |ℋ|=d send in order to repair a failed node f∈[n].


The repair process is performed in a recursive manner from top-to-bottom, i.e., from segments of the codeword of node f with the highest mode to those with the lowest mode.


The repair data sent from a helper node h∈ℋ to the failed node f is simply the concatenation of the repair data to be sent for each code segment. The repair data for each code segment can be obtained by treating each segment as an ordinary signed determinant code. More precisely, helper node h sends

∪ {Ψh·P·Ξf,(mode(P)) : P is a code segment},

where Ψh·P is the codeword segment of node h corresponding to code segment P. In other words, for each codeword segment, the helper node needs to multiply it by the repair-encoder matrix of the proper mode, and send the collection of all such multiplications to the failed node. Recall from Proposition 3 that the rank of Ξf,(mode(P)) for a code segment of mode m is only

β(m) = C(d−1, m−1).

The overall repair bandwidth of the code is evaluated in (41).


Upon receiving all the repair data ∪{Ψh·P·Ξf,(mode(P)) : h∈ℋ}, the failed node stacks the segments corresponding to each code segment P to obtain

Ψ[ℋ, :]·P·Ξf,(mode(P)),

Rf(P) = (Ψ[ℋ, :])^−1 · Ψ[ℋ, :] · P · Ξf,(mode(P)) = P · Ξf,(mode(P)),  for all code segments P.

Note that the invertibility of Ψ[ℋ, :] is guaranteed. Having the repair spaces for all the code segments, the content of node f can be reconstructed according to the following proposition.

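The stacking-and-inversion step can be illustrated end to end. The sketch below uses rational arithmetic instead of a finite field, and a random placeholder code segment P and repair encoder Ξ (the actual Ξf,(m) has a specific structure that is irrelevant to this step):

```python
from fractions import Fraction
from random import randint

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse(M):
    """Gauss-Jordan elimination over the rationals."""
    size = len(M)
    aug = [row[:] + [Fraction(int(i == c)) for c in range(size)]
           for i, row in enumerate(M)]
    for col in range(size):
        piv = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(size):
            if r != col:
                aug[r] = [v - aug[r][col] * u for v, u in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]

n, d = 6, 4
Psi = [[Fraction(i + 1) ** e for e in range(d)] for i in range(n)]      # Vandermonde
P = [[Fraction(randint(0, 100)) for _ in range(5)] for _ in range(d)]   # placeholder segment
Xi = [[Fraction(randint(0, 100))] for _ in range(5)]                    # placeholder repair encoder

helpers = [0, 2, 3, 5]                                        # any d = 4 helper nodes
received = matmul([Psi[h] for h in helpers], matmul(P, Xi))   # stacked repair data
recovered = matmul(inverse([Psi[h] for h in helpers]), received)
assert recovered == matmul(P, Xi)   # the repair space P · Ξ is retrieved exactly
```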

Proposition 5: In the (n, k, d) proposed codes with parameters defined in this disclosure, for every failed node f∈[n] and set of helpers ℋ⊆[n]\{f} with |ℋ|=d, the content of node f can be exactly regenerated from the received repair spaces. More precisely, the symbols at position 𝒥 in a codeword segment corresponding to a code Q operating at mode m may be repaired through

[Ψf·Q(b,ℰ)P]𝒥 = Σ_{x∈𝒥} (−1)^(σQ(x)+ind𝒥(x)) [Rf(Q)]x,𝒥\{x} − { [Rf(P)]b,𝒥∪ℰ  if 𝒥∩ℰ=ø,
                                                             { 0  otherwise,  (26)

where P is the parent code of Q.


Consider the construction of an (n, k, d) cascade code at mode μ, described in this disclosure. Let tm be the number of (d; m) determinant codes of mode m needed to complete all the injections. The super-message matrix M is obtained by concatenating all code segments, which results in a matrix with d rows and a total of

Σ_{m=0}^{μ} tm αm

columns. Therefore, the per-node storage capacity of the resulting code may be

α(d, k; μ) = Σ_{m=0}^{μ} tm αm = Σ_{m=0}^{μ} tm C(d, m).  (27)

Similarly, the repair bandwidth of the resulting code may be

β(d, k; μ) = Σ_{m=0}^{μ} tm βm = Σ_{m=0}^{μ} tm C(d−1, m−1).  (28)

The total number of data symbols stored in matrix M is the sum of the number of data symbols in each code segment. A code segment of mode m can store up to

Fm = m C(d+1, m+1)

symbols. However, recall that data symbols in group 𝒢1 (see Definition 3) may be set to zero, which yields a reduction in the number of symbols. For an (n, d, d; m) determinant code used as a code segment, the reduction due to nulling the symbols in 𝒢1 is

Nm = |{vx,𝒳 : 𝒳∩[k]=ø, x∈𝒳}| + (m/(m+1)) |{wx,𝒮 : 𝒮∩[k]=ø, x∈𝒮}|  (29)
 = m C(d−k, m) + (m/(m+1)) (m+1) C(d−k, m+1)  (30)
 = m [C(d−k, m) + C(d−k, m+1)] = m C(d−k+1, m+1),  (31)

where the coefficient

m/(m+1)

captures the fact that there are only m data symbols among the m+1 w-symbols in each parity group. The number of possible 𝒳's is evaluated based on the facts that |𝒳|=m and 𝒳∩[k]=ø, which implies 𝒳⊆[k+1:d] and there are

C(d−k, m)

choices of 𝒳. Moreover, x∈𝒳 can be any of the m elements of 𝒳. A similar argument is used to compute the size of the second set. Finally, the techniques described herein used Pascal's identity. Therefore, the techniques described herein can evaluate the total number of data symbols in the super-message matrix as










F


(

d
,

k
;
μ


)


=





m
=
0

μ





m



(


F
m

-

N
m


)



=




m
=
0

μ





m




m


[


(




d
+
1






m
+
1




)

-

(




d
-
k
+
1






m
+
1




)


]


.








(
32
)







The rest of this section is dedicated to the evaluation of the parameters ℓm, the number of code segments of mode m, in order to fully characterize the parameters of the code.


In this section, the techniques described herein derive a recursive relationship between the ℓm parameters. Next, the techniques described herein may solve the recursive equation to evaluate the code parameters.


There is one root code with mode μ, i.e., ℓμ=1. Let m be an integer in {0, 1, . . . , μ−1}. In general, any code of mode j>m needs child codes of mode m. Recall from (24) that the number of child codes of mode m required for injection from a parent code of mode j is given by

$$(j-m-1)\binom{d-k+1}{j-m}.$$






Moreover, parent codes do not share children. Therefore, the total number of required child codes of mode m is given by

$$\ell_{m}=\sum_{j=m+1}^{\mu}\ell_{j}\,(j-m-1)\binom{d-k+1}{j-m}.\tag{33}$$







This is a reverse (top-to-bottom) recursive equation with starting point ℓμ=1, and can be solved to obtain a non-recursive expression for ℓm. Note that (33) potentially associates non-zero values to ℓm's with m<0. However, the techniques described herein only consider m∈{0, 1, . . . , μ−1}.
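As a concrete illustration (a sketch, not part of the patent text; the function name is illustrative), recursion (33) can be solved top-down in a few lines. For the (n, k=4, d=6; μ=4) example constructed later in this section it reproduces (ℓ4, ℓ3, ℓ2, ℓ1, ℓ0)=(1, 0, 3, 2, 9).

```python
from math import comb

def segment_counts(d, k, mu):
    """Solve recursion (33) top-down: ell[mu] = 1 and, for m < mu,
    ell[m] = sum_{j=m+1}^{mu} ell[j] * (j - m - 1) * C(d-k+1, j-m)."""
    ell = {mu: 1}
    for m in range(mu - 1, -1, -1):
        ell[m] = sum(ell[j] * (j - m - 1) * comb(d - k + 1, j - m)
                     for j in range(m + 1, mu + 1))
    return [ell[m] for m in range(mu + 1)]

# (n, k=4, d=6; mu=4): counts listed from mode 0 up to mode mu
print(segment_counts(6, 4, 4))  # -> [9, 2, 3, 0, 1], i.e. (ell_4, ..., ell_0) = (1, 0, 3, 2, 9)
```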


Let μ be a fixed non-negative integer. Recall that the sequence {ℓm} is defined in (33) only for m∈{0, 1, . . . , μ}. The techniques described herein first expand the range of m to include all integers, by defining dummy variables {ℓm: m<0 or m>μ} such that

$$\ell_{m}=\begin{cases}1,&m=\mu,\\[4pt]\displaystyle\sum_{j=m+1}^{\mu}\ell_{j}\,(j-m-1)\binom{d-k+1}{j-m},&m\neq\mu.\end{cases}\tag{34}$$







Note that this immediately implies ℓm=0 for m>μ. The techniques described herein also define a sequence $\{p_m\}_{m=-\infty}^{\infty}$ as pm=ℓμ-m for all m∈ℤ. The next proposition provides a non-recursive expression for the sequence {pm}.


Lemma 1: The parameters in sequence {pm} can be found from

$$p_{m}=\sum_{t=0}^{m}(-1)^{t}(d-k)^{m-t}\binom{d-k+t-1}{t},$$

for 0≤m≤μ.
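Lemma 1 can be sanity-checked numerically against the recursion; the sketch below (illustrative names, using pm=ℓμ-m as defined above) compares the closed form with a top-down solution of (33)/(34).

```python
from math import comb

def segment_counts(d, k, mu):
    # recursion (33)/(34): ell[mu] = 1, then solve top-down
    ell = {mu: 1}
    for m in range(mu - 1, -1, -1):
        ell[m] = sum(ell[j] * (j - m - 1) * comb(d - k + 1, j - m)
                     for j in range(m + 1, mu + 1))
    return ell

def p_closed_form(d, k, m):
    # Lemma 1: p_m = sum_{t=0}^{m} (-1)^t (d-k)^(m-t) C(d-k+t-1, t)
    return sum((-1)**t * (d - k)**(m - t) * comb(d - k + t - 1, t)
               for t in range(m + 1))

d, k, mu = 6, 4, 4
ell = segment_counts(d, k, mu)
for m in range(mu + 1):
    assert p_closed_form(d, k, m) == ell[mu - m]   # p_m = ell_{mu-m}
print("Lemma 1 verified for (d, k; mu) = (6, 4; 4)")
```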


It is clear that the techniques described herein can immediately find a non-recursive expression for ℓm using the fact that ℓm=pμ-m. However, it turns out that such a conversion is not helpful, and the techniques described herein can continue with the sequence pm.


In this section the techniques described herein show that the code parameters obtained in (27), (28), and (32) are equal to those proposed in Theorem 1. The following lemma may be helpful to simplify the derivation.


Lemma 2: For integer numbers a, b∈ℤ,

$$\sum_{m=-\infty}^{\infty}\ell_{m}\binom{d+a}{m+b}=\sum_{m=-b}^{\mu}(d-k)^{\mu-m}\binom{k+a}{m+b}.\tag{35}$$








Node storage size:










$$\begin{aligned}\alpha(d,k;\mu)&=\sum_{m=0}^{\mu}\ell_{m}\,\alpha_{m}=\sum_{m=0}^{\mu}\ell_{m}\binom{d}{m}=\sum_{m=0}^{\mu}p_{\mu-m}\binom{d}{m}&\quad(36)\\&=\sum_{m=-\infty}^{\infty}p_{\mu-m}\binom{d}{m}&\quad(37)\\&=\sum_{m=0}^{\mu}(d-k)^{\mu-m}\binom{k}{m}.&\quad(38)\end{aligned}$$







Note that in (37) the techniques described herein used the fact that the summand is zero for m<0 since $\binom{d}{m}=0$, and zero for m>μ since pμ-m=0; and (38) follows from Lemma 2 for a=b=0.


Repair bandwidth:










$$\begin{aligned}\beta(d,k;\mu)&=\sum_{m=0}^{\mu}\ell_{m}\,\beta_{m}=\sum_{m=0}^{\mu}\ell_{m}\binom{d-1}{m-1}=\sum_{m=0}^{\mu}p_{\mu-m}\binom{d-1}{m-1}&\quad(39)\\&=\sum_{m=-\infty}^{\infty}p_{\mu-m}\binom{d-1}{m-1}&\quad(40)\\&=\sum_{m=1}^{\mu}(d-k)^{\mu-m}\binom{k-1}{m-1}&\quad(41)\\&=\sum_{m=0}^{\mu}(d-k)^{\mu-m}\binom{k-1}{m-1}.&\quad(42)\end{aligned}$$







Here the steps are similar to (37), and (41) is due to Lemma 2 for a=b=−1.


File size:













$$\begin{aligned}F(d,k;\mu)&=\sum_{m=0}^{\mu}\ell_{m}\,m\left[\binom{d+1}{m+1}-\binom{d-k+1}{m+1}\right]\\&=\sum_{m=-\infty}^{\infty}\ell_{m}\left[m\binom{d+1}{m+1}-m\binom{d-k+1}{m+1}\right]&\quad(43)\\&=\sum_{m=-\infty}^{\infty}p_{\mu-m}\left[(m+1)\binom{d+1}{m+1}-\binom{d+1}{m+1}-(m+1)\binom{d-k+1}{m+1}+\binom{d-k+1}{m+1}\right]\\&=\sum_{m=-\infty}^{\infty}p_{\mu-m}\left[(d+1)\binom{d}{m}-\binom{d+1}{m+1}-(d-k+1)\binom{d-k}{m}+\binom{d-k+1}{m+1}\right]&\quad(44)\\&=(d+1)\sum_{m=0}^{\mu}(d-k)^{\mu-m}\binom{k}{m}-\sum_{m=-1}^{\mu}(d-k)^{\mu-m}\binom{k+1}{m+1}\\&\qquad-(d-k+1)\sum_{m=0}^{\mu}(d-k)^{\mu-m}\binom{k-k}{m}+\sum_{m=-1}^{\mu}(d-k)^{\mu-m}\binom{k-k+1}{m+1}&\quad(45)\\&=(d+1)\sum_{m=0}^{\mu}(d-k)^{\mu-m}\binom{k}{m}-\sum_{m=-1}^{\mu}(d-k)^{\mu-m}\binom{k+1}{m+1}-\left[(d-k+1)(d-k)^{\mu}\right]+\left[(d-k)^{\mu+1}+(d-k)^{\mu}\right]\\&=(d+1)\sum_{m=0}^{\mu}(d-k)^{\mu-m}\binom{k}{m}-\sum_{m=-1}^{\mu}(d-k)^{\mu-m}\binom{k}{m}-\sum_{m=-1}^{\mu}(d-k)^{\mu-m}\binom{k}{m+1}&\quad(46)\\&=d\sum_{m=0}^{\mu}(d-k)^{\mu-m}\binom{k}{m}-\sum_{m=0}^{\mu+1}(d-k)^{\mu+1-m}\binom{k}{m}\\&=\sum_{m=0}^{\mu}k\,(d-k)^{\mu-m}\binom{k}{m}-\binom{k}{\mu+1}.&\quad(47)\end{aligned}$$







Note that (43) holds since ℓm=0 for m>μ and $\left[\binom{d+1}{m+1}-\binom{d-k+1}{m+1}\right]=0$ for m<0; in (44) the techniques described herein used ℓm=pμ-m and some combinatorial manipulation to expand the coefficient, and (45) follows from four evaluations of Lemma 2, for (a, b)=(0,0), (a, b)=(1,1), (a, b)=(−k, 0), and (a, b)=(−k+1,1). The equality in (46) holds since the terms in the third summation in (45) are zero except for m=0, and similarly the terms in the fourth summation are zero except for m=−1,0. Finally, the second summation in (46) is absorbed into the first summation by noting that $\binom{k}{-1}=0$, and the third summation is rewritten after a change of variable, which yields (47).
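The chains of equalities above can be spot-checked numerically. The sketch below (illustrative, not part of the patent) evaluates the direct definitions (27), (28), and (32) and the closed forms (38), (42), and (47), and confirms they agree; for (d, k; μ)=(6, 4; 4) it returns (α, β, F)=(81, 27, 324), matching the example constructed next.

```python
from math import comb

def C(n, r):  # binomial coefficient that vanishes outside 0 <= r <= n
    return comb(n, r) if 0 <= r <= n else 0

def segment_counts(d, k, mu):
    # recursion (33): number of code segments of each mode
    ell = {mu: 1}
    for m in range(mu - 1, -1, -1):
        ell[m] = sum(ell[j] * (j - m - 1) * C(d - k + 1, j - m)
                     for j in range(m + 1, mu + 1))
    return ell

def check(d, k, mu):
    ell = segment_counts(d, k, mu)
    # direct definitions (27), (28), (32)
    alpha = sum(ell[m] * C(d, m) for m in range(mu + 1))
    beta = sum(ell[m] * C(d - 1, m - 1) for m in range(mu + 1))
    F = sum(ell[m] * m * (C(d + 1, m + 1) - C(d - k + 1, m + 1))
            for m in range(mu + 1))
    # closed forms (38), (42), (47)
    alpha_cf = sum((d - k)**(mu - m) * C(k, m) for m in range(mu + 1))
    beta_cf = sum((d - k)**(mu - m) * C(k - 1, m - 1) for m in range(mu + 1))
    F_cf = sum(k * (d - k)**(mu - m) * C(k, m) for m in range(mu + 1)) - C(k, mu + 1)
    assert (alpha, beta, F) == (alpha_cf, beta_cf, F_cf)
    return alpha, beta, F

for params in [(5, 3, 2), (7, 4, 3)]:
    check(*params)               # identity holds for other parameters as well
print(check(6, 4, 4))            # -> (81, 27, 324)
```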


This section is dedicated to constructing an example of the proposed codes, and showing its fundamental properties. The techniques described herein may demonstrate all the ingredients of the construction, including the encoder matrix, the code signature, the concatenation of code segments, the grouping of the symbols in the lower (d−k) rows, and the primary and secondary injections of v and w symbols.


Consider an (n, k=4, d=6; μ=4) code with parameters (α, β, F)=(81,27,324), as indicated by (3). Note that a code with parameters (α, β, F)=(81,27,324) is indeed an MSR code, since F=kα and β=α/(d−k+1), for which several code constructions are known. However, while existing codes are constructed for a given n (the number of nodes in the system), with the node contents and the parameters of the codes being functions of n, in this construction n is an arbitrary integer that only affects the field size (the field size may be greater than n). Nevertheless, this example is rich enough to demonstrate the techniques needed for the code construction as well as the formal proofs.


The first ingredient is an n×d encoder matrix with the properties in Definition 2. For the sake of illustration, the techniques described herein choose n=9 here, but the generalization of this encoder matrix to any integer n>d=6 is straightforward. The techniques described herein pick a Vandermonde matrix of size n×d=9×6. The underlying finite field may have at least n distinct elements. So, the techniques described herein pick 𝔽13, and all the arithmetic operations below are performed modulo 13.
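A sketch for generating this encoder matrix (illustrative, not part of the patent): each row is ψi=(1, i, i², . . . , i⁵) reduced modulo 13.

```python
# Vandermonde encoder matrix Psi (9 x 6) over GF(13): psi[i][j] = i^(j-1) mod 13
n, d, q = 9, 6, 13
psi = [[pow(i, j, q) for j in range(d)] for i in range(1, n + 1)]
for row in psi:
    print(row)
# e.g. the row for i = 2 is [1, 2, 4, 8, 3, 6]
```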







$$\Psi_{9\times 6}=\begin{bmatrix}\Psi_{1}\\\Psi_{2}\\\vdots\\\Psi_{9}\end{bmatrix}=\begin{bmatrix}\psi_{1,1}&\psi_{1,2}&\cdots&\psi_{1,6}\\\psi_{2,1}&\psi_{2,2}&\cdots&\psi_{2,6}\\\vdots&\vdots&&\vdots\\\psi_{9,1}&\psi_{9,2}&\cdots&\psi_{9,6}\end{bmatrix}=\begin{bmatrix}1&1&1&1&1&1\\1&2&4&8&3&6\\1&3&9&1&3&9\\1&4&3&12&9&10\\1&5&12&8&1&5\\1&6&10&8&9&2\\1&7&10&5&9&11\\1&8&12&5&1&8\\1&9&3&1&9&3\end{bmatrix}\pmod{13}.$$





In order to design the super-message matrix M, the techniques described herein start with an (n, k=6, d=6; m=4) signed determinant code. Throughout the construction, the techniques described herein may introduce child codes for each code segment until the techniques described herein complete all the injections, and provide a backup copy for every symbol in groups 𝒢3 and 𝒢4 of each segment message matrix. This leads to a concatenation of multiple signed determinant codes with various modes. The number of code segments at each mode can be found from (33), that is (ℓ4, ℓ3, ℓ2, ℓ1, ℓ0)=(1,0,3,2,9). Therefore, the resulting super-message matrix M can be obtained from cascading a total of Σm=0μ ℓm=15 code segments. For the sake of illustration the techniques described herein denote these code segments by T0, T1, . . . , T14, that is,

M=[T0 T1 . . . T14],  (48)


in which code segment Ti is the modified message matrix (after injection) of an (n, d, d; mi) signed determinant code with mode mi=mode(Ti), where

(m0,m1, . . . ,m14)=(4,2,2,2,1,1,0,0,0,0,0,0,0,0,0).


The techniques described herein also use the superscript ⟨i⟩ to distinguish the data symbols of the code segment Ti. The hierarchy of parent/child codes and symbol injection is shown in FIG. 7.



FIG. 7 is a conceptual diagram illustrating the hierarchical tree for an (n, k=4, d=6; μ=4) code. Each level on the tree shows codes with the same mode. The injections from the parent codes to the child codes are shown by arrows, labeled by the corresponding injection pair.


The first code segment is a (d=6; m=4) signed determinant code, with message matrix T0 and signature σT0(x)=0 for every x∈[d], i.e., σT0=(0,0,0,0,0,0). Note that no injection takes place into T0, and hence, 𝕋0=T0, i.e., 𝕋0 is a purely signed determinant message matrix of mode m0=4. The size of this matrix is







$$d\times\alpha_{m_0}=d\times\binom{d}{m_0}=6\times\binom{6}{4}=6\times 15.$$









This matrix is given in (49).

















$$\mathbb{T}_0=\begin{array}{c|cccccc}
 & \{1,2,3,4\} & \{1,2,3,5\} & \{1,2,3,6\} & \{1,2,4,5\} & \{1,2,4,6\} & \{1,2,5,6\}\\\hline
1 & v_{1,\{1,2,3,4\}} & v_{1,\{1,2,3,5\}} & v_{1,\{1,2,3,6\}} & v_{1,\{1,2,4,5\}} & v_{1,\{1,2,4,6\}} & v_{1,\{1,2,5,6\}}\\
2 & v_{2,\{1,2,3,4\}} & v_{2,\{1,2,3,5\}} & v_{2,\{1,2,3,6\}} & v_{2,\{1,2,4,5\}} & v_{2,\{1,2,4,6\}} & v_{2,\{1,2,5,6\}}\\
3 & v_{3,\{1,2,3,4\}} & v_{3,\{1,2,3,5\}} & v_{3,\{1,2,3,6\}} & w_{3,\{1,2,3,4,5\}} & w_{3,\{1,2,3,4,6\}} & w_{3,\{1,2,3,5,6\}}\\
4 & v_{4,\{1,2,3,4\}} & w_{4,\{1,2,3,4,5\}} & w_{4,\{1,2,3,4,6\}} & v_{4,\{1,2,4,5\}} & v_{4,\{1,2,4,6\}} & w_{4,\{1,2,4,5,6\}}\\\hline
5 & w_{5,\{1,2,3,4,5\}} & v_{5,\{1,2,3,5\}} & w_{5,\{1,2,3,5,6\}} & v_{5,\{1,2,4,5\}} & w_{5,\{1,2,4,5,6\}} & v_{5,\{1,2,5,6\}}\\
6 & w_{6,\{1,2,3,4,6\}} & w_{6,\{1,2,3,5,6\}} & v_{6,\{1,2,3,6\}} & w_{6,\{1,2,4,5,6\}} & v_{6,\{1,2,4,6\}} & v_{6,\{1,2,5,6\}}
\end{array}\tag{49}$$

$$\begin{array}{c|cccccc}
 & \{1,3,4,5\} & \{1,3,4,6\} & \{1,3,5,6\} & \{1,4,5,6\} & \{2,3,4,5\} & \{2,3,4,6\}\\\hline
1 & v_{1,\{1,3,4,5\}} & v_{1,\{1,3,4,6\}} & v_{1,\{1,3,5,6\}} & v_{1,\{1,4,5,6\}} & w_{1,\{1,2,3,4,5\}} & w_{1,\{1,2,3,4,6\}}\\
2 & w_{2,\{1,2,3,4,5\}} & w_{2,\{1,2,3,4,6\}} & w_{2,\{1,2,3,5,6\}} & v_{2,\{1,2,4,5,6\}} & v_{2,\{2,3,4,5\}} & v_{2,\{2,3,4,6\}}\\
3 & v_{3,\{1,3,4,5\}} & v_{3,\{1,3,4,6\}} & v_{3,\{1,3,5,6\}} & w_{3,\{1,3,4,5,6\}} & v_{3,\{2,3,4,5\}} & v_{3,\{2,3,4,6\}}\\
4 & v_{4,\{1,3,4,5\}} & v_{4,\{1,3,4,6\}} & w_{4,\{1,3,4,5,6\}} & v_{4,\{1,4,5,6\}} & v_{4,\{2,3,4,5\}} & v_{4,\{2,3,4,6\}}\\\hline
5 & v_{5,\{1,3,4,5\}} & w_{5,\{1,3,4,5,6\}} & v_{5,\{1,3,5,6\}} & w_{5,\{1,4,5,6\}} & v_{5,\{2,3,4,5\}} & w_{5,\{2,3,4,5,6\}}\\
6 & w_{6,\{1,3,4,5,6\}} & v_{6,\{1,3,4,6\}} & v_{6,\{1,3,5,6\}} & v_{6,\{1,4,5,6\}} & w_{6,\{2,3,4,5,6\}} & v_{6,\{2,3,4,6\}}
\end{array}$$

$$\begin{array}{c|ccc}
 & \{2,3,5,6\} & \{2,4,5,6\} & \{3,4,5,6\}\\\hline
1 & w_{1,\{1,2,3,5,6\}} & w_{1,\{1,2,4,5,6\}} & w_{1,\{1,3,4,5,6\}}\\
2 & v_{2,\{2,3,5,6\}} & v_{2,\{2,4,5,6\}} & w_{2,\{2,3,4,5,6\}}\\
3 & v_{3,\{2,3,5,6\}} & w_{3,\{2,3,4,5,6\}} & v_{3,\{3,4,5,6\}}\\
4 & w_{4,\{2,3,4,5,6\}} & v_{4,\{2,4,5,6\}} & v_{4,\{3,4,5,6\}}\\\hline
5 & v_{5,\{2,3,5,6\}} & v_{5,\{2,4,5,6\}} & v_{5,\{3,4,5,6\}}\\
6 & v_{6,\{2,3,5,6\}} & v_{6,\{2,4,5,6\}} & v_{6,\{3,4,5,6\}}
\end{array}$$

(every symbol in 𝕋0 carries the segment superscript ⟨0⟩, omitted above for compactness)




Note that the horizontal line in the matrix separates the top k=4 rows from the bottom (d−k)=6−4=2 rows. The top 4×15 sub-matrix is denoted by $\bar{T}_0$ and the bottom 2×15 sub-matrix is denoted by $\underline{T}_0$.


According to Definition 3, the symbols in T0 can be grouped as follows.













$$\mathcal{G}_1(T_0)=\varnothing,$$

$$\mathcal{G}_2(T_0)=\left\{w^{\langle 0\rangle}_{5,\{1,2,3,4,5\}},\ w^{\langle 0\rangle}_{6,\{1,2,3,4,6\}},\ w^{\langle 0\rangle}_{6,\{1,2,3,5,6\}},\ w^{\langle 0\rangle}_{6,\{1,2,4,5,6\}},\ w^{\langle 0\rangle}_{6,\{1,3,4,5,6\}},\ w^{\langle 0\rangle}_{6,\{2,3,4,5,6\}}\right\},$$

$$\mathcal{G}_3(T_0)=\left\{v^{\langle 0\rangle}_{5,\{1,2,3,5\}},\ v^{\langle 0\rangle}_{5,\{1,2,4,5\}},\ v^{\langle 0\rangle}_{5,\{1,2,5,6\}},\ v^{\langle 0\rangle}_{5,\{1,3,4,5\}},\ v^{\langle 0\rangle}_{5,\{1,3,5,6\}},\ v^{\langle 0\rangle}_{5,\{1,4,5,6\}},\ v^{\langle 0\rangle}_{5,\{2,3,4,5\}},\ v^{\langle 0\rangle}_{5,\{2,3,5,6\}},\ v^{\langle 0\rangle}_{5,\{2,4,5,6\}},\ v^{\langle 0\rangle}_{5,\{3,4,5,6\}},\right.$$
$$\left.\qquad v^{\langle 0\rangle}_{6,\{1,2,3,6\}},\ v^{\langle 0\rangle}_{6,\{1,2,4,6\}},\ v^{\langle 0\rangle}_{6,\{1,2,5,6\}},\ v^{\langle 0\rangle}_{6,\{1,3,4,6\}},\ v^{\langle 0\rangle}_{6,\{1,3,5,6\}},\ v^{\langle 0\rangle}_{6,\{1,4,5,6\}},\ v^{\langle 0\rangle}_{6,\{2,3,4,6\}},\ v^{\langle 0\rangle}_{6,\{2,3,5,6\}},\ v^{\langle 0\rangle}_{6,\{2,4,5,6\}},\ v^{\langle 0\rangle}_{6,\{3,4,5,6\}}\right\},$$

$$\mathcal{G}_4(T_0)=\left\{w^{\langle 0\rangle}_{5,\{1,2,3,5,6\}},\ w^{\langle 0\rangle}_{5,\{1,2,4,5,6\}},\ w^{\langle 0\rangle}_{5,\{1,3,4,5,6\}},\ w^{\langle 0\rangle}_{5,\{2,3,4,5,6\}}\right\}.$$






Since 𝒢1(T0)=ø, no symbol may be set to zero. Symbols in 𝒢2(T0) can be recovered from the parity equations. For instance, for w5,{1,2,3,4,5}<0>∈𝒢2(T0), (7) gives −w1,{1,2,3,4,5}<0>+w2,{1,2,3,4,5}<0>−w3,{1,2,3,4,5}<0>+w4,{1,2,3,4,5}<0>−w5,{1,2,3,4,5}<0>=0,


in which all the symbols except w5,{1,2,3,4,5}<0> are located in the top k=4 rows of the matrix T0.


The symbols in groups 𝒢3(T0) and 𝒢4(T0) are marked in boxes in (49), to indicate that they need to be injected into other code segments with lower modes. Moreover, two groups of injections are designated by colored boxes, namely, the symbols in a first group of boxes may be injected into T2 and those in a second group of boxes may be injected into T5. The details of the injections are discussed in the following.


Consider symbol v6,{1,2,5,6}<0>∈𝒢3(T0) with x=6 and 𝒳={1,2,5,6}, which implies $\bar{\mathcal{X}}$={1,2} and $\underline{\mathcal{X}}$={5,6}. According to (20), this symbol is primarily injected into a code of mode m=|$\underline{\mathcal{X}}$|−1=1, with injection pair (x, $\underline{\mathcal{X}}$)=(6, {5,6}). The row of the injection may be max $\bar{\mathcal{X}}$=max{1,2}=2 and its column is $\bar{\mathcal{X}}$\{max $\bar{\mathcal{X}}$}={1,2}\{2}={1}. Finally, the sign of the injection is given by








$$(-1)^{1+\sigma_{T_0}(\max\bar{\mathcal{X}})+\mathrm{ind}_{\mathcal{X}}(\max\bar{\mathcal{X}})}=(-1)^{1+0+2}=(-1)^{3}=-1.$$







This is formally written as









$$[T_0]_{x,\mathcal{X}}=v^{\langle 0\rangle}_{6,\{1,2,5,6\}}\ \xrightarrow{\ \times(-1)\ }\ \left[T_5^{\,6,\{5,6\}}(T_0)\right]_{2,\{1\}}=(-1)^{\sigma_{T_5}(2)}\,w_{2,\{1,2\}}(6,\{5,6\},T_0)=(-1)^{\sigma_{T_5}(2)}\,w^{\langle 5\rangle}_{2,\{1,2\}},$$




that is, −v6,{1,2,5,6}<0> may be added to the symbol at position (2, {1}) of the child code. This child code is originally a signed determinant code denoted by







$$T_5^{\,6,\{5,6\}}(T_0),$$





into which (some of) the symbols of T0 may be injected. Note that







$$T_5=T_5^{\,6,\{5,6\}}(T_0)$$

is the message matrix for a signed determinant code of mode 1 with the injection pair (6,{5,6}), and hence has









(



d




m



)

=


(



6




1



)

=
6







columns

,





and its columns are labeled with subsets of {1,2,3,4,5,6} of size 1. As a short-hand notation, the techniques described herein denote this message matrix before injection by T5 and after injection by 𝕋5, as indicated in FIG. 7. Then







$$\mathbb{T}_5=T_5+\Delta^{6,\{5,6\}}(T_0).$$







The signature of this child code can be found from (17) as

σT5=(2,2,2,2,2,3).  (50)


In summary, the primary injection of this symbol is given by

[T5]2,{1}=w2,{1,2}<5>−v6,{1,2,5,6}<0>.
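The placement and sign of this primary injection can be reproduced programmatically; the following sketch is illustrative (it assumes ind𝒳(y) is the 1-based position of y in the sorted set 𝒳, and uses σT0≡0 from the root signature), and it recovers row 2, column {1}, and sign −1.

```python
def ind(S, y):
    # 1-based position of y within the sorted set S
    return sorted(S).index(y) + 1

k = 4
x, X = 6, {1, 2, 5, 6}
X_bar = {e for e in X if e <= k}      # {1, 2}
X_under = {e for e in X if e > k}     # {5, 6}: injection pair is (x, {5, 6})
row = max(X_bar)                      # row of the primary injection
col = X_bar - {row}                   # column label of the injection
sigma_T0 = 0                          # the root code signature is all-zero
sign = (-1)**(1 + sigma_T0 + ind(X, row))
print(row, col, sign)                 # -> 2 {1} -1
```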


The complete form of matrices $T_5^{\,6,\{5,6\}}(T_0)$ and $\Delta^{6,\{5,6\}}(T_0)$ are given in (51) and (52).

















$$\mathbb{T}_5=\begin{array}{c|cccccc}
 & \{1\} & \{2\} & \{3\} & \{4\} & \{5\} & \{6\}\\\hline
1 & v_{1,\{1\}} & w_{1,\{1,2\}} & w_{1,\{1,3\}} & w_{1,\{1,4\}} & w_{1,\{1,5\}} & w_{1,\{1,6\}}\\
2 & w_{2,\{1,2\}} & v_{2,\{2\}} & w_{2,\{2,3\}} & w_{2,\{2,4\}} & w_{2,\{2,5\}} & w_{2,\{2,6\}}\\
3 & w_{3,\{1,3\}} & w_{3,\{2,3\}} & v_{3,\{3\}} & w_{3,\{3,4\}} & w_{3,\{3,5\}} & w_{3,\{3,6\}}\\
4 & w_{4,\{1,4\}} & w_{4,\{2,4\}} & w_{4,\{3,4\}} & v_{4,\{4\}} & w_{4,\{4,5\}} & w_{4,\{4,6\}}\\\hline
5 & w_{5,\{1,5\}} & w_{5,\{2,5\}} & w_{5,\{3,5\}} & w_{5,\{4,5\}} & v_{5,\{5\}}=0 & w_{5,\{5,6\}}=0\\
6 & -w_{6,\{1,6\}} & -w_{6,\{2,6\}} & -w_{6,\{3,6\}} & -w_{6,\{4,6\}} & -w_{6,\{5,6\}}=0 & -v_{6,\{6\}}=0
\end{array}\tag{51}$$

(every symbol in 𝕋5 carries the segment superscript ⟨5⟩, omitted above for compactness)




















$$\Delta^{6,\{5,6\}}(\mathbb{T}_0)=\begin{array}{c|cccccc}
 & \{1\} & \{2\} & \{3\} & \{4\} & \{5\} & \{6\}\\\hline
1 & 0 & 0 & 0 & 0 & 0 & 0\\
2 & -v^{\langle 0\rangle}_{6,\{1,2,5,6\}} & 0 & 0 & 0 & 0 & 0\\
3 & -v^{\langle 0\rangle}_{6,\{1,3,5,6\}} & -v^{\langle 0\rangle}_{6,\{2,3,5,6\}} & 0 & 0 & 0 & 0\\
4 & -v^{\langle 0\rangle}_{6,\{1,4,5,6\}} & -v^{\langle 0\rangle}_{6,\{2,4,5,6\}} & -v^{\langle 0\rangle}_{6,\{3,4,5,6\}} & 0 & 0 & 0\\
5 & 0 & 0 & 0 & 0 & 0 & 0\\
6 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}\tag{52}$$







As is clear from (52), the group of symbols {v6,{1,2,5,6}<0>, v6,{1,3,5,6}<0>, v6,{1,4,5,6}<0>, v6,{2,3,5,6}<0>, v6,{2,4,5,6}<0>, v6,{3,4,5,6}<0>} of 𝒢3(T0) may all be primarily injected into the symbols of T5. Note that all the primary injections are performed in $\bar{T}_5$. Similarly, the symbols {v5,{1,2,5,6}<0>, v5,{1,3,5,6}<0>, v5,{1,4,5,6}<0>, v5,{2,3,5,6}<0>, v5,{2,4,5,6}<0>, v5,{3,4,5,6}<0>} may be injected into








$$T_4=T_4^{\,5,\{5,6\}}(T_0),$$





another child code of T0 with injection pair (5, {5,6}).


Note that the entries in the last row of $T_5^{\,6,\{5,6\}}(T_0)$ in (51) have a negative sign, due to (50). Moreover, from Definition 3,









$$\mathcal{G}_1(T_5)=\left\{v^{\langle 5\rangle}_{5,\{5\}},\ -v^{\langle 5\rangle}_{6,\{6\}},\ w^{\langle 5\rangle}_{5,\{5,6\}},\ -w^{\langle 5\rangle}_{6,\{5,6\}}\right\},$$

$$\mathcal{G}_2(T_5)=\left\{w^{\langle 5\rangle}_{5,\{1,5\}},\ w^{\langle 5\rangle}_{5,\{2,5\}},\ w^{\langle 5\rangle}_{5,\{3,5\}},\ w^{\langle 5\rangle}_{5,\{4,5\}},\ -w^{\langle 5\rangle}_{6,\{1,6\}},\ -w^{\langle 5\rangle}_{6,\{2,6\}},\ -w^{\langle 5\rangle}_{6,\{3,6\}},\ -w^{\langle 5\rangle}_{6,\{4,6\}}\right\},$$

$$\mathcal{G}_3(T_5)=\mathcal{G}_4(T_5)=\varnothing.$$








The symbols in 𝒢1(T5) are marked in (51), and set to zero. It is worth mentioning that since 𝒢3(T5)=𝒢4(T5)=ø, no further injection from T5 is needed, and hence T5 does not have any child (see FIG. 7).


Besides its primary injection into T5, the symbols in 𝒢3(T0) are also subject to secondary injection(s). Let us consider v6,{1,2,5,6}<0>. According to (21), a secondary injection of v6,{1,2,5,6}<0> is determined by an integer y∈$\underline{\mathcal{X}}$={5,6} and a pair of subsets ($\mathcal{Z}$, $\mathcal{Z}'$):


For y=5, $\underline{\mathcal{X}}\cap[k+1:y-1]=\underline{\mathcal{X}}\cap[5:4]=\varnothing$. Hence $\mathcal{Z}=\mathcal{Z}'=\varnothing$, and $x=6\leq\max((\underline{\mathcal{X}}\cap[y+1:d])\cup\mathcal{Z})=\max((\underline{\mathcal{X}}\cap[6:6])\cup\varnothing)=6$ is satisfied. Hence this injection is needed.


If y=6, then $\underline{\mathcal{X}}\cap[k+1:y-1]=\underline{\mathcal{X}}\cap[5:5]=\{5\}$. Hence, either ($\mathcal{Z}$, $\mathcal{Z}'$)=(ø, {5}) or ($\mathcal{Z}$, $\mathcal{Z}'$)=({5}, ø). However, since $\underline{\mathcal{X}}\cap[y+1:d]=\underline{\mathcal{X}}\cap[7:5]=\varnothing$, the condition $x=6\leq\max((\underline{\mathcal{X}}\cap[y+1:d])\cup\mathcal{Z})$ cannot be satisfied. Thus, there is no injection for y=6.


For the only secondary injection satisfying the conditions of (21), i.e., y=5 and $\mathcal{Z}=\mathcal{Z}'=\varnothing$, the injection pair is given by

$$(x,\ (\underline{\mathcal{X}}\cap[y+1:d])\cup\mathcal{Z})=(x,\ (\underline{\mathcal{X}}\cap[6:6])\cup\varnothing)=(6,\{6\}),$$


and the mode of the child code is

m=|$\bar{\mathcal{X}}\cup\mathcal{Z}'$|=|{1,2}∪ø|=2.


This code is denoted by T2 in FIG. 7. Hence,








$$[T_0]_{x,\mathcal{X}}=v^{\langle 0\rangle}_{6,\{1,2,5,6\}}\ \xrightarrow{\ \times(-1)^{1+\sigma_{T_0}(5)+\mathrm{ind}_{\{1,2,5,6\}}(5)}\ }\ \left[T_2^{\,6,\{6\}}(T_0)\right]_{5,\{1,2\}}=(-1)^{\sigma_{T_2}(5)}\,w_{5,\{1,2,5\}}(6,\{6\},T_0).$$







Recall that T2 is a code of mode $m_2=2$ with $\binom{d}{m_2}=\binom{6}{2}=15$ columns, each labeled by a subset of {1, . . . , 6} of size 2. The signature of T2 can be obtained from (17) as

σT2(i)=1+σT0(i)+ind{6}∪{i}(i)=2, ∀i∈{1,2,3,4,5,6}.
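Both signatures computed in this example follow the same pattern; the sketch below is an illustration (it assumes (17) takes the form σchild(i)=1+σparent(i)+ind𝒳̲∪{i}(i), as used in the two instances in this example) and reproduces (50) as well as the all-2 signature of T2.

```python
def ind(S, y):
    # 1-based position of y within the sorted set S
    return sorted(S).index(y) + 1

def child_signature(parent_sig, X_under, d):
    # assumed form of (17): sigma_child(i) = 1 + sigma_parent(i) + ind_{X_under U {i}}(i)
    return tuple(1 + parent_sig[i - 1] + ind(set(X_under) | {i}, i)
                 for i in range(1, d + 1))

sigma_T0 = (0, 0, 0, 0, 0, 0)                 # root signature
print(child_signature(sigma_T0, {5, 6}, 6))   # -> (2, 2, 2, 2, 2, 3), i.e. (50)
print(child_signature(sigma_T0, {6}, 6))      # -> (2, 2, 2, 2, 2, 2)
```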


The complete representation of the message matrix $T_2=T_2^{\,6,\{6\}}(T_0)$ is given in (53).

















$$\mathbb{T}_2=\begin{array}{c|cccccc}
 & \{1,2\} & \{1,3\} & \{1,4\} & \{1,5\} & \{1,6\} & \{2,3\}\\\hline
1 & v_{1,\{1,2\}} & v_{1,\{1,3\}} & v_{1,\{1,4\}} & v_{1,\{1,5\}} & v_{1,\{1,6\}} & w_{1,\{1,2,3\}}\\
2 & v_{2,\{1,2\}} & w_{2,\{1,2,3\}} & w_{2,\{1,2,4\}} & w_{2,\{1,2,5\}} & w_{2,\{1,2,6\}} & v_{2,\{2,3\}}\\
3 & w_{3,\{1,2,3\}} & v_{3,\{1,3\}} & w_{3,\{1,3,4\}} & w_{3,\{1,3,5\}} & w_{3,\{1,3,6\}} & v_{3,\{2,3\}}\\
4 & w_{4,\{1,2,4\}} & w_{4,\{1,3,4\}} & v_{4,\{1,4\}} & w_{4,\{1,4,5\}} & w_{4,\{1,4,6\}} & w_{4,\{2,3,4\}}\\\hline
5 & w_{5,\{1,2,5\}} & w_{5,\{1,3,5\}} & w_{5,\{1,4,5\}} & v_{5,\{1,5\}} & w_{5,\{1,5,6\}} & w_{5,\{2,3,5\}}\\
6 & w_{6,\{1,2,6\}} & w_{6,\{1,3,6\}} & w_{6,\{1,4,6\}} & w_{6,\{1,5,6\}} & v_{6,\{1,6\}} & w_{6,\{2,3,6\}}
\end{array}\tag{53}$$

$$\begin{array}{c|cccccc}
 & \{2,4\} & \{2,5\} & \{2,6\} & \{3,4\} & \{3,5\} & \{3,6\}\\\hline
1 & w_{1,\{1,2,4\}} & w_{1,\{1,2,5\}} & w_{1,\{1,2,6\}} & w_{1,\{1,3,4\}} & w_{1,\{1,3,5\}} & w_{1,\{1,3,6\}}\\
2 & v_{2,\{2,4\}} & v_{2,\{2,5\}} & v_{2,\{2,6\}} & w_{2,\{2,3,4\}} & w_{2,\{2,3,5\}} & w_{2,\{2,3,6\}}\\
3 & w_{3,\{2,3,4\}} & w_{3,\{2,3,5\}} & w_{3,\{2,3,6\}} & v_{3,\{3,4\}} & v_{3,\{3,5\}} & v_{3,\{3,6\}}\\
4 & v_{4,\{2,4\}} & w_{4,\{2,4,5\}} & w_{4,\{2,4,6\}} & v_{4,\{3,4\}} & w_{4,\{3,4,5\}} & w_{4,\{3,4,6\}}\\\hline
5 & w_{5,\{2,4,5\}} & v_{5,\{2,5\}} & w_{5,\{2,5,6\}} & w_{5,\{3,4,5\}} & v_{5,\{3,5\}} & w_{5,\{3,5,6\}}\\
6 & w_{6,\{2,4,6\}} & w_{6,\{2,5,6\}} & v_{6,\{2,6\}} & w_{6,\{3,4,6\}} & w_{6,\{3,5,6\}} & v_{6,\{3,6\}}
\end{array}$$

$$\begin{array}{c|ccc}
 & \{4,5\} & \{4,6\} & \{5,6\}\\\hline
1 & w_{1,\{1,4,5\}} & w_{1,\{1,4,6\}} & w_{1,\{1,5,6\}}\\
2 & w_{2,\{2,4,5\}} & w_{2,\{2,4,6\}} & w_{2,\{2,5,6\}}\\
3 & w_{3,\{3,4,5\}} & w_{3,\{3,4,6\}} & w_{3,\{3,5,6\}}\\
4 & v_{4,\{4,5\}} & v_{4,\{4,6\}} & w_{4,\{4,5,6\}}\\\hline
5 & v_{5,\{4,5\}} & w_{5,\{4,5,6\}} & v_{5,\{5,6\}}=0\\
6 & w_{6,\{4,5,6\}} & v_{6,\{4,6\}} & v_{6,\{5,6\}}=0
\end{array}$$

(every symbol in 𝕋2 carries the segment superscript ⟨2⟩, omitted above for compactness)




The modified version of this signed determinant code is given by

$$\mathbb{T}_2=\mathbb{T}_2^{\,6,\{6\}}(T_0)=T_2^{\,6,\{6\}}(T_0)+\Delta^{6,\{6\}}(T_0).$$







In summary, the secondary injection of v6,{1,2,5,6}<0> is given by

[T2]5,{1,2}=w5,{1,2,5}(6,{6},T0)+v6,{1,2,5,6}<0>=w5,{1,2,5}<2>+v6,{1,2,5,6}<0>.


It is very important to mention that the child code T2 is initially introduced for the primary injection of symbols

{v6,{1,2,3,6}<0>,v6,{1,2,4,6}<0>,v6,{1,3,4,6}<0>,v6,{2,3,4,6}<0>}⊆𝒢3(T0),


while it is also used for the secondary injection of some other symbols, including v6,{1,2,5,6}<0>. For instance, as follows from (20), the primary injection of v6,{1,2,3,6}<0> with












$x=6$, $\bar{\mathcal{X}}=\{1,2,3\}$ and $\underline{\mathcal{X}}=\{6\}$ is given by

$$[T_0]_{x,\mathcal{X}}=v^{\langle 0\rangle}_{6,\{1,2,3,6\}}\ \xrightarrow{\ \times(-1)^{1+\sigma_{T_0}(3)+\mathrm{ind}_{\{1,2,3,6\}}(3)}\ }\ \left[T_2^{\,6,\{6\}}(T_0)\right]_{3,\{1,2\}}=(-1)^{\sigma_{T_2}(3)}\,w_{3,\{1,2,3\}}(6,\{6\},T_0),$$





which is performed into the same child code T2. The matrix representing all the injections into T2 is given in (54). Note that all the primary injections are performed into the top part of T2, and the secondary injections take place in the bottom part of T2.


















        {1,2}              {1,3}              {1,4}              {1,5}  {1,6}  {2,3}              {2,4}              {2,5}  {2,6}  {3,4}
Δ6,{6}→𝕋0 =
1  [  0                  0                  0                  0      0      0                  0                  0      0      0
2     0                  0                  0                  0      0      0                  0                  0      0      0
3     v6,{1,2,3,6}<0>    0                  0                  0      0      0                  0                  0      0      0
4     v6,{1,2,4,6}<0>    v6,{1,3,4,6}<0>    0                  0      0      v6,{2,3,4,6}<0>    0                  0      0      0
5     v6,{1,2,5,6}<0>    v6,{1,3,5,6}<0>    v6,{1,4,5,6}<0>    0      0      v6,{2,3,5,6}<0>    v6,{2,4,5,6}<0>    0      0      v6,{3,4,5,6}<0>
6     0                  0                  0                  0      0      0                  0                  0      0      0

        {3,5}  {3,6}  {4,5}  {4,6}  {5,6}
1     0      0      0      0      0
2     0      0      0      0      0
3     0      0      0      0      0
4     0      0      0      0      0
5     0      0      0      0      0
6     0      0      0      0      0  ]    (54)




Finally, according to Definition 3, the symbols in T2 can be partitioned to

𝒢1(T2) = {v5,{5,6}<2>, v6,{5,6}<2>},

𝒢2(T2) = {w5,{1,2,5}<2>, w5,{1,3,5}<2>, w5,{1,4,5}<2>, w5,{2,3,5}<2>, w5,{2,4,5}<2>, w5,{3,4,5}<2>, w6,{1,2,6}<2>, w6,{1,3,6}<2>, w6,{1,4,6}<2>, w6,{1,5,6}<2>, w6,{2,3,6}<2>, w6,{2,4,6}<2>, w6,{2,5,6}<2>, w6,{3,4,6}<2>, w6,{3,5,6}<2>, w6,{4,5,6}<2>},

𝒢3(T2) = {v5,{1,5}<2>, v5,{2,5}<2>, v5,{3,5}<2>, v5,{4,5}<2>, v6,{1,6}<2>, v6,{2,6}<2>, v6,{3,6}<2>, v6,{4,6}<2>},

𝒢4(T2) = {w5,{1,5,6}<2>, w5,{2,5,6}<2>, w5,{3,5,6}<2>, w5,{4,5,6}<2>}.    (55)







The code segment T2 has three child codes, namely, T9, T10, and T11. For instance, according to (22), the primary injection of w5,{2,5,6}<2>∈𝒢4(T2) with x=5 and 𝒳={2,6} takes place into a code of mode 0 with injection pair (5, {6}). This child code is called T11 in FIG. 7. Therefore,








[T2]5,{2,6} = (−1)^(σT2(5)) w5,{2,5,6}<2> × (−1)^(1+σT2(2)+ind{2,6}(2)) → [T11(5,{6})→T2]2,ϕ = (−1)^(σT11(2)) w2,ϕ(5,{6},T2) = 0.






Recall that for the code segment T11 with mode(T11)=0, all the entries in the message matrix (before injection) are zero, as given in (56). Therefore the modified code is obtained by







T11 = 𝕋11 + Δ5,{6}→T2 = Δ5,{6}→T2.

The injection matrix Δ5,{6}→T2 is also given in (56).
































𝕋11 = [0, 0, 0, 0, 0, 0]ᵀ,   Δ5,{6}→𝕋2 = [w5,{1,5,6}<2>, w5,{2,5,6}<2>, w5,{3,5,6}<2>, w5,{4,5,6}<2>, 0, 0]ᵀ,    (56)

where both are column vectors with rows indexed by 1, . . . , 6.







The super-message matrix M may be obtained by concatenating all the code segments, as given in (48). Finally, the code may be obtained by multiplying this super-message matrix by the encoder matrix Ψ, i.e., 𝒞=Ψ·M, and the i-th row of 𝒞 may be stored in node i.
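As a hedged numerical illustration of this concatenate-and-encode step (the segment contents, sizes, and the Vandermonde stand-in for Ψ are assumptions, not the patent's construction):

```python
import numpy as np

# Hedged sketch (stand-in sizes and contents, not the patent's construction):
# concatenate code segments into the super-message matrix M, encode it as
# C = Psi . M, and store row i of C on node i.
d, n = 3, 5
segments = [np.ones((d, 2)), 2 * np.ones((d, 1)), 3 * np.ones((d, 3))]
M = np.hstack(segments)                      # super-message matrix, d x 6
Psi = np.vander(np.arange(1.0, n + 1), d, increasing=True)  # stand-in n x d encoder
C = Psi @ M                                  # codeword matrix, n x 6
node_content = C[2]                          # the row stored on the third node
print(M.shape, C.shape)                      # (3, 6) (5, 6)
```

Each node then holds one coded row whose length equals the total number of columns across all segments.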


Consider the data recovery from an arbitrary subset of k=4 nodes, say nodes in 𝒦={3,5,7,8}. By downloading and stacking the content of nodes in 𝒦, the techniques described herein obtain Ψ[𝒦, :]·M. Here Ψ[𝒦, :] is the k×d sub-matrix of Ψ consisting of rows indexed by i∈𝒦. Unlike the (n,k=d,d) signed determinant codes, this matrix is not square, and hence is not invertible. So, the techniques described herein cannot simply invert it to retrieve M from Ψ[𝒦, :]·M. Nevertheless, using properties of the encoder matrix in Definition 2, Ψ[𝒦, :] can be decomposed as Ψ[𝒦, :]=[Γ[𝒦, :]|Υ[𝒦, :]], where Γ[𝒦, :] is an invertible matrix. Multiplying the Ψ[𝒦, :]·M matrix by Γ[𝒦, :]−1,















Γ[𝒦,:]−1·(Ψ[𝒦,:]·M) = Γ[𝒦,:]−1·[Γ[𝒦,:] | Υ[𝒦,:]]·M

= [Ik×k | Γ[𝒦,:]−1·Υ[𝒦,:]] · [T̄0; T̲0 | T̄1; T̲1 | … | T̄14; T̲14]

= [T̄0+Γ[𝒦,:]−1·Υ[𝒦,:]·T̲0 | T̄1+Γ[𝒦,:]−1·Υ[𝒦,:]·T̲1 | … | T̄14+Γ[𝒦,:]−1·Υ[𝒦,:]·T̲14].    (57)







The ultimate goal is to recover all entries of M from this equation.
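The cancellation underlying (57) can be sketched numerically. This is a hedged, illustrative sketch with stand-in matrices (Psi, T, and K are hypothetical names, not the patent's objects): once the bottom part of a segment is known, the top part follows by subtracting the interference term.

```python
import numpy as np

# Hedged sketch: with the accessed rows split as [Gamma | Upsilon], the data
# collector sees Gamma.T_top + Upsilon.T_bottom per segment; once T_bottom is
# known (e.g., zero for mode-0 segments), T_top follows by cancellation.
k, d = 2, 3
Psi = np.vander([1.0, 2.0, 3.0], d, increasing=True)  # stand-in encoder rows
T = np.arange(12, dtype=float).reshape(d, 4)          # one message segment

K = [0, 2]                                   # the k accessed nodes
Psi_K = Psi[K, :]                            # k x d sub-matrix
Gamma, Upsilon = Psi_K[:, :k], Psi_K[:, k:]  # decomposition [Gamma | Upsilon]

observed = Psi_K @ T                         # downloaded coded symbols
lhs = np.linalg.solve(Gamma, observed)       # = T_top + Gamma^{-1} Upsilon T_bottom
T_top = lhs - np.linalg.solve(Gamma, Upsilon @ T[k:, :])
print(np.allclose(T_top, T[:k, :]))          # True
```

The same subtraction is applied column by column in the general procedure, starting from segments whose bottom parts are identically zero.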


The techniques described herein start with code segments at the lower level of the injection tree in FIG. 7, i.e., code segments with mode zero, namely, T6, . . . , T14. Note that T̲6=T̲7= . . . =T̲14=0, since no injection takes place in the bottom part of message matrices of mode m=0 (for instance, see Δ5,{6}→T2 in (56)). Therefore,

T̄i+Γ[𝒦,:]−1·Υ[𝒦,:]·T̲i=T̄i+Γ[𝒦,:]−1·Υ[𝒦,:]·0=T̄i,

for i=6, . . . ,14, which implies the top parts T̄i for i=6, . . . ,14 can be retrieved from (57).


Next, the techniques described herein decode the message matrix of code segments with mode 1. Without loss of generality, consider T5. Recall from (51) and (52) that [T5]5,{5}=[T5]5,{6}=[T5]6,{5}=[T5]6,{6}=0. That is, the bottom parts of columns {5} and {6} of T5 are zero, and hence, the top parts of these columns can be recovered from (57). Then the techniques described herein can use the parity equation (7) to recover the remaining symbols in T5: w5,{1,5}<5>=w1,{1,5}<5>, w5,{2,5}<5>=w2,{2,5}<5>, etc. Once T̲5 is fully decoded, the techniques described herein can remove its contribution from T̄5+Γ[𝒦, :]−1·Υ[𝒦, :]·T̲5, and recover the entries of T̄5. Note that T4 can be decoded in a similar fashion.


The next step consists of decoding message matrices of code segments with mode 2, i.e., T1, T2, and T3. Here the techniques described herein focus on decoding T2 for the sake of illustration. First, recall the partitioning of symbols in T2 given in (55), and the fact that symbols in 𝒢1(T2) are zero. Moreover, symbols in 𝒢3(T2)∪𝒢4(T2) are injected into code segments T9, T10, and T11, and can be recovered from the decoding of the lower mode codes. Therefore, the bottom parts of columns {1,6}, {2,6}, {3,6}, {4,6}, and {5,6} are known. From the bottom part and the collected information for data recovery T̄2+Γ[𝒦, :]−1·Υ[𝒦, :]·T̲2 the techniques described herein can also decode the top part of these columns. The techniques described herein scan the columns of T2 (in (53) and (54)) from right to left, to find the first not-yet-decoded column of T2, that is {4,5}. Note that the entry of T2 at position (6, {4,5}) is given by








[T2]6,{4,5} = [T2(6,{6})→T0]6,{4,5} = [𝕋2(6,{6})→T0]6,{4,5} + [Δ6,{6}→T0]6,{4,5} = (−1)^(σT2(6)) w6,{4,5,6}<2> + 0.








So, it suffices to find w6,{4,5,6}<2>, which can be obtained from the parity equation







w6,{4,5,6}<2> = −w4,{4,5,6}<2> + w5,{4,5,6}<2> = −(−1)^(σT2(4))[T2]4,{5,6} + (−1)^(σT2(5))[T2]5,{4,6}.







Note that the latter symbols are located in columns {5,6} and {4,6}, which are already decoded. After decoding [T2]6,{4,5}, the entire column {4,5} can be recovered.
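This kind of parity-based recovery can be sketched in code. This is a hedged illustration: the alternating sign pattern below is an assumption standing in for the (−1)^σ factors of the patent's parity equation (7), and recover_last is a hypothetical helper, not a name from the patent.

```python
# Hedged sketch of recovering a missing w-symbol from its parity group,
# assuming the group satisfies the signed zero-sum  sum_i (-1)**i * w_i = 0
# (a stand-in for the (-1)^sigma factors of parity equation (7)).
def recover_last(symbols):
    """Given all but the last symbol of the parity group, return the last one."""
    partial = sum((-1) ** i * w for i, w in enumerate(symbols))
    # the missing symbol sits at index len(symbols); solve the zero-sum for it
    return partial * (-1) ** (len(symbols) + 1)

# Example: a group {w4, w5, w6} with w4 - w5 + w6 = 0, so w6 = -w4 + w5.
w4, w5 = 3.0, 7.0
w6 = recover_last([w4, w5])
print(w6)  # 4.0
```

The decoder applies such an identity only when every other symbol of the group lies in an already-decoded column.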


Then the techniques described herein proceed with the recovery of not-yet-decoded columns. A similar procedure may be applied to column {3,5}. However, the decoding of column {3,4} is more challenging, since the techniques described herein have two not-yet-decoded symbols, namely,

[T2]5,{3,4}=w5,{3,4,5}<2>+v6,{3,4,5,6}<0>,  (58)
[T2]6,{3,4}=w6,{3,4,6}<2>.  (59)


First note that w6,{3,4,6}<2> can be found from the parity equation









w6,{3,4,6}<2> = −w3,{3,4,6}<2> + w4,{3,4,6}<2> = −(−1)^(σT2(3))[T2]3,{4,6} + (−1)^(σT2(4))[T2]4,{3,6},




since columns {4,6} and {3,6} are already decoded. The first term in (58) can be also found in a similar manner.











w5,{3,4,5}<2> = −w3,{3,4,5}<2> + w4,{3,4,5}<2> = −(−1)^(σT2(3))[T2]3,{4,5} + (−1)^(σT2(4))[T2]4,{3,5},    (60)







where the latter symbols are located in columns {4,5} and {3,5}, and hence already decoded.


In order to find the second term in (58), one may note that the injection of v6,{3,4,5,6}<0> is secondary, and the same symbol is also primarily injected into position (4, {3}) of code segment T5 (see (52)), i.e.,

[T5]4,{3}=w4,{3,4}<5>−v6,{3,4,5,6}<0>.  (61)


Recall that mode(T5)=1<2=mode(T2), and hence T5 is already decoded. Hence,










w4,{3,4}<5> = w3,{3,4}<5> = (−1)^(σT5(3))[T5]3,{4}.    (62)







Therefore, the techniques described herein can retrieve w4,{3,4}<5> from (62), and plug it into (61) to obtain v6,{3,4,5,6}<0>. Combining this with (60), the techniques described herein can decode both [T2]5,{3,4} and [T2]6,{3,4}. Upon decoding the bottom part of column {3,4}, its top part can be recovered as before. Repeating a similar procedure, the techniques described herein can decode all the columns of T2. Once all three code segments of mode 2, i.e., T1, T2, and T3, are decoded, the techniques described herein can proceed to the recovery of the root code segment T0.


Assume a node f fails, and its codeword needs to be repaired using the repair data received from a set of helper nodes, say ℋ with |ℋ|=d. The repair data sent by each helper node is simply the collection of the repair data sent for each code segment. Here each helper node h∈ℋ sends ∪i=0,…,14 {ψh·Ti·Ξf,(mi)} to node f, where mi is the mode of code Ti, e.g., m0=4.


The content of node f may be reconstructed segment by segment. In contrast to the data recovery, the repair process is a top-to-bottom process. The process starts from the codeword segment (corresponding to the code segment with mode μ located at the root of the injection tree) of the failed node f. Note that no symbol is injected into T0, i.e., T0=𝕋0, and hence the repair of the first α(4)=15 symbols of node f is identical to that of a signed determinant code, as described in Proposition 2.


Once the segment corresponding to T0 is repaired, the techniques described herein can proceed with the codeword segments corresponding to the child codes in the injection tree.
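The helper side of this repair process can be sketched as follows. This is a hedged illustration with stand-in matrices: each helper projects its stored row of a code segment onto a repair-encoder matrix associated with the failed node and sends only β symbols per segment; Xi_f and helper_repair_data are hypothetical names.

```python
import numpy as np

# Hedged sketch: helper h stores the row Psi[h] @ T of a code segment and,
# on failure of node f, sends the projection Psi[h] @ T @ Xi_f (beta symbols)
# instead of its whole row. All matrices below are random stand-ins.
d, cols, beta = 4, 6, 2
rng = np.random.default_rng(1)
T = rng.standard_normal((d, cols))        # one code segment (message matrix)
Psi = rng.standard_normal((d + 1, d))     # encoder rows, one per node
Xi_f = rng.standard_normal((cols, beta))  # repair-encoder matrix for node f

def helper_repair_data(h):
    return Psi[h] @ T @ Xi_f              # beta symbols sent by helper h

repair = np.vstack([helper_repair_data(h) for h in range(d)])
print(repair.shape)                       # (4, 2): d helpers, beta symbols each
```

The low repair bandwidth comes from β being much smaller than the number of columns of the segment.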


Let us focus on the repair of the symbol at position {2,3} of the codeword segment corresponding to T2, i.e.,











[Ψf·T2]{2,3} = ψf,1 w1,{1,2,3}<2> + ψf,2 v2,{2,3}<2> + ψf,3 v3,{2,3}<2> + ψf,4 (w4,{2,3,4}<2> + v6,{2,3,4,6}<0>) + ψf,5 (w5,{2,3,5}<2> + v6,{2,3,5,6}<0>) + ψf,6 w6,{2,3,6}<2>.    (63)







Upon receiving the repair symbols, node f recovers Rf(T2)=T2·Ξf,(2), where Ξf,(2) is defined in (9). Following (26) for 𝒳={2,3},













Σx∈{2,3} (−1)^(σT2(x)+ind{2,3}(x)) [Rf(T2)]x,{2,3}\{x}

= −[Rf(T2)]2,{3} + [Rf(T2)]3,{2}

= −[T2]2,: · Ξ:,{3}f,(2) + [T2]3,: · Ξ:,{2}f,(2)

= −Σy∈{1,2,4,5,6} [T2]2,{y,3} Ξ{y,3},{3}f,(2) + Σy∈{1,3,4,5,6} [T2]3,{y,2} Ξ{y,2},{2}f,(2)

= −(−ψf,1 w2,{1,2,3}<2> − ψf,2 v2,{2,3}<2> + ψf,4 w2,{2,3,4}<2> + ψf,5 w2,{2,3,5}<2> + ψf,6 w2,{2,3,6}<2>) + (−ψf,1 (w3,{1,2,3}<2> + v6,{1,2,3,6}<0>) + ψf,3 v3,{2,3}<2> + ψf,4 w3,{2,3,4}<2> + ψf,5 w3,{2,3,5}<2> + ψf,6 w3,{2,3,6}<2>)

= ψf,1 w1,{1,2,3}<2> + ψf,2 v2,{2,3}<2> + ψf,3 v3,{2,3}<2> + ψf,4 w4,{2,3,4}<2> + ψf,5 w5,{2,3,5}<2> + ψf,6 w6,{2,3,6}<2> − ψf,1 v6,{1,2,3,6}<0>.    (64)







Comparing (63) and (64),












[Ψf·T2]{2,3} − Σx∈{2,3} (−1)^(σT2(x)+ind{2,3}(x)) [Rf(T2)]x,{2,3}\{x} = ψf,1 v6,{1,2,3,6}<0> + ψf,4 v6,{2,3,4,6}<0> + ψf,5 v6,{2,3,5,6}<0>.    (65)







That means the code segment T2 cannot be individually repaired. However, all three terms in the difference above are symbols of the code segment T0. Interestingly, the repair space of the parent code T0 at position (b, 𝒳∪ℬ)=(6, {2,3,6}) includes










−[Rf(T0)]6,{2,3,6} = −[Rf(T0)]6,{2,3}∪{6} = −[T0]6,: · Ξ:,{2,3,6}f,(4) = −Σy∈{1,4,5} [T0]6,{y,2,3,6} Ξ{y,2,3,6},{2,3,6}f,(4) = ψf,1 v6,{1,2,3,6}<0> + ψf,4 v6,{2,3,4,6}<0> + ψf,5 v6,{2,3,5,6}<0>.    (66)







Therefore, summing up (64) and (66), the techniques described herein can exactly recover the missing symbol in (63). A similar procedure can be used to repair all the other symbols and codeword segments of the failed node f.


Consider an arbitrary subset of nodes 𝒦⊆[n] with |𝒦|=k. Let Ψ[𝒦, :] denote the k×d sub-matrix formed by collecting the corresponding k rows of Ψ. The collection of coded symbols stored in these k nodes can be written as Ψ[𝒦, :]·M. The goal is to recover the original message super-matrix M from the observed data Ψ[𝒦, :]·M.


The message super-matrix M consists of several code segments such as T, each operating at a mode m=mode(T). The message matrix T of a code segment operating at mode m is a d×αm matrix and can be partitioned into T̄ and T̲, corresponding to the top k and bottom (d−k) rows of T, respectively. The data recovery is a bottom-to-top process. More precisely, the techniques described herein start by recovering the data symbols in the code segments at the lowest level of the injection tree (with the smallest mode) and proceed to the parent code segments, until the techniques described herein reach the code segment at the root of the injection tree.


The message matrix T is a combination of an original signed determinant code 𝕋 and the symbols injected from the parent code of T, namely P. More precisely,

Tb,ℬ→P = 𝕋b,ℬ→P + Δb,ℬ→P.







Decoding this segment means decoding the entries of T, i.e., the combination of the original symbols and the injected symbols. The following lemma indicates that the injected symbols can be easily separated from the original symbols.

Lemma 3: If a code segment T = 𝕋b,ℬ→P + Δb,ℬ→P is decoded (all entries of T are retrieved), then the symbols of the original signed determinant code 𝕋b,ℬ→P and the injected symbols Δb,ℬ→P can be uniquely extracted.


Proof of Lemma 3: Recall that there is redundancy among the entries of 𝕋. The claim of this lemma follows from the fact that there is (at most) one injection into each parity group. To see that, recall (19), which implies an injection into position (x, 𝒳) of T takes place only if x>max 𝒳. That means wx,𝒳∪{x} may be the last w-symbol in the parity group {wy,𝒳∪{x}: y∈𝒳∪{x}}. More precisely, for a position (x, 𝒳), with x≤max 𝒳,









[Δb,ℬ→P]x,𝒳 = 0,   [Tb,ℬ→P]x,𝒳 = [𝕋b,ℬ→P]x,𝒳.





Otherwise, if x>max 𝒳,








[𝕋b,ℬ→P]x,𝒳 = (−1)^(σT(x)) wx,𝒳∪{x} = (−1)^(σT(x)+|𝒳|) Σy∈𝒳 (−1)^(ind𝒳(y)) wy,𝒳∪{x} = (−1)^(σT(x)+|𝒳|) Σy∈𝒳 (−1)^(σT(y)+ind𝒳(y)) [𝕋b,ℬ→P]y,(𝒳∪{x})\{y}.









Finally, once [𝕋b,ℬ→P]x,𝒳 is recovered, the techniques described herein can find the injected symbol from








[Δb,ℬ→P]x,𝒳 = [Tb,ℬ→P]x,𝒳 − [𝕋b,ℬ→P]x,𝒳.






Now consider an arbitrary code segment T. Note that the coded symbols in the codeword segment corresponding to T can be represented as












Ψ[𝒦,:]·T = [Γ[𝒦,:] | Υ[𝒦,:]] · [T̄; T̲],    (67)







where Γ[𝒦, :] is an invertible matrix. Multiplying both sides of (67) by Γ[𝒦, :]−1,












Γ[𝒦,:]−1·Ψ[𝒦,:]·T = [I | Γ[𝒦,:]−1·Υ[𝒦,:]] · [T̄; T̲] = T̄ + Γ[𝒦,:]−1·Υ[𝒦,:]·T̲.    (68)







Note that (68) holds for every column of T. More precisely, for a column of T labeled by 𝒳, the data collector observes

T̄:,𝒳+Γ[𝒦,:]−1·Υ[𝒦,:]·T̲:,𝒳,  (69)

where T̄:,𝒳 is the k×1 top column and T̲:,𝒳 is a (d−k)×1 bottom column. Therefore, for each column 𝒳, upon decoding T̲:,𝒳, the techniques described herein can compute the term Γ[𝒦, :]−1·Υ[𝒦, :]·T̲:,𝒳, and cancel it from (69) to obtain T̄:,𝒳, and fully decode the column.












Algorithm 1: Construction of Cascade Codes Super-Message Matrix


Input: Parameters k, d and μ


Output: Super-Message Matrix (M) of a cascade code operating at mode μ
















1:

custom character ← SGNDETCODEMSGMAT (d, μ, 01×d); add custom character to UnvisitedNodeCollection;






2:
Δ0 ← O_{d×(d choose μ)}; T0 ← 𝕋0 + Δ0; M ← T0;









3:
while there exists a node custom character in UnvisitedNodeCollection do


4:
 | for each custom character ⊆ [k + 1: d] with |custom character | < mode (custom character ) do


5:
 | | m ← mode (custom character ) − |custom character | − 1;


6:
 | | for each x ∈ [k + 1 : d] with x ≤ max custom character do


7:
 | | | for i ← 1 to d do custom character (i) ← 1 + custom character (i) + custom character (i);


8:
 | | | custom character ← SGNDETCODEMSGMAT(k, d, m, custom character );


9:
 | | | custom character ← INJECTIONMAT( d, m, x, custom character ,custom character );


10:
 | | | custom charactercustom character + custom character ;


11:
 | | | M ← [M|custom character ]; add custom character to UnvisitedNodeCollection;


12:
 | | end


13
 | end


14
 | remove custom character from UnvisitedNodeCollection;


15:
end


16:
return M;


17:
Procedure SGNDETCODEMSGMAT ( k, d, m, σD):


18:
 | for x ← 1 to d do


19:
 | | for each I ⊆ [d] with [I] = m do


20:
 | | if x ∈ [k + 1 : d] and I ∩ [k] = ϕ then custom character ← 0;


21:
 | | else If x ∈ I then custom character ← (−1) custom character


22:
 | | else custom character ← (−1)custom character


23:
 | | end


24:
 | end


25:
return custom character


26:
Procedure INJECTIONMAT (d, m, x, custom character ):


27:
 | for x ← 1 to d do


28:
 | | for each I ⊆ [d] with |I| = m do


29:
 | | | if i > maxI, i ∉ custom character , I ∩ B = ϕ thenΔt,l ← (−1)custom character


30:
 | | else Δt,J ← 0;


31:
  end


32:
 end


33:
return Δ;









The general decoding procedure includes the following stages:


1. Recursive decoding across code segments, from segments with mode 0 to those with higher modes, until the techniques described herein reach the root of the tree;


2. Within each code segment, the techniques described herein decode the columns recursively, according to the reverse lexicographical order. For each column labeled by 𝒳:


(a) The techniques described herein first decode the entries at the bottom part of column 𝒳 belonging to 𝒢1(T) to zero.


(b) The entries of the bottom part of column 𝒳 that belong to either 𝒢3(T) or 𝒢4(T) may be decoded using the child codes (which have a lower mode, and hence are already decoded).


(c) Then the techniques described herein decode the entries of the bottom part of column 𝒳 belonging to group 𝒢2(T), using the child codes and the parity equation. To this end, the techniques described herein need to use the symbols in columns 𝒴 where 𝒳≺𝒴.


(d) Once the lower part of column 𝒳 is decoded, the techniques described herein can recover its upper part by canceling T̲:,𝒳 from the collected data in (69).
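The column visiting order in stage 2 can be made concrete. A minimal sketch, assuming columns are labeled by the m-subsets of [d] compared lexicographically (columns_reverse_lex is a hypothetical helper, not a name from the patent):

```python
from itertools import combinations

# Hedged sketch: the decoding procedure visits the columns of a mode-m message
# matrix (labeled by m-subsets of [d]) in reverse lexicographical order.
def columns_reverse_lex(d, m):
    # combinations() yields sorted tuples in lexicographic order; reverse it
    return sorted(combinations(range(1, d + 1), m), reverse=True)

print(columns_reverse_lex(4, 2))
# [(3, 4), (2, 4), (2, 3), (1, 4), (1, 3), (1, 2)]
```

For d=6 and m=2 this order starts with (5, 6), matching the right-to-left scan in the worked example where column {5,6} is handled first.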


The techniques described herein start with the following lemma, which guarantees that the decoding procedure explained above can start.


Lemma 4: There is no injection to the bottom part of any code segment of mode 0.


Proof of Lemma 4: Consider a code with message matrix T with mode(T)=0, which is introduced by its parent P via injection pair (b, ℬ). Note that the only column of T is indexed by ø. According to (19), a possible injection to position (x, ø) of T may be

Pb,ℬ∪{x}
(up to a sign). For a position in the lower part of T, the techniques described herein have x∈[k+1: d]. This together with ℬ⊆[k+1: d] implies that Pb,ℬ∪{x}∈𝒢1(P) (see Definition 3), and hence, this symbol is set to zero, and not injected.


Based on this lemma, stage (i) of decoding can be started: the techniques described herein recover symbols in the message matrices of codes with mode 0. Next, consider decoding a code segment T with mode(T)=m. Assume that all the message matrices with mode less than m, including all the child codes of T, are already decoded. Stage (ii) of the decoding procedure can be performed as follows.



𝒢1(T): Recall that symbols in the first group are set to zero. Hence, no recovery is needed.



𝒢3(T)∪𝒢4(T): For a symbol in groups 𝒢3 or 𝒢4 the techniques described herein have


1. no injection is performed to position (x, 𝒳) of T.


2. the symbol is injected into a child code Q of T, and hence, it can be recovered from the decoded child code.


To see (1), recall the injection equation (19), which implies a host position (x, 𝒳) may satisfy x>max 𝒳, which is in contradiction with x∈𝒳 for 𝒢3 and with x<max 𝒳 for 𝒢4. Moreover, (2) follows from Lemma 3.


The last column of the code segment message matrix T of mode m is labeled by 𝒳=[d−m+1 : d].
If a symbol in the lower (d−k) rows of this column belongs to group 𝒢2(T), then it may satisfy x>max 𝒳=d, which is not possible. This argument shows that the last column has no element from 𝒢2, and hence is fully decoded as the only possible entries are from 𝒢1, 𝒢3 or 𝒢4.


Consider an arbitrary column 𝒳⊆[d] with |𝒳|=m, and an element [T]x,𝒳∈𝒢2, i.e., x∈[k+1: d] and x>max 𝒳. The goal is to decode [T]x,𝒳. First, note that position (x, 𝒳) may be hosting a (secondary) injection from the parent code P. According to (19)











[T]x,𝒳 = [Tb,ℬ→P]x,𝒳 = [𝕋b,ℬ→P]x,𝒳 + [Δb,ℬ→P]x,𝒳 = (−1)^(σT(x)) wx,𝒳∪{x} + (−1)^(1+σP(x)+ind𝒳∪{x}(x)) Pb,𝒳∪{x}∪ℬ.    (70)







So, in order to decode [T]x,𝒳 the techniques described herein need to find both wx,𝒳∪{x} and the injected symbol Pb,𝒳∪{x}∪ℬ. However, note that this injection is a secondary one, since the injection position is in the lower part of the child code. This symbol is also primarily injected into another child code Q with mode satisfying










mode(Q) = |(𝒳∪{x})̲| − 1    (71)

= |𝒳̲| + |{x}̲| − 1 = |𝒳̲|    (72)

< |𝒳| = mode(T),    (73)







where (71) follows from the primary injections in (20) and (22), (72) follows from the facts that x∈[k+1: d] and ℬ⊆[k+1: d], and (73) holds since columns of T are indexed by subsets of size |𝒳|. Therefore, (73) immediately implies the child code hosting the primary injection of Pb,𝒳∪{x}∪ℬ has a lower mode, and hence is already decoded before the techniques described herein decode T. Therefore, the injected symbol Pb,𝒳∪{x}∪ℬ is known.


The w-symbol in (70) can be decoded using the parity equation (7). Note that wx,𝒳∪{x} satisfies a parity equation along with the symbols wy,𝒳∪{x} for y∈𝒳. Each w-symbol wy,𝒳∪{x} is located in column 𝒴=(𝒳∪{x})\{y}, i.e., y in 𝒳 is replaced by x in 𝒴. Since x>y, the lexicographical order between 𝒳 and 𝒴 satisfies 𝒳≺𝒴. Therefore, due to the recursive procedure, every symbol in column 𝒴 including wy,𝒳∪{x} is already decoded. This allows us to retrieve wx,𝒳∪{x} from







[𝕋b,ℬ→P]x,𝒳 = (−1)^(σT(x)+|𝒳|+1) Σy∈𝒳 (−1)^(σT(y)+ind𝒳∪{x}(y)) [T]y,(𝒳∪{x})\{y}.








Once both terms in (70) are found, the techniques described herein can decode [T]x,𝒳. A similar decoding rule can be applied to each symbol in the lower part of column 𝒳 that belongs to 𝒢2(T). Once all symbols in T̲:,𝒳 are decoded, the techniques described herein can use (69) to obtain T̄:,𝒳. This may complete the recovery of column 𝒳. Then the techniques described herein can proceed with the next column of T until the message matrix T is fully decoded.












Algorithm 2: Data Recovery algorithm


Input: Stacked contents of k nodes 𝒦 ⊆ [n] in the form of Ψ[𝒦,:]·M.


Output: Recovered data file (entries of the super-message matrix M).
















1:
M← [ ];


2:
for m ← 0 to μ do


3:
 | for each Ψ [custom character :].custom character in Ψ[custom character :] · M with mode (custom character ) = m do custom character Loop 1


4:
 | | custom character ← DECODESEGMENT(Ψ[custom character :] · custom character ): custom character Loop 2


5:
 | | M ← [custom character |M];


6:
 | | custom character , custom character ← GETDELTAS(custom character ); custom character Globally store custom character and custom character


7:
 | end


8:
end


9:
return M ;


10:
Procedure DECODESEGMENT(Ψ [custom character :] · custom character ):


11:
 | for each I ⊆ [d] with |I| = m in the reverse lexicographical order do


12:
 | | [S]:,I ← DECODECOLUMNBOTTOM (Ψ[custom character :] · custom character ,I);


13:
 | |  S:,I ← r[custom character :]−1 · [Ψ [custom character :] · S]:,I... custom character Decode upper part using (73)



 | | ...−r[custom character :]−1 · Ψ [custom character :] · [S]:,I;


14:
 | end





15:
 | 
s(y,𝓎)[s_s_]






16:
return custom character


17:
Procedure DECODECOLUMNBOTTOM (Ψ [custom character :] · custom character ,I):


18:
 |  custom character ← I ∩ [k]; custom character ← I ∩ [k + 1 : d];


19:
 | for x ← k + 1 to d do


20:
 | | if x ≤ max custom character and custom character ≠ ϕ then    custom character Symbol belonging to custom character


21:
 | | | [s]x,I ← (−1)custom character . . . custom character Get



 | | | injected symbol using (17)


22:
 | | else if x ≤ max custom character and custom character = ϕ then    custom character Symbol belonging to custom character


23:
 | | | [s]x,I ← 0


24:
 | | else if x ≤ max custom character then    custom character Symbol belonging to custom character


25:
 | | | [s]x,I ← (−1)custom charactercustom character


26:
 | | | if x ∉ custom character and I ∩ custom character = ϕ and m > 0 then custom character A symbol is injected into [s]x,I


27:
 | | | |  B = B ∪ {x} ∪ custom charactercustom character ← . . . custom character Get injected symbol



 | | | | . . .custom charactercustom character


28:
 | | | | custom charactercustom character



 | | | | . . .+ custom character ;


29:
 | | | end


30:
 | | end


31:
 | end


32:
return [S];,I


36:
Procedure GETDELTAS custom character :


37:
 | m ← mode (custom character );


38:
 | for i ← 1 to d do


39:
 | | for I ⊆ [d + 1] with |I| = m do


40:
 | | | if i > max X and i ∈ custom character and I ∩ Y = ϕ then


41:
 | | | | custom charactercustom charactercustom character


42:
 | | | else


43:
 | | | | custom character


44:
 | | | end


45:
 | | | custom character ;


46:
 | | end


47:
 | end


48:
returncustom character









where Rf(Q) is the repair space of the code segment Q, and Rf(P) is the repair space of its parent code segment.


Note that for positions 𝒥 that have an overlap with 𝒴 (the second injection parameter), the repair identity above reduces to (11), and the repair may be performed as in an ordinary signed determinant code. For positions 𝒥 which are disjoint from 𝒴, the original repair equation has an interference caused by the injected symbols. However, this interference can be canceled using the repair space of the parent code.


In this section, the techniques described herein may show that upon failure of any node, its content can be exactly reconstructed by accessing d helper nodes and downloading β(d, k; μ) symbols from each. More formally, the techniques described herein prove that for any failed node f∈[n], any symbol at position 𝒥 of any code segment






Q^{b,𝒴}_P (of mode m) of node f can be repaired using identity (26):

  [Ψ_f·Q^{b,𝒴}_P]_𝒥 = Σ_{x∈𝒥} (−1)^{σ_Q(x)+ind_𝒥(x)}·[R_f(Q)]_{x, 𝒥\{x}} − [R_f(P)]_{b, 𝒴∪𝒥}·1{𝒥 ∩ 𝒴 = ∅},    (26)
The formal proof of identity (26) is given in the following.


Proof: Suppose node f fails and its content needs to be repaired using the repair data received from the helper nodes in ℋ with |ℋ|=d. The content of node f may be reconstructed segment by segment.


Consider a code segment Q = Q^{b,𝒴}_P, that is, a determinant code with mode m and injection pair (b, 𝒴), into which symbols from its parent code P with mode(P) = j are injected. Recall that the corresponding code segment matrix can be written as

  Q^{b,𝒴}_P = Q̊^{b,𝒴} + Δ^{b,𝒴}_P,

where the first term is a signed determinant code and the second term indicates the contribution of injection. For a given position 𝒥 within this codeword segment, with 𝒥 ⊆ [d] and |𝒥| = m, the corresponding symbol of the failed node is given by

  [Ψ_f·Q^{b,𝒴}_P]_𝒥 = [Ψ_f·Q̊^{b,𝒴}]_𝒥 + [Ψ_f·Δ^{b,𝒴}_P]_𝒥.
As is clear from (26), the repair of segment Q of the codeword of f may be performed similarly to that of the determinant codes using the repair space R_f(Q)=Q·Ξ^{f,(m)}, together with a correction using the repair space of the parent code, that is, R_f(P)=P·Ξ^{f,(j)}. Note that the latter correction may take care of the deviation of the code segment from the standard determinant code which is caused by the injection of the symbols from the parent code P.


The techniques described herein start with the first term in the right-hand side of (26), which is

  Σ_{x∈𝒥} (−1)^{σ_Q(x)+ind_𝒥(x)}·[R_f(Q)]_{x, 𝒥\{x}}
    = Σ_{x∈𝒥} (−1)^{σ_Q(x)+ind_𝒥(x)}·[Q^{b,𝒴}_P·Ξ^{f,(m)}]_{x, 𝒥\{x}}
    = Σ_{x∈𝒥} (−1)^{σ_Q(x)+ind_𝒥(x)}·[Q̊^{b,𝒴}·Ξ^{f,(m)}]_{x, 𝒥\{x}} + Σ_{x∈𝒥} (−1)^{σ_Q(x)+ind_𝒥(x)}·[Δ^{b,𝒴}_P·Ξ^{f,(m)}]_{x, 𝒥\{x}}    (74)
    = [Ψ_f·Q̊^{b,𝒴}]_𝒥 + Σ_{x∈𝒥} (−1)^{σ_Q(x)+ind_𝒥(x)}·[Δ^{b,𝒴}_P·Ξ^{f,(m)}]_{x, 𝒥\{x}},    (75)

where (74) holds due to the linearity of the operations, and (75) used Proposition 2 for the repair of Q̊^{b,𝒴}, which is a (d; m) signed determinant code with signature vector σ_Q. Therefore, proving the claimed identity reduces to showing

  Term1 − Term2 = Term3,    (76)


where

  Term1 = [Ψ_f·Δ^{b,𝒴}_P]_𝒥,
  Term2 = Σ_{x∈𝒥} (−1)^{σ_Q(x)+ind_𝒥(x)}·[Δ^{b,𝒴}_P·Ξ^{f,(m)}]_{x, 𝒥\{x}},
  Term3 = −[R_f(P)]_{b, 𝒴∪𝒥}·1{𝒥 ∩ 𝒴 = ∅}.    (77)

Note that all the data symbols appearing in (77) belong to the parent code segment matrix P. The techniques described herein can distinguish the following two cases in order to prove (76).


Case I: 𝒥 ∩ 𝒴 = ∅. Starting with Term1,

  Term1 = [Ψ_f·Δ^{b,𝒴}_P]_𝒥 = Σ_{y∈[d]} ψ_{f,y}·[Δ^{b,𝒴}_P]_{y,𝒥}
    = Σ_{y∈[max𝒥+1 : d]\𝒴} ψ_{f,y}·[Δ^{b,𝒴}_P]_{y,𝒥}    (78)
    = Σ_{y∈[max𝒥+1 : d]\(𝒥∪𝒴)} ψ_{f,y}·[Δ^{b,𝒴}_P]_{y,𝒥}    (79)
    = Σ_{y∈[max𝒥+1 : d]\(𝒥∪𝒴)} (−1)^{1+σ_P(y)+ind_{𝒥∪{y}}(y)}·ψ_{f,y}·[P]_{b, 𝒴∪𝒥∪{y}},    (80)

where (78) follows from the definition of the injection symbols in (19), which implies a non-zero injection occurs at position (y, 𝒥) only if y > max𝒥 and y ∉ 𝒴; (79) holds since [max𝒥+1 : d] ∩ 𝒥 = ∅; and in (80) the techniques described herein plugged in the entries of Δ^{b,𝒴}_P from (19).


Next, for Term2,

  Term2 = Σ_{x∈𝒥} (−1)^{σ_Q(x)+ind_𝒥(x)}·[Δ^{b,𝒴}_P·Ξ^{f,(m)}]_{x, 𝒥\{x}}
    = Σ_{x∈𝒥} (−1)^{σ_Q(x)+ind_𝒥(x)} Σ_{ℐ⊆[d], |ℐ|=m} [Δ^{b,𝒴}_P]_{x,ℐ}·[Ξ^{f,(m)}]_{ℐ, 𝒥\{x}}
    = Σ_{x∈𝒥} (−1)^{σ_Q(x)+ind_𝒥(x)} Σ_{y∈[d]\(𝒥\{x})} [Δ^{b,𝒴}_P]_{x, (𝒥\{x})∪{y}}·[Ξ^{f,(m)}]_{(𝒥\{x})∪{y}, 𝒥\{x}}    (81)
    = Σ_{x=max𝒥} (−1)^{σ_Q(x)+ind_𝒥(x)} Σ_{y<x, y∉(𝒥∪𝒴)} [Δ^{b,𝒴}_P]_{x, (𝒥\{x})∪{y}}·[Ξ^{f,(m)}]_{(𝒥\{x})∪{y}, 𝒥\{x}}    (82)
    = (−1)^{σ_Q(max𝒥)+ind_𝒥(max𝒥)} Σ_{y∈[max𝒥]\(𝒥∪𝒴)} [Δ^{b,𝒴}_P]_{max𝒥, (𝒥∪{y})\{max𝒥}}·[Ξ^{f,(m)}]_{(𝒥∪{y})\{max𝒥}, 𝒥\{max𝒥}}    (83)
    = (−1)^{σ_Q(max𝒥)+ind_𝒥(max𝒥)} Σ_{y∈[max𝒥]\(𝒥∪𝒴)} { [(−1)^{1+σ_P(max𝒥)+ind_{𝒥∪{y}}(max𝒥)}·[P]_{b, 𝒴∪𝒥∪{y}}]·[(−1)^{σ_Q(y)+ind_{(𝒥\{max𝒥})∪{y}}(y)}·ψ_{f,y}] }.    (84)

Note that in (81) the techniques described herein have used the definition of matrix Ξ^{f,(m)} in (9), which implies the entry in position (ℐ, 𝒥\{x}) is non-zero only if ℐ = (𝒥\{x})∪{y} for some y ∈ [d]\(𝒥\{x}). Moreover, (82) follows from the definition of injected entries in (19), which implies the entry of Δ^{b,𝒴}_P at position (x, (𝒥\{x})∪{y}) is non-zero only if all the following conditions hold:

  x > max((𝒥\{x})∪{y}),
  y < x and x > max(𝒥\{x}),
  ((𝒥\{x})∪{y}) ∩ 𝒴 = ∅ and y ∉ 𝒴,

which together imply x = max𝒥 and y ∈ [max𝒥]\(𝒥∪𝒴). In (84) the matrix entries are replaced from their definitions in (9) and (19).


Next note that the overall sign in (84) can be simplified as follows. First,

  σ_Q(x) + ind_𝒥(x) + σ_P(x) + ind_{𝒥∪{y}}(x)
    = [1 + σ_P(x) + ind_{𝒴∪{x}}(x)] + ind_𝒥(x) + σ_P(x) + ind_{𝒥∪𝒴∪{y}}(x)    (85)
    ≡ 1 + |{u ∈ 𝒴∪{x} : u ≤ x}| + |{u ∈ 𝒥 : u ≤ x}| + |{u ∈ 𝒥∪𝒴∪{y} : u ≤ x}|    (mod 2)    (86)
    = 1 + [|{u ∈ 𝒴 : u ≤ x}| + 1] + |{u ∈ 𝒥 : u ≤ x}| + [|{u ∈ 𝒥∪𝒴 : u ≤ x}| + 1]    (87)
    = 3 + 2·|{u ∈ 𝒥∪𝒴 : u ≤ x}|    (88)
    ≡ 1    (mod 2),    (89)

where (85) is due to the definition of the child code's signature in (17); (86) plugged in the definition of ind_·(⋅) from (2); equality in (87) holds since y < max𝒥 = x; and (88) is due to the fact that the sets 𝒥 and 𝒴 are disjoint. Similarly,

  σ_Q(y) + ind_{(𝒥\{max𝒥})∪{y}}(y)
    = [1 + σ_P(y) + ind_{𝒴∪{y}}(y)] + ind_{(𝒥\{max𝒥})∪{y}}(y)    (90)
    ≡ 1 + σ_P(y) + |{u ∈ 𝒴∪{y} : u ≤ y}| + |{u ∈ (𝒥\{max𝒥})∪{y} : u ≤ y}|    (mod 2)    (91)
    = 1 + σ_P(y) + |{u ∈ 𝒴∪{y} : u ≤ y}| + |{u ∈ 𝒥∪{y} : u ≤ y}|    (92)
    = 1 + σ_P(y) + [|{u ∈ 𝒥∪{y} : u ≤ y}| + 1]    (93)
    ≡ σ_P(y) + ind_{𝒥∪{y}}(y)    (mod 2),    (94)

where (90) used the definition of the child code's signature in (17); (91) and (94) plugged in the definition of ind_·(⋅) given in (2); equality in (92) follows from the fact that y < max𝒥; and (93) holds since 𝒥 and 𝒴 are disjoint sets.


Plugging (89) and (94) into (84),

  Term2 = −Σ_{y∈[max𝒥]\(𝒥∪𝒴)} (−1)^{1+σ_P(y)+ind_{𝒥∪{y}}(y)}·ψ_{f,y}·[P]_{b, 𝒴∪𝒥∪{y}}.    (95)

Lastly,

  Term3 = −[R_f(P)]_{b, 𝒴∪𝒥} = −[P·Ξ^{f,(j)}]_{b, 𝒴∪𝒥}    (96)
    = −Σ_{ℐ⊆[d], |ℐ|=j} [P]_{b,ℐ}·[Ξ^{f,(j)}]_{ℐ, 𝒴∪𝒥}    (97)
    = −Σ_{y∈[d]\(𝒥∪𝒴)} [P]_{b, 𝒴∪𝒥∪{y}}·[Ξ^{f,(j)}]_{𝒴∪𝒥∪{y}, 𝒴∪𝒥}    (98)
    = −Σ_{y∈[d]\(𝒥∪𝒴)} (−1)^{σ_P(y)+ind_{𝒥∪{y}}(y)}·ψ_{f,y}·[P]_{b, 𝒴∪𝒥∪{y}}    (99)
    = Σ_{y∈[d]\(𝒥∪𝒴)} (−1)^{1+σ_P(y)+ind_{𝒥∪{y}}(y)}·ψ_{f,y}·[P]_{b, 𝒴∪𝒥∪{y}},    (100)

where (100) holds since an injection at position (b, 𝒥∪{y}∪𝒴) of matrix P may be performed only if b > max{𝒥∪{y}∪𝒴}, which is in contradiction with b ≤ max 𝒴, a required condition for the injection pair (b, 𝒴) as specified. Therefore, from (80) and (95),

  Term1 − Term2 = Σ_{y∈[max𝒥+1 : d]\(𝒥∪𝒴)} (−1)^{1+σ_P(y)+ind_{𝒥∪{y}}(y)}·ψ_{f,y}·[P]_{b, 𝒴∪𝒥∪{y}}
      + Σ_{y∈[max𝒥]\(𝒥∪𝒴)} (−1)^{1+σ_P(y)+ind_{𝒥∪{y}}(y)}·ψ_{f,y}·[P]_{b, 𝒴∪𝒥∪{y}}
    = Σ_{y∈[d]\(𝒥∪𝒴)} (−1)^{1+σ_P(y)+ind_{𝒥∪{y}}(y)}·ψ_{f,y}·[P]_{b, 𝒴∪𝒥∪{y}}
    = Term3,

which is the desired identity in (76).


Case II: 𝒥 ∩ 𝒴 ≠ ∅. First note that each term in (78) consists of an entry of Δ^{b,𝒴}_P at a position (y, 𝒥), which is zero for 𝒥 ∩ 𝒴 ≠ ∅. Therefore the techniques described herein have Term1 = 0.

Similarly, the entry of Δ^{b,𝒴}_P at position (x, (𝒥\{x})∪{y}) is non-zero only if ((𝒥\{x})∪{y}) ∩ 𝒴 = ∅ and x ∉ 𝒴. This implies

  0 = |((𝒥\{x})∪{y}) ∩ 𝒴| ≥ |(𝒥\{x}) ∩ 𝒴| = |𝒥 ∩ 𝒴| − |{x} ∩ 𝒴| = |𝒥 ∩ 𝒴| ≥ 1.

This contradiction implies that all the terms in (81) are zero, and hence Term2=0. Finally, Term3 is zero by its definition. Therefore, the identity (76) clearly holds. This completes the proof.


In this section, the techniques described herein compare the code construction proposed in this disclosure with the product-matrix (PM) code. Since both code constructions are linear, they both can be written as a product of an encoder matrix and a message matrix, i.e., 𝒞=Ψ·M. However, there are several important differences between the two constructions. Some of the main distinctions between the two code constructions are highlighted in Table 2.
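The product form 𝒞=Ψ·M mentioned above can be sketched numerically. The sketch below is illustrative only: it uses real arithmetic in place of the finite-field arithmetic an actual code would use, and all sizes (n, d, α) are assumed values, not parameters from the disclosure.

```python
import numpy as np

# Minimal sketch of a linear storage code C = Psi @ M (real arithmetic
# standing in for a finite field; n, d, alpha are illustrative).
n, d, alpha = 6, 4, 3
Psi = np.vander(np.arange(1, n + 1), d, increasing=True).astype(float)  # n x d encoder
rng = np.random.default_rng(0)
M = rng.integers(0, 10, size=(d, alpha)).astype(float)  # message matrix
C = Psi @ M                                             # row i = content of node i

# Any d rows of a Vandermonde matrix (distinct evaluation points) are
# linearly independent, so any d node contents recover the message matrix.
rows = [0, 2, 3, 5]
M_hat = np.linalg.solve(Psi[rows, :], C[rows, :])
assert np.allclose(M_hat, M)
```

The Vandermonde choice matches the encoder requirements discussed later in this section.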









TABLE 2

A comparison between the product-matrix and cascade codes

  Product-Matrix Code                              | The techniques described herein
  -------------------------------------------------+------------------------------------------------------------------
  Only for the extreme points MBR and MSR          | For the entire storage–bandwidth trade-off
  MSR code construction: only for d ≥ 2k − 1       | MSR code construction: all parameter sets
  MSR point parameters:                            | MSR point parameters:
  (α, β, F) = (d − k + 1, 1, k(d − k + 1))         | (α, β, F) = ((d − k + 1)^k, (d − k + 1)^{k−1}, k(d − k + 1)^k)
  Different structure and requirement for the      | Universal code construction for the entire trade-off
  encoder matrix of MBR and MSR codes              |
  Repair data encoders are rows of the main code   | A systematic repair encoder Ξ for each point on the trade-off
  encoder                                          |
The MSR point is fundamentally different. To show this, the techniques described herein can rely on the standard notion of similarity, which is a well-defined criterion for checking whether two codes are convertible to each other. Two linear codes 𝒞 and 𝒞′ are equivalent if 𝒞′ can be represented in terms of 𝒞 by


1. a change of basis of the vector space generated by the message symbols (i.e., a remapping of the message symbols), and


2. a change of basis of the column-spaces of the nodal generator matrices (i.e., a remapping of the symbols stored within a node).


3. scale parameters of (α, β, F) of codes by an integer factor so that both codes have the same parameters.


To show that the two codes are not similar, the techniques described herein focus on a special case, namely an MSR code for an (n, k, d=2k−2) distributed storage system with k>2. For such a system, the parameters of the MSR cascade code are given from Corollary 1 as (α, β, F) = ((k−1)^k, (k−1)^{k−1}, k(k−1)^k). On the other hand, the parameters of the PM code are given by (α, β, F) = (k−1, 1, k(k−1)), and its message matrix can be written as

  M′ = [ S₁
         S₂ ],
where S₁ and S₂ are α×α symmetric matrices. Hence, the number of data symbols in M′ is

  2·C(α+1, 2) = (α+1)·α = k(k−1).
The encoder matrix is of the form

Ψ′=[Φ|ΛΦ],  (101)


where Φ is an n×α matrix and Λ is an n×n diagonal matrix. The matrices Φ and Λ may be chosen such that

    • any d rows of Ψ′ are linearly independent.
    • any α rows of Φ are linearly independent.
    • the diagonal entries of Λ are distinct.


Even though the requirements above are different for the proposed code construction, a Vandermonde matrix satisfies the conditions of both constructions. The MSR PM code is given by 𝒞′=Ψ′·M′. So, for a fair comparison, one needs to concatenate N=(k−1)^{k−1} copies of little PM codes with independent message matrices M′₁, M′₂, . . . , M′_N to obtain a code with the same parameters as the cascade code.
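The parameter matching behind this concatenation can be checked directly from the tuples above: N copies of the little PM code scale (α, β, F) to exactly the cascade MSR parameters. A small sketch (pure arithmetic, using the parameter formulas quoted in Table 2):

```python
# Parameters at the MSR point with d = 2k - 2, as listed in Table 2.
def pm_params(k):
    d = 2 * k - 2
    a = d - k + 1                 # alpha = k - 1
    return a, 1, k * a            # (alpha, beta, F) of one little PM code

def cascade_params(k):
    d = 2 * k - 2
    a = d - k + 1
    return a ** k, a ** (k - 1), k * a ** k

k = 4
alpha_pm, beta_pm, F_pm = pm_params(k)
N = (k - 1) ** (k - 1)            # copies of the little PM code to concatenate
scaled = (N * alpha_pm, N * beta_pm, N * F_pm)
assert scaled == cascade_params(k)   # same (alpha, beta, F) after concatenation
```

This confirms the two codes being compared have identical (α, β, F), so any difference found below is structural, not a matter of scaling.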


The techniques described herein denote by Rep_{h→f} and Rep′_{h→f} the vector space spanned by the repair symbols sent by a helper node h for the repair of a failed node f for the cascade and PM codes, respectively. Then, the following proposition highlights at least one important difference in the structure of the two codes, showing that they are not similar with respect to the above definition.


Proposition 6: For an MSR code for an (n, k, d=2k−2) distributed storage system with parameters (α, β, F) = ((k−1)^k, (k−1)^{k−1}, k(k−1)^k), and three distinct nodes h, f, and g,

  dim(Rep_{h→f} ∩ Rep_{h→g}) = Σ_{m=0}^{μ} (d−k)^{μ−m}·[2·C(k−1, m−1) − C(k, m) − C(k−2, m)],    (102)

while

  dim(Rep′_{h→f} ∩ Rep′_{h→g}) = 0.    (103)

Proof of Proposition 6: Recall that the repair data Rep_{h→f} is simply the concatenation of the repair data for each code segment, which is a (modified) signed determinant code. The techniques described herein first evaluate the overlap between the two subspaces spanned by the repair symbols sent for two failed nodes for each code segment: for a code segment with mode m, the dimension of the overlap between the subspaces spanned by the repair symbols sent from h to f and from h to g is given by

  2·C(d−1, m−1) − [C(d, m) + C(d−2, m)].
Hence, summing up over all code segments,

  dim(Rep_{h→f} ∩ Rep_{h→g}) = Σ_{m=0}^{μ} N_m·[2·C(d−1, m−1) − C(d, m) − C(d−2, m)]
    = Σ_{m=0}^{μ} N_m·[2β_m − α_m + C(d−2, m)]
    = Σ_{m=0}^{μ} (d−k)^{μ−m}·[2·C(k−1, m−1) − C(k, m) − C(k−2, m)],    (104)

where N_m denotes the number of code segments of mode m, and (104) used (37) and (41) for the first and second summands, respectively. Moreover, the third summand in the summation can be simplified using Lemma 2 for a=−2 and b=0.


Next, note that the resulting PM code is obtained by concatenating independent little PM codes, each with β=1. Hence, for each little PM code, the overlap between the spaces spanned by repair symbols sent for two failed nodes may be of dimension either 0 or 1. Assume the latter holds. Then the repair symbols sent from h to f and to g are identical (up to a multiplicative constant), i.e., Rep′_{h→f}=Rep′_{h→g}. By symmetry, the same may hold for any other failed node. Now, consider a set of helper nodes ℋ with |ℋ|=d, and a set of failed nodes 𝒦 with |𝒦|=k. The techniques described herein can repair the entire set 𝒦 by downloading only β=1 symbol from each of the helper nodes in ℋ, since Rep′_{h→f}=Rep′_{h→g} for any f, g∈𝒦 and any h∈ℋ. On the other hand, the entire file of the little PM code may be recoverable from the content of the nodes in 𝒦. This implies

  k(k−1) = F ≤ Σ_{h∈ℋ} dim(Rep′_{h→f}) ≤ d·β = 2(k−1),
which is in contradiction with k>2. Therefore, the techniques described herein have dim(Rep′_{h→f} ∩ Rep′_{h→g}) = 0 for any little PM code. Summing up over all copies of independent little PM codes, the techniques described herein obtain the claim of the proposition.
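The counting step of this contradiction is a one-line arithmetic check. The sketch below verifies that the inequality F ≤ d·β fails for every k > 2 under the assumption of identical repair spaces:

```python
# Counting argument: if every helper sent the same one-dimensional repair
# space to all failed nodes, d helpers could deliver at most d*beta = 2(k-1)
# independent symbols, yet the little PM code's file size is F = k(k-1).
for k in range(3, 10):
    d, beta = 2 * k - 2, 1
    F = k * (k - 1)
    assert F > d * beta   # F <= d*beta is impossible for k > 2
```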


An immediate consequence of this proposition is that cascade codes and PM codes are not similar, and cannot be converted to each other by any scaling and mapping of the raw or coded symbols.


The parity equation (7) and the redundant symbols in the message matrices play a critical role in the proposed code construction. Such symbols were initially introduced to facilitate node repair in the original determinant codes, without being important for data recovery. While having such redundancy could cause an inefficiency for the overall storage capacity of the code, the lower bounds show that determinant codes are optimum for d=k.


These redundant symbols play two roles in the cascade codes: (1) they help with the repair mechanism, similar to their role in determinant codes, and (2) they make the data recovery possible, in spite of the fact that the data collector only accesses k<d nodes. More intuitively, this redundancy is used to provide a backup copy of the symbols that could be missed in data recovery, due to the fact that a k×d sub-matrix of the encoder, i.e., Ψ[𝒦, :], is not invertible for k<d.
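The non-invertibility of a k×d encoder block with k < d can be illustrated numerically. The sketch below (real arithmetic, illustrative sizes) shows that the accessed rows have a nontrivial right null space, i.e., directions of the message space the data collector cannot see without the redundant backup symbols:

```python
import numpy as np

# A k x d block of the encoder with k < d cannot be inverted: it has a
# nontrivial right null space, so k accessed nodes alone miss part of
# each column of the message matrix.
k, d = 3, 4
Psi = np.vander(np.arange(1.0, 7.0), d, increasing=True)  # 6 x 4 encoder
sub = Psi[:k, :]                                          # k x d block, k < d
null_dim = d - np.linalg.matrix_rank(sub)                 # dimension of unseen directions
assert null_dim == 1
```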


Surprisingly, the number of such missing symbols (evaluated in (16)) is exactly the same as the number of redundant symbols in the entire code (given in (25)). This suggests the proposed code has no further room to be improved. On the other hand, the proposed construction universally achieves the optimum trade-off for MBR codes (see Corollary 1), MSR codes (see Corollary 1), an interior operating point on the cut-set bound (see Corollary 2), codes with k=d, and for an (n, k, d)=(5,3,4) system, for which a matching lower bound is provided. These facts altogether support the conjecture that the cascade codes are optimum exact regenerating codes for any set of parameters (n, k, d).


This evidence altogether supports the conjecture that the cascade codes are optimum exact regenerating codes for any set of parameters (n, k, d). Of course, a matching lower bound for the trade-off is needed to prove the conjecture.


The main remaining problem to be addressed in this direction is to provide a lower bound for the trade-off between the storage and repair-bandwidth for exact-repair regenerating codes. As mentioned above, a tight lower bound may match with the trade-off achieved in this disclosure, indicating the optimality of the proposed construction.


There are some aspects of this work that can be improved. For instance, even though the sub-packetization of the codes provided in this disclosure is independent of the number of nodes n, it is exponential in the parameter k. An interesting question is to develop constructions that achieve the same trade-off with smaller sub-packetization, independent of n. Multiple-failure repair is another interesting problem to be studied. More importantly, the problem of dynamic repair, referring to the flexibility of dynamically changing the number of helper nodes (d∈[k : n−1]), is of both practical and theoretical interest.
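The exponential growth of the sub-packetization mentioned above is easy to quantify. The sketch below evaluates α = (d−k+1)^k at the MSR point (assuming d = 2k−2, the setting used in the comparison section) for a few values of k:

```python
# Sub-packetization alpha = (d - k + 1)^k at the MSR point with d = 2k - 2,
# i.e., alpha = (k - 1)^k: it grows exponentially in k.
alphas = [(k - 1) ** k for k in range(2, 8)]
assert alphas == [1, 8, 81, 1024, 15625, 279936]
```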


There is recent attention to clustered distributed storage systems. A modified version of the proposed construction might be applicable to such clustered systems. Finally, the exact-repair regenerating codes can be viewed in the context of the interference alignment problem, where the repair scenario is equivalent to aligning and canceling the interference (mismatch) that exists between a failed symbol and a coded symbol of a helper node. Therefore, the techniques and results of this disclosure might also be applicable to the design of interference alignment codes for wireless communication.


It is to be recognized that depending on the example, certain acts or events of any of the techniques described herein can be performed in a different sequence, may be added, merged, or left out altogether (e.g., not all described acts or events are necessary for the practice of the techniques). Moreover, in certain examples, acts or events may be performed concurrently, e.g., through multi-threaded processing, interrupt processing, or multiple processors, rather than sequentially.


In this section the proof of Proposition 2 is presented. From the RHS of (11),

  Σ_{x∈𝒥} (−1)^{σ_D(x)+ind_𝒥(x)}·[R_f(D)]_{x, 𝒥\{x}}
    = Σ_{x∈𝒥} (−1)^{σ_D(x)+ind_𝒥(x)}·[D·Ξ^{f,(m)}]_{x, 𝒥\{x}}
    = Σ_{x∈𝒥} (−1)^{σ_D(x)+ind_𝒥(x)} Σ_{ℐ⊆[d], |ℐ|=m} D_{x,ℐ}·[Ξ^{f,(m)}]_{ℐ, 𝒥\{x}}
    = Σ_{x∈𝒥} (−1)^{σ_D(x)+ind_𝒥(x)} Σ_{y∈[d]\(𝒥\{x})} D_{x,(𝒥\{x})∪{y}}·[Ξ^{f,(m)}]_{(𝒥\{x})∪{y}, 𝒥\{x}}    (105)
    = Σ_{x∈𝒥} (−1)^{σ_D(x)+ind_𝒥(x)}·[ D_{x,𝒥}·[Ξ^{f,(m)}]_{𝒥, 𝒥\{x}} + Σ_{y∈[d]\𝒥} D_{x,(𝒥\{x})∪{y}}·[Ξ^{f,(m)}]_{(𝒥\{x})∪{y}, 𝒥\{x}} ]    (106)
    = Σ_{x∈𝒥} (−1)^{σ_D(x)+ind_𝒥(x)}·[ (−1)^{σ_D(x)+ind_𝒥(x)}·ψ_{f,x}·D_{x,𝒥} + Σ_{y∈[d]\𝒥} (−1)^{σ_D(y)+ind_{(𝒥\{x})∪{y}}(y)}·ψ_{f,y}·D_{x,(𝒥\{x})∪{y}} ]    (107)
    = Σ_{x∈𝒥} ψ_{f,x}·D_{x,𝒥} + Σ_{y∈[d]\𝒥} Σ_{x∈𝒥} (−1)^{σ_D(x)+σ_D(y)+ind_𝒥(x)+ind_{(𝒥\{x})∪{y}}(y)}·ψ_{f,y}·D_{x,(𝒥\{x})∪{y}}    (108)
    = Σ_{x∈𝒥} ψ_{f,x}·D_{x,𝒥} + Σ_{y∈[d]\𝒥} Σ_{x∈𝒥} (−1)^{σ_D(x)+σ_D(y)+ind_{𝒥∪{y}}(y)+ind_{𝒥∪{y}}(x)+1}·ψ_{f,y}·D_{x,(𝒥\{x})∪{y}}    (109)
    = Σ_{x∈𝒥} ψ_{f,x}·D_{x,𝒥} + Σ_{y∈[d]\𝒥} (−1)^{σ_D(y)+ind_{𝒥∪{y}}(y)+1}·ψ_{f,y} Σ_{x∈𝒥} (−1)^{ind_{𝒥∪{y}}(x)}·(−1)^{σ_D(x)}·D_{x,(𝒥\{x})∪{y}}    (110)
    = Σ_{x∈𝒥} ψ_{f,x}·D_{x,𝒥} + Σ_{y∈[d]\𝒥} (−1)^{σ_D(y)+ind_{𝒥∪{y}}(y)+1}·ψ_{f,y} Σ_{x∈𝒥} (−1)^{ind_{𝒥∪{y}}(x)}·w_{x,𝒥∪{y}}    (111)
    = Σ_{x∈𝒥} ψ_{f,x}·D_{x,𝒥} + Σ_{y∈[d]\𝒥} (−1)^{σ_D(y)+ind_{𝒥∪{y}}(y)+1}·ψ_{f,y}·[ (−1)^{ind_{𝒥∪{y}}(y)+1}·w_{y,𝒥∪{y}} ]    (112)
    = Σ_{x∈𝒥} ψ_{f,x}·D_{x,𝒥} + Σ_{y∈[d]\𝒥} (−1)^{σ_D(y)+ind_{𝒥∪{y}}(y)+1}·ψ_{f,y}·[ (−1)^{ind_{𝒥∪{y}}(y)+1}·(−1)^{σ_D(y)}·D_{y,𝒥} ]    (113)
    = Σ_{x∈𝒥} ψ_{f,x}·D_{x,𝒥} + Σ_{y∈[d]\𝒥} ψ_{f,y}·D_{y,𝒥}    (114)
    = Σ_{x∈[d]} ψ_{f,x}·D_{x,𝒥} = [Ψ_f·D]_𝒥,    (115)
where

    • In (105) the definition of Ξ^{f,(m)} is from (9), where the entry at position (ℐ, 𝒥\{x}) is non-zero only if ℐ includes 𝒥\{x}. This implies that for a non-zero entry, ℐ must satisfy ℐ = (𝒥\{x})∪{y} for some y ∈ [d]\(𝒥\{x});
    • In (106), the summation is split into two cases: y=x and y≠x;
    • In (107), each entry of Ξ^{f,(m)} is replaced by its value from the definition in (9);
    • In (108) the two summations over x and y are swapped;
    • In (109), the definition of the ind_·(⋅) function is used to write

  ind_{𝒥∪{y}}(x) + ind_{𝒥∪{y}}(y) = |{u ∈ 𝒥∪{y} : u ≤ x}| + |{u ∈ 𝒥∪{y} : u ≤ y}|
    = [ |{u ∈ 𝒥 : u ≤ x}| + 1[y ≤ x] ] + [ |{u ∈ (𝒥\{x})∪{y} : u ≤ y}| + 1[x ≤ y] ]
    = ind_𝒥(x) + ind_{(𝒥\{x})∪{y}}(y) + 1,
where the last equality holds since x∈𝒥 and y∈[d]\𝒥, which implies x≠y, and therefore 1[x ≤ y] + 1[y ≤ x] = 1. This leads to ind_𝒥(x) + ind_{(𝒥\{x})∪{y}}(y) ≡ ind_{𝒥∪{y}}(x) + ind_{𝒥∪{y}}(y) + 1 modulo 2.

    • In (111) the definition of D is used from (8): since x ∉ (𝒥\{x})∪{y}, then

  D_{x,(𝒥\{x})∪{y}} = (−1)^{σ_D(x)}·w_{x,𝒥∪{y}}.
A similar argument is used in (113);

    • In (112) the parity equation from (7) is used. In particular,

  Σ_{x∈𝒥∪{y}} (−1)^{ind_{𝒥∪{y}}(x)}·w_{x,𝒥∪{y}} = 0,


which implies

  Σ_{x∈𝒥} (−1)^{ind_{𝒥∪{y}}(x)}·w_{x,𝒥∪{y}} = −(−1)^{ind_{𝒥∪{y}}(y)}·w_{y,𝒥∪{y}}.

This completes the proof of Proposition 2.


Remark 9: Note that in the chain of equations above the techniques described herein aim to repair the coded symbol at position 𝒥 of the failed node, which is a linear combination of symbols in column 𝒥 of the message matrix. However, the linear combination in (110) misses some of the symbols of column 𝒥 and includes symbols from other columns of the message matrix. These two interference terms perfectly cancel each other due to the parity equation in (7). This is identical to the notion of interference neutralization, which is well studied in multi-hop wireless networks.


Definition 4: The techniques described herein call a (signed) determinant or cascade code semi-systematic if the first k nodes store symbols from the message matrix, without any encoding. The encoder of a semi-systematic code may consist of a k×k identity matrix in its upper-left and a k×(d−k) zero matrix in its upper-right corner.


Consider an (n, k, d) regenerating code obtained from an encoder matrix Ψ. Here, the techniques described herein show that the encoder matrix can be modified such that the resulting code becomes semi-systematic, that is, the first k nodes store pure symbols from the message matrix. Consider a general encoder matrix

  Ψ_{n×d} = [ Γ_{n×k} | Υ_{n×(d−k)} ] = [ A_{k×k}      B_{k×(d−k)}
                                          C_{(n−k)×k}  D_{(n−k)×(d−k)} ].
Any k rows of Γ_{n×k} are linearly independent. Thus, A_{k×k} is a full-rank and invertible matrix. Define

  X = [ A_{k×k}^{−1}   −A_{k×k}^{−1}·B_{k×(d−k)}
        O_{(d−k)×k}    I_{(d−k)×(d−k)} ].
Note that

  X^{−1} = [ A_{k×k}      B_{k×(d−k)}
             O_{(d−k)×k}  I_{(d−k)×(d−k)} ],
and hence X is a full-rank matrix.


Therefore, the techniques described herein can modify the encoder matrix to








$$\tilde{\Psi}_{n\times d} = \Psi_{n\times d}\cdot X = \begin{bmatrix} I_{k\times k} & O_{k\times(d-k)} \\ C A^{-1} & D - C A^{-1} B \end{bmatrix} = \begin{bmatrix} \tilde{\Gamma}_{n\times k} & \tilde{\Upsilon}_{n\times(d-k)} \end{bmatrix}.$$







It is easy to verify that $\tilde{\Psi}$ satisfies the required conditions. To this end, let $\mathcal{K}$ be an arbitrary set of row indices with $|\mathcal{K}| = k$. The techniques described herein have $\tilde{\Gamma}[\mathcal{K},:] = \Gamma[\mathcal{K},:]\,A_{k\times k}^{-1}$, which is a full-rank matrix, since both $\Gamma[\mathcal{K},:]$ and $A_{k\times k}^{-1}$ are full-rank. This shows the condition holds for $\tilde{\Psi}$. Similarly, for an arbitrary set $\mathcal{D}\subseteq[n]$ with $|\mathcal{D}| = d$, $\tilde{\Psi}[\mathcal{D},:] = \Psi[\mathcal{D},:]\,X$, which is again full-rank, because both $\Psi[\mathcal{D},:]$ and $X$ are full-rank.
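The block-matrix manipulation above can be checked numerically. A minimal sketch, assuming a Vandermonde matrix as a stand-in for a generic encoder matrix Ψ (so that the leading k×k block A is invertible); it verifies that the first k rows of Ψ·X become [I | 0]:

```python
import numpy as np

# Sanity check of the semi-systematic transform (assumption: a Vandermonde
# matrix stands in for a generic encoder Psi with invertible top-left block A).
n, k, d = 8, 3, 5
Psi = np.vander(1.0 + np.arange(n), d, increasing=True)   # n x d encoder
A = Psi[:k, :k]                                           # invertible k x k block
B = Psi[:k, k:]                                           # k x (d-k) block
Ainv = np.linalg.inv(A)
X = np.block([[Ainv, -Ainv @ B],
              [np.zeros((d - k, k)), np.eye(d - k)]])
Psi_tilde = Psi @ X
# First k rows now store pure message symbols: identity on the left, zeros on the right.
assert np.allclose(Psi_tilde[:k, :k], np.eye(k))
assert np.allclose(Psi_tilde[:k, k:], np.zeros((k, d - k)))
```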


The techniques described herein may use the $\mathcal{Z}$-transform to solve the recursive equation in (33) and evaluate the code parameters in (27), (28), and (32). For the sake of completeness, the techniques described herein start with the definition and some of the main properties of this transformation.


Definition 5: The two-sided $\mathcal{Z}$-transform of a sequence $x_m$ is defined as










$$X(z) = \mathcal{Z}\{x_m\} = \sum_{m=-\infty}^{\infty} x_m z^{-m} \qquad (116)$$







where $z$ is a complex number. The region of convergence (ROC) of $X(z)$ is defined as the set of points in the complex plane ($z\in\mathbb{C}$) for which $X(z)$ converges, that is,









$$\mathrm{ROC} = \Big\{ z : \Big|\sum_{m=-\infty}^{\infty} x_m z^{-m}\Big| < \infty \Big\} \qquad (117)$$







Definition 6: The inverse $\mathcal{Z}$-transform of $X(z)$ is defined as the sequence $\{x_m\}_{m=-\infty}^{\infty}$ where











$$x_m = \mathcal{Z}^{-1}\{X(z)\} = \frac{1}{2\pi j}\oint_C X(z)\, z^{m-1}\, dz, \qquad m\in\mathbb{Z}, \qquad (118)$$







where C is a counterclockwise closed path encircling the origin and entirely located in the region of convergence (ROC).


For a given ROC, there is a one-to-one correspondence between a sequence $x_m$ and its $\mathcal{Z}$-transform $X(z)$. Some properties of the $\mathcal{Z}$-transform, as well as some pairs of sequences and their $\mathcal{Z}$-transforms, are listed in Table 3 and Table 4, respectively.









TABLE 3 — Properties of the $\mathcal{Z}$-transform.

| Property | Time domain | $z$-domain | ROC |
| --- | --- | --- | --- |
| Linearity | $w_m = a x_m + b y_m$ | $W(z) = aX(z) + bY(z)$ | $\mathrm{ROC}_x \cap \mathrm{ROC}_y$ |
| Convolution | $w_m = x_m * y_m$ | $W(z) = X(z)Y(z)$ | $\mathrm{ROC}_x \cap \mathrm{ROC}_y$ |
| Differentiation | $w_m = m x_m$ | $W(z) = -z\,\dfrac{dX(z)}{dz}$ | $\mathrm{ROC}_x$ |
| Scaling in the $z$-domain | $w_m = a^{-m} x_m$ | $W(z) = X(a\cdot z)$ | $\mathrm{ROC}_x / \lvert a\rvert$ |
| (Generalized) accumulation | $w_m = \sum_{t=-\infty}^{m} a^{m-t} x_t$ | $W(z) = \dfrac{1}{1-az^{-1}}\,X(z)$ | $\mathrm{ROC}_x \cap \{z : \lvert z\rvert > \lvert a\rvert\}$ |
| Time shifting | $w_m = x_{m-b}$ | $W(z) = z^{-b}X(z)$ | $\mathrm{ROC}_x$ |
















TABLE 4 — Some useful pairs of the $\mathcal{Z}$-transform.

| Sequence | $\mathcal{Z}$-transform | ROC |
| --- | --- | --- |
| $x_m = \delta(m)$ | $X(z) = 1$ | all $z\in\mathbb{C}$ |
| $x_m = \binom{r}{m}$ | $X(z) = (1+z^{-1})^r$ | all $z\in\mathbb{C}$ |
| $x_m = \binom{m+b-1}{m}\,a^m,\ b\in\mathbb{Z}_+$ | $X(z) = \dfrac{1}{(1-az^{-1})^b}$ | $\lvert z\rvert > \lvert a\rvert$ |
| $x_m = \binom{b}{m}\,a^m,\ b\in\mathbb{Z}_+$ | $X(z) = (1+az^{-1})^b$ | all $z\in\mathbb{C}$ |
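The third pair in Table 4 can be spot-checked by multiplying the claimed power series of $1/(1-az^{-1})^b$ (in powers of $z^{-1}$) by the polynomial $(1-az^{-1})^b$ and confirming the product is identically 1. A small sketch; the values a = 2, b = 4 are arbitrary choices:

```python
from math import comb

# Spot-check of the third pair in Table 4 (sample values a = 2, b = 4):
# the coefficients of 1/(1 - a z^{-1})^b in powers of z^{-1} are C(m+b-1, m) a^m.
a, b, N = 2, 4, 12
series = [comb(m + b - 1, m) * a ** m for m in range(N)]   # claimed expansion
denom = [comb(b, j) * (-a) ** j for j in range(b + 1)]     # (1 - a x)^b as a polynomial
prod = [sum(denom[j] * series[m - j] for j in range(min(b, m) + 1))
        for m in range(N)]
assert prod == [1] + [0] * (N - 1)                         # product is identically 1
```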









The techniques described herein start from the definition of pm and use (33) to obtain a recursive equation. For any m with m≠0,










$$p_m = \wp_{\mu-m} = \sum_{j=(\mu-m)+1}^{\mu} \wp_j\,\big(j-(\mu-m)-1\big)\binom{d-k+1}{j-(\mu-m)} \qquad (119)$$
$$= \sum_{t=1}^{m} \wp_{t+\mu-m}\,(t-1)\binom{d-k+1}{t} \qquad (120)$$
$$= \sum_{t=1}^{m} p_{m-t}\,(t-1)\binom{d-k+1}{t}, \qquad (121)$$







where (119) is implied by (33) (with $\wp_m$ denoting the sequence defined in (33)–(34), so that $p_m = \wp_{\mu-m}$), and in (120) the techniques described herein used the change of variable $t = j-\mu+m$. Note that the summand in (121) corresponding to $t=0$ is
$$p_{m-0}\cdot(0-1)\binom{d-k+1}{0} = -p_m.$$








Hence, by including t=0 in the summation,














$$\sum_{t=0}^{m} p_{m-t}\,(t-1)\binom{d-k+1}{t} = 0, \qquad m\neq 0. \qquad (122)$$







Finally, for m=0,













$$\sum_{t=0}^{0} p_{0-t}\,(t-1)\binom{d-k+1}{t} = -p_0 = -\wp_{\mu} = -1. \qquad (123)$$







Putting (122) and (123) together,













$$\sum_{t=0}^{m} p_{m-t}\,(t-1)\binom{d-k+1}{t} = -\delta_m, \qquad \forall m\in\mathbb{Z}. \qquad (124)$$







Next, define a sequence







$$q_t = (t-1)\binom{d-k+1}{t}$$







for every integer t. Note that qt=0 for t<0. Then (124) can be rewritten as










$$-\delta_m = \sum_{t=0}^{m} p_{m-t}\,(t-1)\binom{d-k+1}{t} = \sum_{t=0}^{m} p_{m-t}\, q_t = \sum_{t=-\infty}^{\infty} p_{m-t}\, q_t \qquad (125)$$
$$= p_m * q_m, \qquad (126)$$







where (125) holds since $q_t = 0$ for $t<0$, and $p_{m-t} = \wp_{\mu+(t-m)} = 0$ for $t>m$ (see the definition of $\wp_m$ in (34)). Here, the operator $*$ denotes the convolution of the two sequences $p_m$ and $q_m$. The techniques described herein can take the $\mathcal{Z}$-transform of both sides of (126) and use Table 3 and Table 4 to obtain

P(z)Q(z)=−1.  (127)


The $\mathcal{Z}$-transform of $q_m$ can be easily found using the properties and pairs from Table 3 and Table 4, as follows.










Q


(
z
)


=


𝒵


{

q
m

}


=


𝒵


{


(

m
-
1

)



(




d
-
k
+
1





m



)


}


=


𝒵


{

m


(




d
-
k
+
1





m



)


}


-

𝒵


{

(




d
-
k
+
1





m



)

}









(
128
)











=



-
z








d





𝒵


{

(




d
-
k
+
1





m



)

}


dz


-

𝒵


{

(




d
-
k
+
1





m



)

}








(
129
)











=



-
z



d
dz




(

1
+

z

-
1



)


d
-
k
+
1



-


(

1
+

z

-
1



)


d
-
k
+
1








(
130
)






=




-

z


(

d
-
k
+
1

)





(

-

z

-
2



)




(

1
+

z

-
1



)


d
-
k



-


(

1
+

z

-
1



)


d
-
k
+
1



=




(

1
+

z

-
1



)


d
-
k




[



(

d
-
k

)



z

-
1



-
1

]


.






(
131
)







where (128) holds due to the linearity of the $\mathcal{Z}$-transform, in (129) the techniques described herein used the differentiation property, and in (130) the techniques described herein used the fourth pair in Table 4 with $a=1$ and $b=d-k+1$. Plugging (131) into (127),











$$P(z) = \frac{-1}{Q(z)} = \frac{1}{1-(d-k)z^{-1}}\Big(\frac{1}{1+z^{-1}}\Big)^{d-k}, \qquad (132)$$







where the region of convergence is given by ROCp={z:|z|>|d−k|}. It remains to find pm from P(z) by computing its inverse custom character-transform.










$$p_m = \mathcal{Z}^{-1}\{P(z)\} = \sum_{t=-\infty}^{m} (d-k)^{m-t}\,\Big[\mathcal{Z}^{-1}\Big\{\Big(\frac{1}{1+z^{-1}}\Big)^{d-k}\Big\}\Big]_t \qquad (133)$$
$$= \sum_{t=-\infty}^{m} (d-k)^{m-t}\cdot(-1)^t\binom{t+d-k-1}{t} \qquad (134)$$
$$= \sum_{t=0}^{m} (-1)^t (d-k)^{m-t}\binom{t+d-k-1}{t}, \qquad (135)$$







where in (133) the techniques described herein used the generalized accumulation rule in Table 3 with $a = d-k$. It is worth mentioning that the inverse $\mathcal{Z}$-transform of $(1/(1+z^{-1}))^{d-k}$ is taken with respect to the variable $t$. To this end, in (134) the techniques described herein used the third pair in Table 4 with $a=-1$ and $b=d-k$. Finally, in (135) the range of $t$ is limited by noticing that the binomial coefficient is zero for $t<0$. This shows the desired identity and completes the proof.
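The closed form (135) can be verified against the recursion (124) numerically; the parameters d = 7, k = 3 below are arbitrary sample choices:

```python
from math import comb

# Check (sample parameters d = 7, k = 3): the closed form (135),
#   p_m = sum_{t=0}^{m} (-1)^t (d-k)^(m-t) C(t+d-k-1, t),
# satisfies the recursion (124):
#   sum_{t=0}^{m} p_{m-t} (t-1) C(d-k+1, t) = -1 if m == 0 else 0.
d, k = 7, 3

def p(m):
    return sum((-1) ** t * (d - k) ** (m - t) * comb(t + d - k - 1, t)
               for t in range(m + 1))

for m in range(10):
    lhs = sum(p(m - t) * (t - 1) * comb(d - k + 1, t) for t in range(m + 1))
    assert lhs == (-1 if m == 0 else 0)
```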


Let us define








$$u_\mu = \sum_{m=-\infty}^{\infty} p_{\mu-m}\binom{d+a}{m+b}, \qquad v_\mu = \sum_{m=-b}^{\mu} (d-k)^{\mu-m}\binom{k+a}{m+b} = \sum_{m=0}^{\mu+b} (d-k)^{\mu+b-m}\binom{k+a}{m},$$




for every integer $\mu$. The claim of this lemma is equivalent to $u_\mu = v_\mu$ for all $\mu\in\mathbb{Z}$. Instead of directly showing the identity in the $\mu$-domain, the techniques described herein prove that the two sequences are identical in the $z$-domain and have the same ROCs.










$$U(z) = \mathcal{Z}\Big\{\sum_{m=-\infty}^{\infty} p_{\mu-m}\binom{d+a}{m+b}\Big\} = \mathcal{Z}\Big\{p_\mu * \binom{d+a}{\mu+b}\Big\} = \mathcal{Z}\{p_\mu\}\cdot\mathcal{Z}\Big\{\binom{d+a}{\mu+b}\Big\} \qquad (136)$$
$$= P(z)\cdot z^{b}\,\mathcal{Z}\Big\{\binom{d+a}{\mu}\Big\} \qquad (137)$$
$$= \frac{1}{1-(d-k)z^{-1}}\Big(\frac{1}{1+z^{-1}}\Big)^{d-k}\cdot z^b\cdot(1+z^{-1})^{d+a} \qquad (138)$$
$$= z^b\,\frac{1}{1-(d-k)z^{-1}}\,(1+z^{-1})^{k+a}, \qquad (139)$$







where in (136) and (137) the techniques described herein used the convolution and time-shift properties from Table 3, respectively. Moreover, the techniques described herein have used (132) and the $\mathcal{Z}$-transform pairs in Table 4 to simplify (138). Note that $\mathrm{ROC}_u = \mathrm{ROC}_p = \{z : |z| > |d-k|\}$.


Similarly, for sequence {vμ}










$$V(z) = \mathcal{Z}\Big\{\sum_{m=0}^{\mu+b} (d-k)^{\mu+b-m}\binom{k+a}{m}\Big\} = z^{-(-b)}\cdot\mathcal{Z}\Big\{\sum_{m=0}^{\mu} (d-k)^{\mu-m}\binom{k+a}{m}\Big\} \qquad (140)$$
$$= z^b\cdot\mathcal{Z}\Big\{\sum_{m=-\infty}^{\mu} (d-k)^{\mu-m}\binom{k+a}{m}\Big\} \qquad (141)$$
$$= z^b\cdot\frac{1}{1-(d-k)z^{-1}}\,\mathcal{Z}\Big\{\binom{k+a}{m}\Big\} \qquad (142)$$
$$= z^b\cdot\frac{1}{1-(d-k)z^{-1}}\,(1+z^{-1})^{k+a}, \qquad (143)$$







where in (140) and (142) the techniques described herein used the time-shift property and the generalized accumulation property from Table 3, respectively. Moreover, (141) holds because $\binom{k+a}{m}$ is zero for $m<0$, and (143) follows the $\mathcal{Z}$-transform pairs in Table 4. Also, it is worth noting that the ROC of $\{v_\mu\}$ is given by $\mathrm{ROC}_v = \{z : |z| > |d-k|\}$, due to the step in (142). Comparing (138) and (143) and their ROCs implies that the sequences $\{u_\mu\}$ and $\{v_\mu\}$ are identical. This completes the proof of the lemma.
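The lemma can likewise be spot-checked numerically; the parameters d = 7, k = 3, a = 1, b = 2 below are arbitrary sample choices:

```python
from math import comb

# Check of the lemma (sample parameters d = 7, k = 3, a = 1, b = 2):
#   u_mu = sum_m p_{mu-m} C(d+a, m+b)  equals
#   v_mu = sum_{m=0}^{mu+b} (d-k)^(mu+b-m) C(k+a, m),
# with p_m given by the closed form (135).
d, k, a, b = 7, 3, 1, 2

def p(m):
    if m < 0:
        return 0
    return sum((-1) ** t * (d - k) ** (m - t) * comb(t + d - k - 1, t)
               for t in range(m + 1))

for mu in range(8):
    u = sum(p(mu - m) * comb(d + a, m + b) for m in range(-b, mu + 1))
    v = sum((d - k) ** (mu + b - m) * comb(k + a, m) for m in range(mu + b + 1))
    assert u == v
```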


Proof of Proposition~\ref{prop:node:rep}: The proof technique here is similar to that used in (47). The techniques described herein start with the RHS of (9), and plug in the entries of the matrices $R^{f,(m)}$, $\Xi^{f,(m)}$ and $D$ to expand it. Next, the techniques described herein split the terms in the summation into v-symbols ($v_{x,\mathcal{I}}$ with $x\in\mathcal{I}$) and w-symbols ($w_{x,\mathcal{I}}$ with $x\notin\mathcal{I}$), and then prove the identity for the v-symbols and w-symbols separately. The details of the derivation are given below.













$$\sum_{x\in\mathcal{J}} (-1)^{\mathrm{ind}_{\mathcal{J}}(x)}\big[R^{f,(m)}\big]_{x,\,\mathcal{J}\setminus\{x\}} = \sum_{x\in\mathcal{J}} (-1)^{\mathrm{ind}_{\mathcal{J}}(x)}\big[D\cdot\Xi^{f,(m)}\big]_{x,\,\mathcal{J}\setminus\{x\}}$$
$$= \sum_{x\in\mathcal{J}} (-1)^{\mathrm{ind}_{\mathcal{J}}(x)} \sum_{\mathcal{L}\subseteq[d],\,|\mathcal{L}|=m} D_{x,\mathcal{L}}\cdot\Xi^{f,(m)}_{\mathcal{L},\,\mathcal{J}\setminus\{x\}}$$
$$= \sum_{x\in\mathcal{J}} (-1)^{\mathrm{ind}_{\mathcal{J}}(x)} \sum_{y\in[d]\setminus(\mathcal{J}\setminus\{x\})} D_{x,\,(\mathcal{J}\setminus\{x\})\cup\{y\}}\cdot\Xi^{f,(m)}_{(\mathcal{J}\setminus\{x\})\cup\{y\},\,\mathcal{J}\setminus\{x\}} \qquad (144)$$
$$= \sum_{x\in\mathcal{J}} (-1)^{\mathrm{ind}_{\mathcal{J}}(x)}\Big[D_{x,\mathcal{J}}\,\Xi^{f,(m)}_{\mathcal{J},\,\mathcal{J}\setminus\{x\}} + \sum_{y\in[d]\setminus\mathcal{J}} D_{x,\,(\mathcal{J}\setminus\{x\})\cup\{y\}}\cdot\Xi^{f,(m)}_{(\mathcal{J}\setminus\{x\})\cup\{y\},\,\mathcal{J}\setminus\{x\}}\Big] \qquad (145)$$
$$= \sum_{x\in\mathcal{J}} (-1)^{\mathrm{ind}_{\mathcal{J}}(x)}\Big[(-1)^{\mathrm{ind}_{\mathcal{J}}(x)}\,\psi_{f,x}\,D_{x,\mathcal{J}} + \sum_{y\in[d]\setminus\mathcal{J}} (-1)^{\mathrm{ind}_{(\mathcal{J}\setminus\{x\})\cup\{y\}}(y)}\,\psi_{f,y}\,D_{x,\,(\mathcal{J}\setminus\{x\})\cup\{y\}}\Big] \qquad (146)$$
$$= \sum_{x\in\mathcal{J}} \psi_{f,x}\,D_{x,\mathcal{J}} + \sum_{y\in[d]\setminus\mathcal{J}}\sum_{x\in\mathcal{J}} (-1)^{\mathrm{ind}_{\mathcal{J}}(x)}(-1)^{\mathrm{ind}_{(\mathcal{J}\setminus\{x\})\cup\{y\}}(y)}\,\psi_{f,y}\,D_{x,\,(\mathcal{J}\setminus\{x\})\cup\{y\}} \qquad (147)$$
$$= \sum_{x\in\mathcal{J}} \psi_{f,x}\,D_{x,\mathcal{J}} + \sum_{y\in[d]\setminus\mathcal{J}} \psi_{f,y}\,(-1)^{\mathrm{ind}_{\mathcal{J}\cup\{y\}}(y)+1}\sum_{x\in\mathcal{J}} (-1)^{\mathrm{ind}_{\mathcal{J}\cup\{y\}}(x)}\,D_{x,\,(\mathcal{J}\setminus\{x\})\cup\{y\}} \qquad (148)$$
$$= \sum_{x\in\mathcal{J}} \psi_{f,x}\,D_{x,\mathcal{J}} + \sum_{y\in[d]\setminus\mathcal{J}} (-1)^{\mathrm{ind}_{\mathcal{J}\cup\{y\}}(y)+1}\,\psi_{f,y}\sum_{x\in\mathcal{J}} (-1)^{\mathrm{ind}_{\mathcal{J}\cup\{y\}}(x)}\,w_{x,\,\mathcal{J}\cup\{y\}} \qquad (149)$$
$$= \sum_{x\in\mathcal{J}} \psi_{f,x}\,D_{x,\mathcal{J}} + \sum_{y\in[d]\setminus\mathcal{J}} (-1)^{\mathrm{ind}_{\mathcal{J}\cup\{y\}}(y)+1}\,\psi_{f,y}\Big[(-1)^{\mathrm{ind}_{\mathcal{J}\cup\{y\}}(y)+1}\,w_{y,\,\mathcal{J}\cup\{y\}}\Big] \qquad (150)$$
$$= \sum_{x\in\mathcal{J}} \psi_{f,x}\,D_{x,\mathcal{J}} + \sum_{y\in[d]\setminus\mathcal{J}} (-1)^{\mathrm{ind}_{\mathcal{J}\cup\{y\}}(y)+1}\,\psi_{f,y}\Big[(-1)^{\mathrm{ind}_{\mathcal{J}\cup\{y\}}(y)+1}\,D_{y,\mathcal{J}}\Big] \qquad (151)$$
$$= \sum_{x\in\mathcal{J}} \psi_{f,x}\,D_{x,\mathcal{J}} + \sum_{y\in[d]\setminus\mathcal{J}} \psi_{f,y}\,D_{y,\mathcal{J}} \qquad (152)$$
$$= \sum_{x\in[d]} \psi_{f,x}\,D_{x,\mathcal{J}} = \big[\psi_f\,D\big]_{\mathcal{J}}. \qquad (153)$$







The critical steps of the proof can be justified as follows.

    • In (144), the definition of $\Xi^{f,(m)}$ from (7) is used, which implies $\Xi^{f,(m)}_{\mathcal{L},\,\mathcal{J}\setminus\{x\}}$ is non-zero only if $\mathcal{L}$ includes $\mathcal{J}\setminus\{x\}$;
    • In (145), the summation is split into two cases: $y=x$ and $y\neq x$;
    • In (146), $\Xi^{f,(m)}_{(\mathcal{J}\setminus\{x\})\cup\{y\},\,\mathcal{J}\setminus\{x\}}$ is replaced by $(-1)^{\mathrm{ind}_{(\mathcal{J}\setminus\{x\})\cup\{y\}}(y)}\,\psi_{f,y}$ from its definition in (7);
    • In (147), the two summations over $x$ and $y$ are swapped;
    • In (148), the identity












$$(-1)^{\mathrm{ind}_{(\mathcal{J}\setminus\{x\})\cup\{y\}}(y)}\,(-1)^{\mathrm{ind}_{\mathcal{J}}(x)} = (-1)^{\mathrm{ind}_{\mathcal{J}\cup\{y\}}(y)+1}\,(-1)^{\mathrm{ind}_{\mathcal{J}\cup\{y\}}(x)}$$


















is used. In order to prove the identity, consider two cases:

    • If x<y, then






$$\mathrm{ind}_{(\mathcal{J}\setminus\{x\})\cup\{y\}}(y) = \mathrm{ind}_{\mathcal{J}\cup\{y\}}(y) - 1, \qquad \mathrm{ind}_{\mathcal{J}}(x) = \mathrm{ind}_{\mathcal{J}\cup\{y\}}(x).$$







    • If x>y, then









$$\mathrm{ind}_{(\mathcal{J}\setminus\{x\})\cup\{y\}}(y) = \mathrm{ind}_{\mathcal{J}\cup\{y\}}(y), \qquad \mathrm{ind}_{\mathcal{J}}(x) = \mathrm{ind}_{\mathcal{J}\cup\{y\}}(x) - 1.$$







    • In (149), since $x\notin(\mathcal{J}\setminus\{x\})\cup\{y\}$, the techniques described herein have
$$D_{x,\,(\mathcal{J}\setminus\{x\})\cup\{y\}} = w_{x,\,\mathcal{J}\cup\{y\}};$$














    • In (150), the parity equation (5) is used. In particular,

















$$\sum_{x\in\mathcal{J}\cup\{y\}} (-1)^{\mathrm{ind}_{\mathcal{J}\cup\{y\}}(x)}\,w_{x,\,\mathcal{J}\cup\{y\}} = 0,$$













which implies













$$\sum_{x\in\mathcal{J}} (-1)^{\mathrm{ind}_{\mathcal{J}\cup\{y\}}(x)}\,w_{x,\,\mathcal{J}\cup\{y\}} = -(-1)^{\mathrm{ind}_{\mathcal{J}\cup\{y\}}(y)}\,w_{y,\,\mathcal{J}\cup\{y\}}.$$














This completes the proof.


Proof of Proposition~\ref{lm:beta}: In order to show that the repair bandwidth constraint is fulfilled, the techniques described herein show that the rank of the matrix $\Xi^{f,(m)}$ is at most
$$\beta(m) = \binom{d-1}{m-1}.$$






First, note that it is easy to verify the claim for $m=1$, since the matrix $\Xi^{f,(1)}$ has only one column, labeled by $\varnothing$, and hence its rank is at most $1 = \binom{d-1}{1-1}$. For $m>1$, the techniques described herein partition the columns of the matrix into two disjoint groups of size
$$\beta(m) = \binom{d-1}{m-1} \qquad\text{and}\qquad \binom{d}{m-1} - \beta(m) = \binom{d-1}{m-2},$$
and show that each column in the second group can be written as a linear combination of the columns in the first group. This implies that the rank of the matrix does not exceed the number of columns in the first group, which is exactly $\beta(m)$.
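The two group sizes indeed partition all $\binom{d}{m-1}$ column labels, which is just the Pascal identity; a quick check:

```python
from math import comb

# The two column groups partition all C(d, m-1) labels: those avoiding x
# (C(d-1, m-1) of them) and those containing x (C(d-1, m-2) of them) -- Pascal.
for d in range(2, 10):
    for m in range(2, d + 1):
        assert comb(d, m - 1) == comb(d - 1, m - 1) + comb(d - 1, m - 2)
```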


To form the groups, the techniques described herein pick some $x\in[d]$ such that $\psi_{f,x}\neq 0$. Then the first group is the set of all columns whose label is a subset of $[d]\setminus\{x\}$. Recall that the columns of $\Xi^{f,(m)}$ are labeled with $(m-1)$-element subsets of $[d]$. Hence, the number of columns in the first group is $\binom{d-1}{m-1}$. The second group is formed by those columns for which $x$ appears in their label.


Without loss of generality, $x=d$, i.e., $\psi_{f,d}\neq 0$, and hence the first group consists of columns $\mathcal{J}$ such that $\mathcal{J}\subset[d-1]$, and the second group includes those $\mathcal{J}$'s such that $d\in\mathcal{J}$. For every $\mathcal{J}$ with $d\in\mathcal{J}$,











$$\Xi^{f,(m)}_{:,\,\mathcal{J}} = (-1)^m\,\psi_{f,d}^{-1}\sum_{y\in[d-1]\setminus\mathcal{I}} (-1)^{\mathrm{ind}_{\mathcal{I}\cup\{y\}}(y)}\,\psi_{f,y}\,\Xi^{f,(m)}_{:,\,\mathcal{I}\cup\{y\}}, \qquad (154)$$







where $\mathcal{I} = \mathcal{J}\setminus\{d\}$. Note that all the columns appearing in the RHS of (154) belong to the first group. Given the facts that $|\mathcal{J}| = m-1$ and $\mathrm{ind}_{\mathcal{J}}(d) = m-1$, the equation in (154) is equivalent to













$$\sum_{y\in[d]\setminus\mathcal{I}} (-1)^{\mathrm{ind}_{\mathcal{I}\cup\{y\}}(y)}\,\psi_{f,y}\,\Xi^{f,(m)}_{:,\,\mathcal{I}\cup\{y\}} = 0. \qquad (155)$$







Let us focus on an arbitrarily chosen row of the matrix, labeled by $\mathcal{L}$, where $\mathcal{L}\subseteq[d]$ with $|\mathcal{L}| = m$. The $\mathcal{L}$-th entry of the column in the LHS of (155) is given by








$$\Big[\sum_{y\in[d]\setminus\mathcal{I}} (-1)^{\mathrm{ind}_{\mathcal{I}\cup\{y\}}(y)}\,\psi_{f,y}\,\Xi^{f,(m)}_{:,\,\mathcal{I}\cup\{y\}}\Big]_{\mathcal{L}} = \sum_{y\in[d]\setminus\mathcal{I}} (-1)^{\mathrm{ind}_{\mathcal{I}\cup\{y\}}(y)}\,\psi_{f,y}\,\Xi^{f,(m)}_{\mathcal{L},\,\mathcal{I}\cup\{y\}}.$$







First assume $\mathcal{L}\not\supseteq\mathcal{I}$. This, together with the definition of $\Xi^{f,(m)}$, implies that $\Xi^{f,(m)}_{\mathcal{L},\,\mathcal{I}\cup\{y\}} = 0$ for any $y$, and hence all the terms in the LHS of (155) are zero.


Next, consider an $\mathcal{L}$ such that $\mathcal{L}\supseteq\mathcal{I}$. Since $|\mathcal{I}| = |\mathcal{J}\setminus\{d\}| = m-2$ and $|\mathcal{L}| = m$, the techniques described herein have $\mathcal{L} = \mathcal{I}\cup\{y_1,y_2\}$, where $y_1 < y_2$ are elements of $[d]$. Note that for $y\notin\{y_1,y_2\}$ the techniques described herein have $\Xi^{f,(m)}_{\mathcal{L},\,\mathcal{I}\cup\{y\}} = 0$, since $\mathcal{I}\cup\{y\}\not\subset\mathcal{L}$. Therefore, (155) can be simplified as











$$\Big[\sum_{y\in[d]\setminus\mathcal{I}} (-1)^{\mathrm{ind}_{\mathcal{I}\cup\{y\}}(y)}\,\psi_{f,y}\,\Xi^{f,(m)}_{:,\,\mathcal{I}\cup\{y\}}\Big]_{\mathcal{L}} = \sum_{y\in\{y_1,y_2\}} (-1)^{\mathrm{ind}_{\mathcal{I}\cup\{y\}}(y)}\,\psi_{f,y}\,\Xi^{f,(m)}_{\mathcal{L},\,\mathcal{I}\cup\{y\}} \qquad (156)$$
$$= (-1)^{\mathrm{ind}_{\mathcal{I}\cup\{y_1\}}(y_1)}\,\psi_{f,y_1}\,\Xi^{f,(m)}_{\mathcal{L},\,\mathcal{I}\cup\{y_1\}} + (-1)^{\mathrm{ind}_{\mathcal{I}\cup\{y_2\}}(y_2)}\,\psi_{f,y_2}\,\Xi^{f,(m)}_{\mathcal{L},\,\mathcal{I}\cup\{y_2\}}$$
$$= (-1)^{\mathrm{ind}_{\mathcal{I}\cup\{y_1\}}(y_1)}\,\psi_{f,y_1}\cdot(-1)^{\mathrm{ind}_{\mathcal{L}}(y_2)}\,\psi_{f,y_2} + (-1)^{\mathrm{ind}_{\mathcal{I}\cup\{y_2\}}(y_2)}\,\psi_{f,y_2}\cdot(-1)^{\mathrm{ind}_{\mathcal{L}}(y_1)}\,\psi_{f,y_1} \qquad (157)$$
$$= \psi_{f,y_1}\,\psi_{f,y_2}\Big[(-1)^{\mathrm{ind}_{\mathcal{L}\setminus\{y_2\}}(y_1)+\mathrm{ind}_{\mathcal{L}}(y_2)} + (-1)^{\mathrm{ind}_{\mathcal{L}\setminus\{y_1\}}(y_2)+\mathrm{ind}_{\mathcal{L}}(y_1)}\Big] = 0, \qquad (158)$$







where (156) and (157) are due to the definition of $\Xi^{f,(m)}$, and in (158) the techniques described herein used the facts that $\mathrm{ind}_{\mathcal{L}\setminus\{y_2\}}(y_1) = \mathrm{ind}_{\mathcal{L}}(y_1)$ and $\mathrm{ind}_{\mathcal{L}\setminus\{y_1\}}(y_2) = \mathrm{ind}_{\mathcal{L}}(y_2) - 1$, which hold for $y_1, y_2\in\mathcal{L}$ with $y_1 < y_2$. This completes the proof.


Proof of Proposition~\ref{prop:multi-beta}: The claim of this proposition for $e > d$ is equivalent to bounding the number of repair symbols from the helper node by
$$\alpha = \binom{d}{m} \qquad \Big(\text{because } \binom{d-e}{m} = 0\Big),$$





which is clearly true since each storage node does not store more than $\alpha$ symbols. Thus, the techniques described herein can limit the attention to $e\leq d$. In order to prove the claim, the techniques described herein show that the row rank of $\Xi^{\mathcal{E},(m)}$ does not exceed $\binom{d}{m} - \binom{d-e}{m}$. Recall that $\Xi^{\mathcal{E},(m)}$ is a $\binom{d}{m}\times e\binom{d}{m-1}$ matrix, and it suffices to identify $\binom{d-e}{m}$ linearly independent vectors in the left null-space of $\Xi^{\mathcal{E},(m)}$. To this end, the techniques described herein introduce a full-rank matrix $Y^{\mathcal{E},(m)}$ of size $\binom{d-e}{m}\times\binom{d}{m}$ and show that $Y^{\mathcal{E},(m)}\cdot\Xi^{\mathcal{E},(m)} = 0$.


Step 1 (Construction of $Y^{\mathcal{E},(m)}$): Let $\Psi[\mathcal{E},:]$ be the $e\times d$ matrix obtained from the rows $f_1, f_2, \ldots, f_e$ of the encoder matrix $\Psi$. Recall that $\Psi[\mathcal{E},:]$ is full-rank (since any $d$ rows of $\Psi$ are linearly independent, and $e\leq d$). Hence, there exists a subset $\mathcal{Q}$ of the columns of $\Psi[\mathcal{E},:]$ with $|\mathcal{Q}| = e$, denoted by $\Psi[\mathcal{E},\mathcal{Q}]$, such that $\det(\Psi[\mathcal{E},\mathcal{Q}])\neq 0$.


The desired matrix $Y^{\mathcal{E},(m)}$ is of size $\binom{d-e}{m}\times\binom{d}{m}$. The techniques described herein label its rows by $m$-element subsets of $[d]\setminus\mathcal{Q}$, and its columns by $m$-element subsets of $[d]$. Then the entry at row $\mathcal{J}$ and column $\mathcal{I}$ is defined as
$$Y^{\mathcal{E},(m)}_{\mathcal{J},\mathcal{I}} = \begin{cases} (-1)^{\sigma}\,\det\big(\Psi[\mathcal{E},\,(\mathcal{J}\cup\mathcal{Q})\setminus\mathcal{I}]\big) & \text{if } \mathcal{I}\subseteq\mathcal{J}\cup\mathcal{Q}, \\ 0 & \text{if } \mathcal{I}\not\subseteq\mathcal{J}\cup\mathcal{Q}, \end{cases} \qquad (159)$$







where $\sigma = \sum_{j\in\mathcal{I}}\mathrm{ind}_{\mathcal{J}\cup\mathcal{Q}}(j)$. Note that $\mathcal{J}\subseteq[d]\setminus\mathcal{Q}$, and hence $|\mathcal{J}\cup\mathcal{Q}| = m+e$.


Step 2 (Orthogonality of $Y^{\mathcal{E},(m)}$ to $\Xi^{\mathcal{E},(m)}$): The techniques described herein prove this claim for each segment of $\Xi^{\mathcal{E},(m)}$. More precisely, for each $f\in\mathcal{E}$, the techniques described herein prove $Y^{\mathcal{E},(m)}\cdot\Xi^{f,(m)} = 0$. Consider some $f\in\mathcal{E}$, and arbitrary indices $\mathcal{J}$ and $\mathcal{I}$ for the rows and columns, respectively.














$$\big[Y^{\mathcal{E},(m)}\cdot\Xi^{f,(m)}\big]_{\mathcal{J},\mathcal{I}} = \sum_{\mathcal{L}\subseteq[d],\,|\mathcal{L}|=m} Y^{\mathcal{E},(m)}_{\mathcal{J},\mathcal{L}}\cdot\Xi^{f,(m)}_{\mathcal{L},\mathcal{I}} \qquad (160)$$
$$= \sum_{x\in[d]\setminus\mathcal{I}} Y^{\mathcal{E},(m)}_{\mathcal{J},\,\mathcal{I}\cup\{x\}}\cdot\Xi^{f,(m)}_{\mathcal{I}\cup\{x\},\,\mathcal{I}} \qquad (161)$$
$$= \sum_{x\in(\mathcal{J}\cup\mathcal{Q})\setminus\mathcal{I}} Y^{\mathcal{E},(m)}_{\mathcal{J},\,\mathcal{I}\cup\{x\}}\cdot\Xi^{f,(m)}_{\mathcal{I}\cup\{x\},\,\mathcal{I}} \qquad (162)$$
$$= \sum_{x\in(\mathcal{J}\cup\mathcal{Q})\setminus\mathcal{I}} \Big[(-1)^{\sum_{j\in\mathcal{I}\cup\{x\}}\mathrm{ind}_{\mathcal{J}\cup\mathcal{Q}}(j)}\,\det\big(\Psi[\mathcal{E},\,(\mathcal{J}\cup\mathcal{Q})\setminus(\mathcal{I}\cup\{x\})]\big)\Big]\cdot\Big[(-1)^{\mathrm{ind}_{\mathcal{I}\cup\{x\}}(x)}\,\psi_{f,x}\Big] \qquad (163)$$
$$= (-1)^{\sum_{j\in\mathcal{I}}\mathrm{ind}_{\mathcal{J}\cup\mathcal{Q}}(j)-1}\sum_{x\in(\mathcal{J}\cup\mathcal{Q})\setminus\mathcal{I}} (-1)^{\mathrm{ind}_{(\mathcal{J}\cup\mathcal{Q})\setminus\mathcal{I}}(x)}\,\psi_{f,x}\,\det\big(\Psi[\mathcal{E},\,((\mathcal{J}\cup\mathcal{Q})\setminus\mathcal{I})\setminus\{x\}]\big) \qquad (164)$$
$$= (-1)^{\sum_{j\in\mathcal{I}}\mathrm{ind}_{\mathcal{J}\cup\mathcal{Q}}(j)}\,\det\!\begin{bmatrix} \Psi[f,\,(\mathcal{J}\cup\mathcal{Q})\setminus\mathcal{I}] \\ \Psi[\mathcal{E},\,(\mathcal{J}\cup\mathcal{Q})\setminus\mathcal{I}] \end{bmatrix} \qquad (165)$$
$$= 0. \qquad (166)$$







Note that

    • In (161) the techniques described herein have used the fact that $\Xi^{f,(m)}_{\mathcal{L},\mathcal{I}}$ is non-zero only if $\mathcal{L} = \mathcal{I}\cup\{x\}$ for some $x\in[d]\setminus\mathcal{I}$.
    • The range of $x$ is further limited in (162) due to the fact that $Y^{\mathcal{E},(m)}_{\mathcal{J},\,\mathcal{I}\cup\{x\}}$ is non-zero only if $\mathcal{I}\cup\{x\}\subseteq\mathcal{J}\cup\mathcal{Q}$, which implies $x\in(\mathcal{J}\cup\mathcal{Q})\setminus\mathcal{I}$.
    • In (163), the entries of the matrix product are replaced by their definitions.
    • The equality in (164) is obtained by factoring $(-1)^{\sum_{j\in\mathcal{I}}\mathrm{ind}_{\mathcal{J}\cup\mathcal{Q}}(j)}$ and using the facts that $\mathcal{I}\cup\{x\}\subseteq\mathcal{J}\cup\mathcal{Q}$ and
$$\mathrm{ind}_{\mathcal{J}\cup\mathcal{Q}}(x) + \mathrm{ind}_{\mathcal{I}\cup\{x\}}(x) \equiv \mathrm{ind}_{(\mathcal{J}\cup\mathcal{Q})\setminus\mathcal{I}}(x) - 1 \pmod{2},$$
which follows by counting positions:
$$\mathrm{ind}_{(\mathcal{J}\cup\mathcal{Q})\setminus\mathcal{I}}(x) - 1 = \big|\{y\in\mathcal{J}\cup\mathcal{Q} : y < x\}\big| - \big|\{y\in\mathcal{I} : y < x\}\big|,$$
while $\mathrm{ind}_{\mathcal{J}\cup\mathcal{Q}}(x) + \mathrm{ind}_{\mathcal{I}\cup\{x\}}(x) = \big|\{y\in\mathcal{J}\cup\mathcal{Q} : y < x\}\big| + \big|\{y\in\mathcal{I} : y < x\}\big| + 2$.













    • The equality in (165) follows from the determinant expansion of the matrix with respect to its first row. Note that $|(\mathcal{J}\cup\mathcal{Q})\setminus\mathcal{I}| = e+1$, and hence it is a square matrix.

    • Finally, the determinant in (165) is zero, because $f\in\mathcal{E}$, and hence the matrix has two identical rows; this gives (166).





Step 3 (Full-rankness of $Y^{\mathcal{E},(m)}$): Recall that the rows and columns of $Y^{\mathcal{E},(m)}$ are labeled by $m$-element subsets of $[d]\setminus\mathcal{Q}$ and $m$-element subsets of $[d]$, respectively. Consider the sub-matrix of $Y^{\mathcal{E},(m)}$ whose column labels are subsets of $[d]\setminus\mathcal{Q}$. This is a $\binom{d-e}{m}\times\binom{d-e}{m}$ square sub-matrix. Note that the entry at position $(\mathcal{J},\mathcal{I})$ with $\mathcal{I}\neq\mathcal{J}$ is zero: since $\mathcal{I}\cap\mathcal{Q} = \varnothing$, the techniques described herein have $\mathcal{I}\not\subseteq\mathcal{J}\cup\mathcal{Q}$, and hence $Y^{\mathcal{E},(m)}_{\mathcal{J},\mathcal{I}} = 0$ (see (159)). Otherwise, if $\mathcal{I} = \mathcal{J}$ the techniques described herein have








$$\big[Y^{\mathcal{E},(m)}\big]_{\mathcal{J},\mathcal{J}} = (-1)^{\sum_{i\in\mathcal{J}}\mathrm{ind}_{\mathcal{J}\cup\mathcal{Q}}(i)}\,\det\big(\Psi[\mathcal{E},\mathcal{Q}]\big).$$









That is






$$\big[Y^{\mathcal{E},(m)}\big]_{\mathcal{J},\mathcal{I}} = \begin{cases} \pm\det\big(\Psi[\mathcal{E},\mathcal{Q}]\big) & \text{if } \mathcal{I} = \mathcal{J}, \\ 0 & \text{if } \mathcal{I}\neq\mathcal{J}. \end{cases}$$










This implies that $Y^{\mathcal{E},(m)}$ has a diagonal sub-matrix with diagonal entries $\pm\det(\Psi[\mathcal{E},\mathcal{Q}])$, which are non-zero (see Step 1), and hence $Y^{\mathcal{E},(m)}$ is a full-rank matrix. Therefore, the rows of $Y^{\mathcal{E},(m)}$ provide $\binom{d-e}{m}$ linearly independent vectors in the left null-space of $\Xi^{\mathcal{E},(m)}$, and thus the rank of $\Xi^{\mathcal{E},(m)}$ does not exceed
$$\beta_e(m) = \binom{d}{m} - \binom{d-e}{m}.$$
This completes the proof.


In this section, the techniques described herein prove Theorem 3. As mentioned before, this result is essentially obtained from Theorem 2, by exploiting the fact that in the centralized repair setting once one failed node is repaired, it can also participate in the repair process of the remaining failed nodes.


Proof of Theorem~\ref{thm:MulRep-improved}: Consider a set of $e$ failed nodes $\mathcal{E} = \{f_1, f_2, \ldots, f_e\}$, which are going to be repaired by a set of helper nodes $\mathcal{H}$ with $|\mathcal{H}| = d$. Recall from Theorem 2 that the repair data of node $h$ intended for a failed node $f$ (i.e., $\Psi_h\cdot D\cdot\Xi^{f,(m)}$) can be retrieved from the repair data that $h$ sends for the repair of a set of failed nodes $\mathcal{E}$ (i.e., $\Psi_h\cdot D\cdot\Xi^{\mathcal{E},(m)}$) where $f\in\mathcal{E}$. The techniques described herein use the following procedure in order to repair the failed nodes in $\mathcal{E}$.


1. First, node $f_1$ is repaired using the helper nodes $\mathcal{H} = \{h_1, h_2, \ldots, h_d\}$.


2. Having the failed nodes $\{f_1, \ldots, f_i\}$ repaired, the repair process of failed node $f_{i+1}$ is performed using helper nodes $\mathcal{H}_i = \{f_1, \ldots, f_i\}\cup\{h_{i+1}, h_{i+2}, \ldots, h_d\}$. This step is repeated for $i = 1, 2, \ldots, e-1$.


Note that the proposed repair process introduced in this disclosure is helper-independent, and hence the repair data sent to a failed node fi by a helper node h does not depend on the identity of the other helper nodes.
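The counting argument that follows can be sketched in code. This sketch assumes $\beta_j(m) = \binom{d}{m} - \binom{d-j}{m}$ for the bandwidth of a helper serving $j$ failed nodes (per the proposition above), and checks the schedule's total against the closed form (172); d = 6, e = 3, m = 2 are arbitrary sample parameters:

```python
from math import comb

# Sketch of the sequential repair schedule (assumption: beta_j(m) = C(d, m) - C(d-j, m)
# is the repair bandwidth of a helper serving j failed nodes).
d, e, m = 6, 3, 2

def beta(j):
    # math.comb(n, k) returns 0 when k > n, matching the binomial convention.
    return comb(d, m) - comb(d - j, m)

# Helper h_i (i = 1..e) serves failed nodes f_1..f_i; helpers h_{e+1}..h_d serve all e.
total = sum(beta(j) for j in range(1, e + 1)) + (d - e) * beta(e)

# Closed form (172): m * [C(d+1, m+1) - C(d-e+1, m+1)]
assert total == m * (comb(d + 1, m + 1) - comb(d - e + 1, m + 1))
```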


Using the procedure described above, helper node hi only participates in the repair of failed nodes {f1, f2, . . . , fi}, for i=1, 2, . . . , e, while the other helper nodes (i.e., hi for i=e+1, . . . , d) contribute in the repair of all the e failed nodes. Hence, the total repair data downloaded from the helper nodes in custom character to the central repair unit is given by













$$\sum_{j=1}^{e}\beta_j(m) + (d-e)\,\beta_e(m) \qquad (167)$$
$$= \sum_{j=1}^{e}\Big[\binom{d}{m} - \binom{d-j}{m}\Big] + (d-e)\Big[\binom{d}{m} - \binom{d-e}{m}\Big] = d\binom{d}{m} - \sum_{j=1}^{e}\binom{d-j}{m} - (d-e)\binom{d-e}{m} \qquad (168)$$
$$= d\binom{d}{m} - \Big[\binom{d}{m+1} - \binom{d-e}{m+1}\Big] - (d-e)\binom{d-e}{m} \qquad (169)$$
$$= (d+1)\binom{d}{m} - \Big[\binom{d}{m} + \binom{d}{m+1}\Big] - (d-e+1)\binom{d-e}{m} + \Big[\binom{d-e}{m} + \binom{d-e}{m+1}\Big] \qquad (170)$$
$$= (m+1)\binom{d+1}{m+1} - \binom{d+1}{m+1} - (m+1)\binom{d-e+1}{m+1} + \binom{d-e+1}{m+1} \qquad (171)$$
$$= m\Big[\binom{d+1}{m+1} - \binom{d-e+1}{m+1}\Big], \qquad (172)$$







where

    • in (169) the techniques described herein use the identity
$$\sum_{j=a}^{b}\binom{j}{m} = \sum_{j=m}^{b}\binom{j}{m} - \sum_{j=m}^{a-1}\binom{j}{m} = \binom{b+1}{m+1} - \binom{a}{m+1},$$






    • and the equality in (171) holds due to the Pascal identity
$$\binom{a}{m} + \binom{a}{m+1} = \binom{a+1}{m+1},$$
and the fact that
$$(a+1)\binom{a}{m} = (m+1)\binom{a+1}{m+1}.$$
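Both identities above are easy to confirm numerically:

```python
from math import comb

# Check the summation identity used in (169) and the Pascal-type identities in (171).
for m in range(4):
    for a in range(m, 8):
        for b in range(a, 10):
            assert sum(comb(j, m) for j in range(a, b + 1)) \
                == comb(b + 1, m + 1) - comb(a, m + 1)
        assert comb(a, m) + comb(a, m + 1) == comb(a + 1, m + 1)
        assert (a + 1) * comb(a, m) == (m + 1) * comb(a + 1, m + 1)
```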







Hence the average (per helper node) repair bandwidth is given by











$$\bar{\beta}_e(m) = \frac{1}{d}\Big[m\binom{d+1}{m+1} - m\binom{d-e+1}{m+1}\Big]. \qquad (173)$$







Note that the repair strategy described here is asymmetric, i.e., the repair bandwidths of the helper nodes are different. However, it can be simply symmetrized by concatenating d copies of the code to form a super code

custom character=Ψ·[D[1],D[2], . . . ,D[d]].




In the super code, each node stores a total of $d\cdot\alpha(m)$ symbols, including $\alpha$ symbols for each code segment, and the total storage capacity of the code is $d\cdot F(m)$. In a multiple failure scenario with failed nodes $\mathcal{E} = \{f_1, \ldots, f_e\}$ and helper nodes $\mathcal{H} = \{h_1, \ldots, h_d\}$, the codeword segments may be repaired separately, with the role of the helper nodes changing in a circular manner across segments. Then, the per-node repair bandwidth of the super code is exactly $d\cdot\bar{\beta}_e(m)$ as defined above. Note that the symmetry in the super code is obtained at the expense of the sub-packetization, which is scaled by a factor of $d$. This completes the proof.



FIG. 8 is a conceptual diagram illustrating the participation of the helper nodes in the multiple failure repair of the super-code, in accordance with the techniques described herein. Each cell labeled by hi and D[l] shows the set of failed nodes that receive repair data from helper node hi to repair their codeword segment corresponding to code segment D[l].



FIG. 9 is a block diagram of a detailed view of a node device that may be configured to perform one or more techniques in accordance with the current disclosure. FIG. 9 illustrates only one particular example of node device 12, and many other examples of node device 12 may be used in other instances and may include a subset of the components included in example node device 12 or may include additional components not shown in FIG. 9. For instance, node device 12 may include a battery or other power component or may not include a direct input component.


As shown in the example of FIG. 9, node device 12 includes one or more processors 60, one or more input components 62, one or more communication units 64, one or more output components 66, and one or more storage components 68. Storage components 68 of node device 12 include failure module 16. Communication channels 70 may interconnect each of the components 60, 64, 62, 66, and 68 for inter-component communications (physically, communicatively, and/or operatively). In some examples, communication channels 70 may include a system bus, a network connection, an inter-process communication data structure, or any other method for communicating data.


One or more communication units 64 of node device 12 may communicate with external devices via one or more wired and/or wireless networks by transmitting and/or receiving network signals on the one or more networks. Examples of communication units 64 include a network interface card (e.g., an Ethernet card), an optical transceiver, a radio frequency transceiver, a GPS receiver, or any other type of device that can send and/or receive information. Other examples of communication units 64 may include short wave radios, cellular data radios, wireless network radios, as well as universal serial bus (USB) controllers. Communication units 64 may be configured to operate in accordance with a wireless protocol, such as WiFi®, Bluetooth®, LTE, or ZigBee®.


One or more input components 62 of node device 12 may receive input. Examples of input are tactile, audio, and video input. Input components 62 of node device 12, in one example, include a presence-sensitive input device (e.g., a touch sensitive screen, a PSD), mouse, keyboard, voice responsive system, video camera, microphone or any other type of device for detecting input from a human or machine. In some examples, input components 62 may include one or more sensor components, such as one or more location sensors (GPS components, Wi-Fi components, cellular components), one or more temperature sensors, one or more movement sensors (e.g., accelerometers, gyros), one or more pressure sensors (e.g., barometer), one or more ambient light sensors, and one or more other sensors (e.g., microphone, camera, infrared proximity sensor, hygrometer, and the like). Other sensors may include a heart rate sensor, magnetometer, glucose sensor, hygrometer sensor, olfactory sensor, compass sensor, and a step counter sensor, to name a few other non-limiting examples.


One or more output components 66 of node device 12 may generate output. Examples of output are tactile, audio, and video output. Output components 66 of node device 12, in one example, includes a PSD, sound card, video graphics adapter card, speaker, cathode ray tube (CRT) monitor, liquid crystal display (LCD), or any other type of device for generating output to a human or machine.


One or more processors 60 may implement functionality and/or execute instructions associated with node device 12. Examples of processors 60 include application processors, display controllers, auxiliary processors, one or more sensor hubs, and any other hardware configured to function as a processor, a processing unit, or a processing device. Module 16 may be operable by processors 60 to perform various actions, operations, or functions of node device 12. For example, processors 60 of node device 12 may retrieve and execute instructions stored by storage components 68 that cause processors 60 to perform the operations of module 16. The instructions, when executed by processors 60, may cause node device 12 to store information within storage components 68.


One or more storage components 68 within node device 12 may store information for processing during operation of node device 12 (e.g., node device 12 may store data accessed by module 16 during execution at node device 12). In some examples, storage component 68 is a temporary memory, meaning that a primary purpose of storage component 68 is not long-term storage. Storage components 68 on node device 12 may be configured for short-term storage of information as volatile memory and therefore do not retain stored contents if powered off. Examples of volatile memories include random-access memories (RAM), dynamic random-access memories (DRAM), static random-access memories (SRAM), and other forms of volatile memories known in the art.


Storage components 68, in some examples, also include one or more computer-readable storage media. Storage components 68 in some examples include one or more non-transitory computer-readable storage mediums. Storage components 68 may be configured to store larger amounts of information than typically stored by volatile memory. Storage components 68 may further be configured for long-term storage of information as non-volatile memory space and retain information after power on/off cycles. Examples of non-volatile memories include magnetic hard discs, optical discs, floppy discs, flash memories, or forms of electrically programmable memories (EPROM) or electrically erasable and programmable (EEPROM) memories. Storage components 68 may store program instructions and/or information (e.g., data) associated with module 16, and may include a memory configured to store data or other information associated with module 16.


In accordance with the techniques described herein, node device 12 may be a helper node within a distributed storage system that includes a plurality of nodes as described throughout this disclosure. A total number of nodes in the distributed storage system is represented by n, a file stored in the distributed storage system is recovered from a subset of a number of nodes represented by k upon a file failure on a node in the distributed storage system, and a failed node in the plurality of nodes is recovered from a number of helper nodes of the plurality of nodes represented by d.


Some entity in the distributed storage system may detect a failure in a first node of the distributed storage system. Upon this detection, failure module 16, for a particular mode of a determinant code, the particular mode represented by m, determines a repair-encoder matrix having a number of rows based on a number of m-element subsets of d and having a number of columns based on a number of m−1-element subsets of d.


Failure module 16 multiplies a content matrix for the respective helper node by the repair-encoder matrix to obtain a repair matrix. A maximum number represented by b columns of the repair matrix are linearly independent, where b is based on a number of m−1-element subsets of d−1. Failure module 16 extracts each linearly independent column of the repair matrix and sends, using communication unit 64, the linearly independent columns of the repair matrix to the first node.


In some examples, the content matrix is a product of an encoder matrix and a message matrix. The encoder matrix is common to each of the helper nodes, and the message matrix is unique for every helper node. The encoder matrix may be a fixed matrix having n rows and d columns, and the encoder matrix may be maximum-distance-separable. The message matrix may have d rows, and one or more entries of the message matrix may be one or more of a source symbol or a parity symbol.
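The encoder/message structure described above can be sketched with a Vandermonde matrix, which is one standard way to obtain a maximum-distance-separable encoder; the small prime field GF(13), the toy parameters, and the concrete message values are all illustrative assumptions, not the disclosure's construction.

```python
# Sketch: node contents as rows of Psi · M over a small prime field GF(p).
# Psi is a fixed n x d Vandermonde matrix (distinct evaluation points make
# every d x d submatrix invertible, hence MDS). M is a message matrix with
# d rows whose entries are source or parity symbols.
p = 13          # illustrative prime field size
n, d = 6, 3     # illustrative system parameters

# Row i of Psi is (1, x_i, x_i^2, ...) for distinct nonzero x_i = i + 1.
psi = [[pow(i + 1, j, p) for j in range(d)] for i in range(n)]

# Toy message matrix: d rows of symbols.
M = [[1, 2], [3, 4], [5, 6]]

def node_content(i):
    # Node i stores row i of Psi · M (mod p), i.e., a fixed linear
    # combination of the message rows determined by its encoder row.
    return [sum(psi[i][r] * M[r][c] for r in range(d)) % p
            for c in range(len(M[0]))]

print(node_content(0))  # row 0 of Psi is (1, 1, 1): column sums mod 13
```

Because Psi is common to all nodes while M carries the file's symbols, any k node contents suffice to invert the encoding and recover the file, matching the data-collection property stated above.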


In some examples, the number of rows in the repair-encoder matrix may be equal to the number of m-element subsets of d and the number of columns in the repair-encoder matrix may be equal to the number of m−1-element subsets of d.


In some examples, b is equal to a number of m−1-element subsets of d−1.
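Reading "a number of m-element subsets of d" as the binomial coefficient C(d, m), the repair-encoder dimensions and the repair-column count b can be sketched as below; this combinatorial reading is an assumption drawn from the surrounding text.

```python
# Sketch: dimensions of the mode-m repair-encoder matrix and the count b
# of linearly independent repair columns, assuming "number of m-element
# subsets of d" means the binomial coefficient C(d, m).
from math import comb

def repair_encoder_shape(d, m):
    rows = comb(d, m)        # m-element subsets of a d-element set
    cols = comb(d, m - 1)    # (m-1)-element subsets of a d-element set
    return rows, cols

def repair_bandwidth_b(d, m):
    # Maximum number of linearly independent columns of the repair matrix.
    return comb(d - 1, m - 1)

# Example: d = 4 helpers, mode m = 2.
print(repair_encoder_shape(4, 2))  # (6, 4)
print(repair_bandwidth_b(4, 2))    # 3
```

Note that b = C(d−1, m−1) is strictly smaller than the column count C(d, m−1) whenever m ≤ d, which is why transmitting only the independent columns saves repair bandwidth.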


In some examples, in sending the linearly independent columns of the repair matrix to the first node, failure module 16 may send only the linearly independent columns of the repair matrix to the first node, where the linearly independent columns of the repair matrix may form a repair space of the first node.
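Extracting a maximal set of linearly independent columns can be done with Gaussian elimination over the working field; the routine and the toy matrix below are an illustrative sketch over an assumed prime field GF(13), not the disclosure's specific procedure.

```python
# Sketch: extract linearly independent columns of a repair matrix over
# GF(p) by Gaussian elimination, keeping only the pivot columns — the
# "repair space" actually transmitted to the failed node.
p = 13

def independent_columns(R):
    rows, cols = len(R), len(R[0])
    A = [row[:] for row in R]           # working copy for elimination
    pivots, r = [], 0
    for c in range(cols):
        # Find a pivot in column c at or below row r.
        pr = next((i for i in range(r, rows) if A[i][c] % p), None)
        if pr is None:
            continue                     # column depends on earlier ones
        A[r], A[pr] = A[pr], A[r]
        inv = pow(A[r][c], p - 2, p)     # field inverse via Fermat
        A[r] = [x * inv % p for x in A[r]]
        for i in range(rows):
            if i != r and A[i][c] % p:
                f = A[i][c]
                A[i] = [(x - f * y) % p for x, y in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
    # Return the original (unreduced) pivot columns.
    return [[row[c] for c in pivots] for row in R]

# Toy repair matrix: third column = first + second (mod 13).
R = [[1, 0, 1], [0, 1, 1], [2, 3, 5]]
print(independent_columns(R))   # keeps the first two columns
```

The surviving columns span the repair space sent to the failed node; dependent columns carry no new information and are dropped.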


In some examples, each linearly independent column may be a repair-data vector.


In some examples, in detecting the failure in the first node, failure module 16 may detect a failure in each of a group of two or more nodes in the plurality of nodes, where the group of two or more nodes includes the first node. In such examples, in sending the linearly independent columns of the repair matrix to the first node, failure module 16 may send the linearly independent columns of the repair matrix to each of the group of two or more nodes concurrently.


In some examples, an achievable trade-off of the distributed storage system is independent of n. As the number of nodes n in the system increases, the number of problem constraints grows quickly; the determinant codes introduced in the techniques described herein do not have such limitations. By adding more nodes to the system, the system becomes robust against a higher number of node failures.


In some examples, in multiplying the content matrix for the respective helper node by the repair-encoder matrix to obtain the repair matrix, failure module 16 may perform a linear multiplication. This is a desirable property from a practical perspective, as it provides computationally feasible encoding/decoding as well as straightforward system maintenance.


In some examples, the determinant code may be optimum for d=k, in the sense that the achievable trade-off matches a known lower bound for linear exact-repair regenerating codes. As a consequence, the optimum linear trade-off for regenerating codes with d=k is fully characterized.


In some examples, a required mathematical field size for the determinant code may be linear. With a larger field size, a larger number of files must be grouped together and encoded, which reduces the flexibility of the system design.


In some examples, the linearly independent columns of the repair matrix for each helper node may be independent of every other helper node. That is, the data sent from each helper node depends only on the identity of that helper node and the failed nodes, but is independent of the identity of the other helper nodes participating in the repair process.


In some examples, failure module 16 may further concatenate a plurality of determinant codes to construct a merged determinant code, where each of the plurality of determinant codes is a d=k determinant code, and where the merged determinant code is a d>=k determinant code.


In some examples, a number of code parameters for the determinant code may be less than or equal to (d−k+1)k, which is independent of the number of the parity nodes.


While described as being performed by node device 12, the techniques described above would be performed by each of the d helper nodes upon detection of the node failure.



FIG. 10 is a flow diagram of one or more techniques of the current disclosure. The operations of FIG. 10 may be performed by one or more processors of a computing device, such as node device 12 of FIG. 9. For purposes of illustration only, FIG. 10 is described below within the context of node device 12 of FIG. 9.


In accordance with the techniques described herein, node device 12 detects (130) a failure in a first node of a distributed storage system comprising a plurality of nodes. A total number of nodes in the distributed storage system is represented by n, a file stored in the distributed storage system is recovered from a subset of a number of nodes represented by k upon a file failure on a node in the distributed storage system, and a failed node in the plurality of nodes is recovered from a number of helper nodes of the plurality of nodes represented by d. Node device 12 in the distributed storage system, for a particular mode of a determinant code, the particular mode represented by m, determines (132) a repair-encoder matrix having a number of rows based on a number of m-element subsets of d and having a number of columns based on a number of m−1-element subsets of d. Node device 12 multiplies (134) a content matrix for the respective helper node by the repair-encoder matrix to obtain a repair matrix, wherein a maximum number represented by b columns of the repair matrix are linearly independent, wherein b is based on a number of m−1-element subsets of d−1. Node device 12 extracts (136) each linearly independent column of the repair matrix. Node device 12 sends (138) the linearly independent columns of the repair matrix to the first node.


The following numbered examples demonstrate one or more aspects of the disclosure.


Example 1. A distributed storage system comprising: a plurality of nodes comprising a first node and a number of helper nodes, wherein a total number of nodes in the distributed storage system is represented by n, wherein a file stored in the distributed storage system is recovered from a subset of a number of nodes represented by k upon a file failure on a node in the distributed storage system, and wherein a failed node in the plurality of nodes is recovered from the number of helper nodes of the plurality of nodes represented by d, wherein upon detecting a failure in the first node, each helper node of the number of helper nodes is configured to: for a particular mode of a determinant code, the particular mode represented by m, determine, by the respective helper node, a repair-encoder matrix having a number of rows based on a number of m-element subsets of d and having a number of columns based on a number of m−1-element subsets of d; multiply, by the respective helper node, a content matrix for the respective helper node by the repair-encoder matrix to obtain a repair matrix, wherein a maximum number represented by b columns of the repair matrix are linearly independent, wherein b is based on a number of m−1-element subsets of d−1; extract, by the respective helper node, each linearly independent column of the repair matrix; and send, by the respective helper node, the linearly independent columns of the repair matrix to the first node.


Example 2. The distributed storage system of example 1, wherein the content matrix comprises a product of an encoder matrix and a message matrix.


Example 3. The distributed storage system of example 2, wherein the encoder matrix is common to each of the helper nodes, and wherein the message matrix is unique for every helper node.


Example 4. The distributed storage system of any of examples 2-3, wherein the encoder matrix comprises a fixed matrix having n rows and d columns, and wherein the encoder matrix is maximum-distance-separable.


Example 5. The distributed storage system of any of examples 2-4, wherein the message matrix has d rows, and wherein one or more entries of the message matrix comprise one or more of a source symbol or a parity symbol.


Example 6. The distributed storage system of any of examples 1-5, wherein the number of rows in the repair-encoder matrix is equal to the number of m-element subsets of d and the number of columns in the repair-encoder matrix is equal to the number of m−1-element subsets of d.


Example 7. The distributed storage system of any of examples 1-6, wherein b is equal to a number of m−1-element subsets of d−1.


Example 8. The distributed storage system of any of examples 1-7, wherein to send the linearly independent columns of the repair matrix to the first node, the respective helper node is configured to send only the linearly independent columns of the repair matrix to the first node, wherein the linearly independent columns of the repair matrix form a repair space of the first node.


Example 9. The distributed storage system of any of examples 1-8, wherein each linearly independent column comprises a repair-data vector.


Example 10. The distributed storage system of any of examples 1-9, wherein to detect the failure in the first node, the distributed storage system is configured to detect a failure in each of a group of two or more nodes in the plurality of nodes, wherein the group of two or more nodes includes the first node, and wherein to send the linearly independent columns of the repair matrix to the first node, the respective helper node sends the linearly independent columns of the repair matrix to each of the group of two or more nodes concurrently.




Example 11. The distributed storage system of any of examples 1-10, wherein an achievable trade-off of the distributed storage system is independent of n.


Example 12. The distributed storage system of any of examples 1-11, wherein to multiply the content matrix for the respective helper node by the repair-encoder matrix to obtain the repair matrix, the respective helper node is configured to perform a linear multiplication.


Example 13. The distributed storage system of any of examples 1-12, wherein the determinant code is optimum for d=k.


Example 14. The distributed storage system of any of examples 1-13, wherein a required mathematical field size for the determinant code is linear.


Example 15. The distributed storage system of any of examples 1-14, wherein the linearly independent columns of the repair matrix for each helper node is independent of every other helper node.


Example 16. The distributed storage system of any of examples 1-15, wherein the distributed storage system is further configured to concatenate a plurality of determinant codes to construct a merged determinant code, wherein each of the plurality of determinant codes comprises a d=k determinant code, and wherein the merged determinant code comprises a d>=k determinant code.


Example 17. The distributed storage system of any of examples 1-16, wherein a number of code parameters for the determinant code is less than or equal to (d−k+1)k.


Example 18. A method comprising: upon detecting a failure in a first node of a distributed storage system comprising a plurality of nodes and a number of helper nodes, wherein a total number of nodes in the distributed storage system is represented by n, wherein a file stored in the distributed storage system is recovered from a subset of a number of nodes represented by k upon a file failure on a node in the distributed storage system, and wherein a failed node in the plurality of nodes is recovered from the number of helper nodes of the plurality of nodes represented by d, and for each helper node in the distributed storage system: for a particular mode of a determinant code, the particular mode represented by m, determining, by the respective helper node, a repair-encoder matrix having a number of rows based on a number of m-element subsets of d and having a number of columns based on a number of m−1-element subsets of d; multiplying, by the respective helper node, a content matrix for the respective helper node by the repair-encoder matrix to obtain a repair matrix, wherein a maximum number represented by b columns of the repair matrix are linearly independent, wherein b is based on a number of m−1-element subsets of d−1; extracting, by the respective helper node, each linearly independent column of the repair matrix; and sending, by the respective helper node, the linearly independent columns of the repair matrix to the first node.


Example 19. The method of example 18, wherein the content matrix comprises a product of an encoder matrix and a message matrix.


Example 20. A system comprising: at least one processor; and a computer-readable storage medium storing instructions that are executable by the at least one processor to: detect a failure in a first node of a plurality of nodes in a distributed storage system, wherein a total number of the plurality of nodes in the distributed storage system is represented by n, wherein a file stored in the distributed storage system is recovered from a subset of a number of nodes represented by k upon a file failure on a node in the distributed storage system, and wherein a failed node in the plurality of nodes is recovered from a number of helper nodes of the plurality of nodes represented by d; for a particular mode of a determinant code, the particular mode represented by m, determine, by a helper node, a repair-encoder matrix having a number of rows based on a number of m-element subsets of d and having a number of columns based on a number of m−1-element subsets of d; multiply, by the helper node, a content matrix for the helper node by the repair-encoder matrix to obtain a repair matrix, wherein a maximum number represented by b columns of the repair matrix are linearly independent, wherein b is based on a number of m−1-element subsets of d−1; extract, by the helper node, each linearly independent column of the repair matrix; and send, by the helper node, the linearly independent columns of the repair matrix to the first node.


In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which corresponds to a tangible medium such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol. In this manner, computer-readable media generally may correspond to (1) tangible computer-readable storage media which is non-transitory or (2) a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure. A computer program product may include a computer-readable medium.


By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage, or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It may be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory, tangible storage media. Disk and disc, as used herein, includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.


Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application specific integrated circuits (ASICs), field programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term “processor,” as used herein may refer to any of the foregoing structure or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.


The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC) or a set of ICs (e.g., a chip set). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.


Various examples of the disclosure have been described. Any combination of the described systems, operations, or functions is contemplated. These and other examples are within the scope of the following claims.

Claims
  • 1. A distributed storage system comprising: a plurality of nodes comprising a first node and a number of helper nodes, wherein a total number of nodes in the distributed storage system is represented by n, wherein a file stored in the distributed storage system is recovered from a subset of a number of nodes represented by k upon a file failure on a node in the distributed storage system, and wherein a failed node in the plurality of nodes is recovered from the number of helper nodes of the plurality of nodes represented by d, wherein upon detecting a failure in the first node, each helper node of the number of helper nodes is configured to: for a particular mode of a determinant code, the particular mode represented by m, determine, by the respective helper node, a repair-encoder matrix having a number of rows based on a number of m-element subsets of d and having a number of columns based on a number of m−1-element subsets of d; multiply, by the respective helper node, a content matrix for the respective helper node by the repair-encoder matrix to obtain a repair matrix, wherein a maximum number represented by b columns of the repair matrix are linearly independent, wherein b is based on a number of m−1-element subsets of d−1; extract, by the respective helper node, each linearly independent column of the repair matrix; and send, by the respective helper node, the linearly independent columns of the repair matrix to the first node.
  • 2. The distributed storage system of claim 1, wherein the content matrix comprises a product of an encoder matrix and a message matrix.
  • 3. The distributed storage system of claim 2, wherein the encoder matrix is common to each of the helper nodes, and wherein the message matrix is unique for every helper node.
  • 4. The distributed storage system of claim 2, wherein the encoder matrix comprises a fixed matrix having n rows and d columns, and wherein the encoder matrix is maximum-distance-separable.
  • 5. The distributed storage system of claim 2, wherein the message matrix has d rows, and wherein one or more entries of the message matrix comprise one or more of a source symbol or a parity symbol.
  • 6. The distributed storage system of claim 1, wherein the number of rows in the repair-encoder matrix is equal to the number of m-element subsets of d and the number of columns in the repair-encoder matrix is equal to the number of m−1-element subsets of d.
  • 7. The distributed storage system of claim 1, wherein b is equal to a number of m−1-element subsets of d−1.
  • 8. The distributed storage system of claim 1, wherein to send the linearly independent columns of the repair matrix to the first node, the respective helper node is configured to send only the linearly independent columns of the repair matrix to the first node, wherein the linearly independent columns of the repair matrix form a repair space of the first node.
  • 9. The distributed storage system of claim 1, wherein each linearly independent column comprises a repair-data vector.
  • 10. The distributed storage system of claim 1, wherein to detect the failure in the first node, the distributed storage system is configured to detect a failure in each of a group of two or more nodes in the plurality of nodes, wherein the group of two or more nodes includes the first node, and wherein to send the linearly independent columns of the repair matrix to the first node, the respective helper node sends the linearly independent columns of the repair matrix to each of the group of two or more nodes concurrently.
  • 11. The distributed storage system of claim 1, wherein an achievable trade-off of the distributed storage system is independent of n.
  • 12. The distributed storage system of claim 1, wherein to multiply the content matrix for the respective helper node by the repair-encoder matrix to obtain the repair matrix, the respective helper node is configured to perform a linear multiplication.
  • 13. The distributed storage system of claim 1, wherein the determinant code is optimum for d=k.
  • 14. The distributed storage system of claim 1, wherein a required mathematical field size for the determinant code is linear.
  • 15. The distributed storage system of claim 1, wherein the linearly independent columns of the repair matrix for each helper node is independent of every other helper node.
  • 16. The distributed storage system of claim 1, wherein the distributed storage system is further configured to: concatenate a plurality of determinant codes to construct a merged determinant code, wherein each of the plurality of determinant codes comprises a d=k determinant code, and wherein the merged determinant code comprises a d>=k determinant code.
  • 17. The distributed storage system of claim 1, wherein a number of code parameters for the determinant code is less than or equal to (d−k+1)k.
  • 18. A method comprising: upon detecting a failure in a first node of a distributed storage system comprising a plurality of nodes and a number of helper nodes, wherein a total number of nodes in the distributed storage system is represented by n, wherein a file stored in the distributed storage system is recovered from a subset of a number of nodes represented by k upon a file failure on a node in the distributed storage system, and wherein a failed node in the plurality of nodes is recovered from the number of helper nodes of the plurality of nodes represented by d, and for each helper node in the distributed storage system: for a particular mode of a determinant code, the particular mode represented by m, determining, by the respective helper node, a repair-encoder matrix having a number of rows based on a number of m-element subsets of d and having a number of columns based on a number of m−1-element subsets of d; multiplying, by the respective helper node, a content matrix for the respective helper node by the repair-encoder matrix to obtain a repair matrix, wherein a maximum number represented by b columns of the repair matrix are linearly independent, wherein b is based on a number of m−1-element subsets of d−1; extracting, by the respective helper node, each linearly independent column of the repair matrix; and sending, by the respective helper node, the linearly independent columns of the repair matrix to the first node.
  • 19. The method of claim 18, wherein the content matrix comprises a product of an encoder matrix and a message matrix.
  • 20. A system comprising: at least one processor; and a computer-readable storage medium storing instructions that are executable by the at least one processor to: detect a failure in a first node of a plurality of nodes in a distributed storage system, wherein a total number of the plurality of nodes in the distributed storage system is represented by n, wherein a file stored in the distributed storage system is recovered from a subset of a number of nodes represented by k upon a file failure on a node in the distributed storage system, and wherein a failed node in the plurality of nodes is recovered from a number of helper nodes of the plurality of nodes represented by d; for a particular mode of a determinant code, the particular mode represented by m, determine, by a helper node, a repair-encoder matrix having a number of rows based on a number of m-element subsets of d and having a number of columns based on a number of m−1-element subsets of d; multiply, by the helper node, a content matrix for the helper node by the repair-encoder matrix to obtain a repair matrix, wherein a maximum number represented by b columns of the repair matrix are linearly independent, wherein b is based on a number of m−1-element subsets of d−1; extract, by the helper node, each linearly independent column of the repair matrix; and send, by the helper node, the linearly independent columns of the repair matrix to the first node.
Parent Case Info

This application claims the benefit of U.S. Provisional Patent Application No. 62/863,780, filed on Jun. 19, 2019, the entire content of which is incorporated herein by reference.

GOVERNMENT INTEREST

This invention was made with government support under CCF-1617884 awarded by the National Science Foundation. The government has certain rights in the invention.

US Referenced Citations (11)
Number Name Date Kind
8631269 Vinayak et al. Jan 2014 B2
20110289351 Rashmi Nov 2011 A1
20150142863 Yuen May 2015 A1
20150303949 Jafarkhani Oct 2015 A1
20160006463 Li Jan 2016 A1
20160274972 Li Sep 2016 A1
20170046227 Fan Feb 2017 A1
20170063398 Richardson et al. Mar 2017 A1
20170272100 Yanovsky Sep 2017 A1
20190056868 Cabral Feb 2019 A1
20210271552 Zhang Sep 2021 A1
Foreign Referenced Citations (5)
Number Date Country
105721611 Jun 2016 CN
108923960 Nov 2018 CN
109684127 Apr 2019 CN
109828723 May 2019 CN
110212923 Sep 2019 CN
Non-Patent Literature Citations (43)
Entry
Mehran Elyasi and Soheil Mohajer. 2019. Determinant Codes With Helper-Independent Repair for Single and Multiple Failures. IEEE Trans. Inf. Theor. 65, 9 (Sep. 2019), pp. 5469-5483. (Year: 2019).
M. Elyasi and S. Mohajer, “Exact-repair trade-off for (n, k=d-1, d) regenerating codes,” 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2017, pp. 934-941. (Year: 2017).
Elyasi et al., “Cascade Codes for Distributed Storage Systems,” Cornell University, presented in part at the IEEE International Symposium on Information Theory (ISIT), 2018, accessed from https://arxiv.org/abs/1901.00911, uploaded on Jan. 3, 2019, 60 pp.
Elyasi et al., “Determinant Coding: A Novel Framework for Exact-Repair Regenerating Codes,” IEEE Transactions on Information Theory, vol. 62, No. 12, Dec. 2016, pp. 6683-6697.
Elyasi et al., “A Cascade Code Construction for (n,k,d) Distributed Storage Systems,” 2018 IEEE International Symposium on Information Theory (ISIT), Jun. 17, 2018, 5 pp.
Elyasi et al., “Determinant Codes with Helper-Independent Repair for Single and Multiple Failures,” IEEE Transactions on Information Theory (Version 2), vol. 65, Issue 9, Mar. 7, 2019, 25 pp.
Elyasi et al., “A Cascade Code Construction for (n,k,d) Distributed Storage Systems,” Presentation Slides, Department of ECE University of Minnesota, presented in part at the IEEE International Symposium on Information Theory (ISIT), Jun. 19, 2018, 28 pp.
Dimakis et al., “Network Coding for Distributed Storage Systems,” IEEE Transactions on Information Theory, vol. 56, No. 9, Sep. 2010, pp. 4539-4551.
Ahlswede et al., “Network Information Flow,” IEEE Transactions on Information Theory, vol. 46, No. 4, Jul. 2000, pp. 1204-1216.
Rashmi et al., “Explicit Construction of Optimal Exact Regenerating Codes for Distributed Storage,” Proc. 47th Annual Allerton Conference on Communication, Control, and Computing (Allerton), Sep. 30, 2009, pp. 1243-1249.
Rashmi et al., “Optimal Exact-Regenerating Codes for Distributed Storage at the MSR and MBR Points via a Product-Matrix Construction,” IEEE Transactions on Information Theory, vol. 57, No. 8, Aug. 2011, pp. 5227-5237.
Cullina et al., “Searching for Minimum Storage Regenerating Codes,” Cornell University, accessed from https://arxiv.org/abs/0910.2245, Oct. 12, 2009, 10 pp.
Shah et al., “Interference Alignment in Regenerating Codes for Distributed Storage: Necessity and Code Constructions,” IEEE Transactions on Information Theory, vol. 52, No. 4, Apr. 2012, pp. 2134-2158.
Suh et al., “Exact-Repair MDS Codes for Distributed Storage Using Interference Alignment,” ResearchGate, IEEE Xplore, 2010 IEEE International Symposium on Information Theory Proceedings (ISIT), Jun. 13-18, 2010, 5 pp.
Lin et al., “A Unified Form of Exact-MSR Codes via Product-Matrix Frameworks,” IEEE Transactions on Information Theory, vol. 61, No. 2, Feb. 2015, pp. 873-886.
Cadambe et al., “Distributed Data Storage with Minimum Storage Regenerating Codes—Exact and Functional Repair are Asymptotically Equally Efficient,” Cornell University, accessed from https://arxiv.org/abs/1004.4299, submitted Apr. 24, 2010, 11 pp.
Suh et al., “On the Existence of Optimal Exact-Repair MDS Codes for Distributed Storage,” Cornell University, accessed from https://arxiv.org/abs/1004.4663, submitted Apr. 26, 2010, 20 pp.
Cadambe et al., “Asymptotic Interference Alignment for Optimal Repair of MDS codes in Distributed Data Storage,” IEEE Transactions on Information Theory, vol. 59, No. 5, pp. 2974-2987.
Duursma, “Outer bounds for exact repair codes,” Cornell University, accessed from https://arxiv.org/abs/1406.4852, submitted Jun. 18, 2014, 14 pp.
Mohajer et al., “New Bounds on the (n,k,d) Storage Systems with Exact Repair,” In Proceedings of the 2015 IEEE International Symposium on Information Theory (ISIT), Jun. 2015, pp. 2056-2060.
Tian, “A Note on the Rate Region of Exact-Repair Regenerating Codes,” Cornell University, accessed from https://arxiv.org/abs/1503.00011, submitted Feb. 27, 2015, 7 pp.
Duursma, “Shortened regenerating codes,” Cornell University, accessed from https://arxiv.org/abs/1505.00178, submitted May 1, 2015, 11 pp.
Goparaju et al., “New Codes and Inner Bounds for Exact Repair in Distributed Storage Systems,” Cornell University, accessed from https://arxiv.org/pdf/1402.2343.pdf, submitted Feb. 11, 2014, 8 pp.
Rashmi et al., “Having Your Cake and Eating It Too: Jointly Optimal Erasure Codes for I/O, Storage, and Network Bandwidth,” USENIX, The Advanced Computing Systems Association, Proceedings of the 13th USENIX Conference on File and Storage Technologies (FAST '15), Feb. 16-19, 2015, 15 pp.
Elyasi et al., “Linear Exact Repair Rate Region of (k+1, k, k) Distributed Storage Systems: A New Approach,” IEEE, 2015 IEEE International Symposium on Information Theory (ISIT), Jun. 14-19, 2015, 5 pp.
Elyasi et al., “A Probabilistic Approach Towards Exact-Repair Regeneration Codes,” IEEE, 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), Sep. 29-Oct. 2, 2015, 9 pp.
Gerami et al., “Repair for Distributed Storage Systems with Erasure Channels,” IEEE, IEEE Transactions on Communications, vol. 64, Issue 4, Apr. 2016, 6 pp.
Gerami et al., “Optimized-Cost Repair in Multi-hop Distributed Storage Systems in the Network Coding,” Cornell University, accessed from https://arxiv.org/abs/1303.6046, submitted Mar. 25, 2013, 13 pp.
Mahdaviani, “Robustness and Flexibility in Coding for Distributed Storage Systems,” Thesis, University of Toronto, Department of Electrical and Computer Engineering, Nov. 2018, 166 pp.
Hanxu, “Low-Complexity Codes for Distributed Storage Systems,” Thesis, Chinese University of Hong Kong, Jun. 2015, 129 pp.
Kleckler et al., “Secure Determinant Codes: A Class of Secure Exact-Repair Regenerating Codes,” IEEE, 2019 IEEE International Symposium on Information Theory (ISIT), Jul. 7-12, 2019, 5 pp.
Elyasi et al., “Determinant Codes with Helper-Independent Repair for Single and Multiple Failures,” Version 1, Cornell University, accessed from https://arxiv.org/abs/1812.01142v1, Dec. 4, 2018, 25 pp.
Elyasi et al., “New Exact-Repair Codes for Distributed Storage Systems Using Matrix Determinant,” 2016 IEEE International Symposium on Information Theory (ISIT), Jul. 2016, pp. 1212-1216.
Elyasi et al., “Linear Exact Repair Rate Region of (k+1, k, k) Distributed Storage Systems: A New Approach,” Proceedings in IEEE International Symposium on Information Theory (ISIT), Jun. 2015, pp. 2061-2065.
Ernvall, “Exact-Regenerating Codes between MBR and MSR Points,” Proceedings in IEEE International Symposium on Information Theory (ISIT), Sep. 2013, pp. 1-5.
Sasidharan et al., “An Improved Outer Bound on the Storage-Repair-Bandwidth Tradeoff of Exact-Repair Regenerating Codes,” 2014 IEEE International Symposium on Information Theory (ISIT), Jun. 2014, pp. 2430-2434.
Mohajer et al., “Exact Repair for Distributed Storage Systems: Partial Characterization via New Bounds,” Proceedings of Information Theory and Applications Workshop (ITA), San Diego, California, Feb. 2015, pp. 130-135.
Prakash et al., “The Storage-Repair-Bandwidth Trade-off of Exact Repair Linear Regenerating Codes for the Case d=k=n-1,” Proceedings in IEEE International Symposium on Information Theory (ISIT), Jun. 2015, pp. 859-863.
Tian et al., “Rate Region of the (4, 3, 3) Exact-Repair Regenerating Codes,” 2013 IEEE International Symposium on Information Theory, Proceedings in IEEE International Symposium on Information Theory (ISIT), Jul. 2013, pp. 1426-1430.
Tian et al., “Layered Exact-Repair Regenerating Codes via Embedded Error Correction and Block Designs,” IEEE Transactions on Information Theory, vol. 61, No. 4, Apr. 2015, pp. 1933-1947.
Wu, “A Construction of Systematic MDS Codes With Minimum Repair Bandwidth,” IEEE Transactions on Information Theory, vol. 57, No. 6, Jun. 2011, pp. 3738-3741.
Bocher et al., “Introduction to Higher Algebra,” The Macmillan Company, New York, NY, USA, 332 pp. (Applicant points out, in accordance with MPEP 609.04(a), that the year of publication, 1907, is sufficiently earlier than the effective U.S. filing date, so that the particular month of publication is not in issue.).
Cullis, “Matrices and Determinoids, vol. 2,” [Abstract only] Cambridge University Press, Cambridge, U.K., retrieved from bookdepository.com/Matrices-Determinoids-2-C-E-Cullis/9781107620834, 3 pp.
Related Publications (1)
Number Date Country
20200409793 A1 Dec 2020 US
Provisional Applications (1)
Number Date Country
62863780 Jun 2019 US