Typically, data centers employ various fault-tolerant data storage techniques in an attempt to provide efficient and reliable storage of large quantities of data. Conventional approaches involve added storage overhead in order to store replicated data and/or redundant data, each of which translates into higher operating costs.
Implementations described herein provide for fault-tolerant storage of data using erasure codes. Maximally recoverable cloud codes, resilient cloud codes, and robust product codes are examples of different erasure codes that can be implemented to encode and store data. Implementing different erasure codes and different parameters within each erasure code can involve trade-offs between reliability, redundancy, and locality. In some examples, an erasure code can specify placement of the encoded data on machines that are organized into racks.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key or essential features of the claimed subject matter; nor is it to be used for determining or limiting the scope of the claimed subject matter.
The detailed description is set forth with reference to the accompanying drawing figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The use of the same reference numbers in different figures indicates similar or identical items or features.
Overview
The technologies described herein are generally directed toward fault-tolerant storage of data. Data can be stored across multiple storage devices (servers, disks, etc.), which are often referred to as “machines.” Machines can be arranged into a row, referred to as a “rack.” Racks, in turn, can be arranged into multiple rows, resulting in a “grid” of machines that includes multiple rows and multiple columns of machines.
Three concepts associated with fault-tolerant data storage techniques are: reliability, redundancy, and locality. Reliability is associated with the types and quantities of failures that can be tolerated by a data storage system. Thus, reliability is based upon the ability to reconstruct data after one or more machines fail or become unavailable. Redundancy is associated with how much redundant data is stored by the data storage system. Smaller amounts of redundant data may be more desirable than larger amounts, since smaller amounts of redundant data use fewer resources. Locality is associated with how many machines must be accessed to recover data after a machine fails. Thus, a lower locality can indicate that less time is required for tasks, such as disk I/O and/or network transfer, when recovering data. Different data storage techniques can involve different trade-offs between reliability, redundancy, and locality.
An erasure “code” can be implemented to store the data. The erasure code encodes the data using a particular type of code and can also specify placement of the encoded data on machines and racks. As described herein, “maximally recoverable cloud codes,” “resilient cloud codes,” and “robust product codes” are different types of erasure codes that can be implemented individually, in pairs, and/or simultaneously.
According to some implementations, the encoded data is stored in s racks, each of which includes m machines. The failure of one or more individual machines is a common failure mode. To recover data that was stored on the failed machine, it may be desirable to access a relatively small number of other machines. The number of machines accessed to recover from a single failure is the locality, r, of the code. In some implementations, d−1 total machine failures can be tolerated. Thus, the data can be recovered, even after d−1 machine failures. The failure of an entire rack of machines may be considered a catastrophic failure mode. The failure of an entire rack can be due to power failure, fire, etc. In this case, it may be desirable to ensure that no data is lost, even when d′ additional machine failures occur before the lost rack is back online. Thus, d′ denotes the number of machine failures that can be tolerated after the loss of a rack.
It is desirable to have a scheme with the above reliability guarantees that is maximally efficient with respect to storage overhead. Since higher storage overhead can translate into higher operating costs, once the encoder module 110 fixes the parameters s, m, r, d and d′, it is desirable to maximize the amount of data that can be stored, while retaining the above reliability guarantees. In some implementations, one or more of the parameters s, m, r, d and d′ are fixed.
Recovering from machine failures alone requires a certain amount of overhead as a function of s, m, r and d. Codes that meet this bound exactly may be referred to as “cloud codes.” As described herein, a code that is additionally tolerant to rack failures may be referred to as a “resilient cloud code.” Some implementations herein provide for a construction of resilient cloud codes for a range of values for the parameters s, m, r, d and d′.
Example Environment
In the illustrated example, the environment 100 includes one or more servers 104, which can each include one or more processors 106 and computer readable media 108. The processor 106 and the computer readable media 108 are described in further detail below.
The environment 100 can include various modules and functional components for performing the functions described herein. In some implementations, the environment 100 can include an encoder module 110 for encoding and storing data 112 in the environment 100, such as encoding and storing data 112 in the data store 102. Furthermore, a recovery module 114 can reconstruct a portion of the data 112 from the data store 102 in response to one or more failures of machines. For example, recovery module 114 can reconstruct a portion of the data 112 in response to one or more machines in data store 102 failing or becoming unavailable. In some examples, the functions performed by the encoder module 110 and the recovery module 114, along with other functions, can be performed by one module. Additional aspects of the encoder module 110 and the recovery module 114 are discussed below. Moreover, the environment 100 can be interconnected to one or more clients 116 and one or more other data centers via a network 118, such as the Internet, in order to access data 112. Furthermore, the one or more servers 104 can be interconnected via an intranet infrastructure (e.g., a local area network 120) within the environment 100.
Maximally Recoverable Cloud Codes
In some implementations, a “maximally recoverable cloud code” provides an error correction guarantee for data failures, called maximum recoverability. Maximum recoverability as used herein means that given all other constraints (e.g., optimal space overhead, locality, and the ability to correct any d−1 simultaneous failures) the code corrects the largest possible number of failure patterns of sizes above d−1. As shown below, by choosing coefficients using algebraic techniques, maximally recoverable cloud codes can be constructed with locality r and distance d for every setting of r and d.
In some implementations, a family of codes can be constructed that encode k data symbols, where k is an integer, into n=k+k/r+d−2 symbols such that the resulting code has locality r for every data symbol and distance d. Data symbols can be partitioned into k/r=g data groups G1 through Gg. A group parity symbol can be stored for each of these groups as an additional element or member of each group. Additionally, d−2 “global” parity symbols can be stored. As used herein, “d−2” will be represented as “y.” In some embodiments, each of the global parity symbols depends on all k data symbols. The global parity symbols form the last group Gg+1 (a parity group). The resulting groups are:
G_1 = {X_{1,1}, . . . , X_{1,r}, Z_1}, . . . , G_g = {X_{g,1}, . . . , X_{g,r}, Z_g}, and G_{g+1} = {Y_0, . . . , Y_{y−1}},
where G denotes a group, X denotes a data symbol, Z denotes a group parity symbol, and Y denotes a global parity symbol.
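As an illustration of the resulting symbol count, the following minimal sketch computes the code length n=k+k/r+d−2 for a given setting of k, r and d; the function name and the example values (which match the k=60, r=4, d=6 example discussed later) are illustrative and not part of the patent's implementation.

def code_length(k, r, d):
    # n = k data symbols + k/r group parities + (d - 2) global parities
    assert k % r == 0, "k must split evenly into groups of size r"
    return k + k // r + (d - 2)

# For k=60 data symbols with locality r=4 and distance d=6:
# 60 data + 15 group parities + 4 global parities = 79 symbols.
assert code_length(60, 4, 6) == 79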
In addition to correcting any d−1 failures, such as d−1 simultaneous or contemporaneous failures, a maximally recoverable cloud code can correct the largest possible number of failure patterns of sizes d and higher. In some implementations, the maximally recoverable cloud code corrects failure patterns that comprise a single failure in each data group (e.g., a failure in a data symbol or group parity symbol) and y additional arbitrary failures. In some implementations, recovering from the failures can comprise performing an exclusive-or of a group parity symbol of a group against surviving data symbols of the group to recover a failed data symbol of the group.
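The following is a minimal sketch of that local repair, assuming (as stated above) that a group parity symbol is the exclusive-or of the group's data symbols; the symbol size, the group size, and the helper names are illustrative assumptions rather than the patent's implementation.

import os

def xor_symbols(*symbols):
    # Bytewise exclusive-or of equally sized symbols.
    out = bytearray(len(symbols[0]))
    for sym in symbols:
        for idx, byte in enumerate(sym):
            out[idx] ^= byte
    return bytes(out)

r = 4                                         # locality: data symbols per group
group = [os.urandom(16) for _ in range(r)]    # r data symbols stored on r machines
group_parity = xor_symbols(*group)            # group parity stored on one more machine

# Simulate losing the machine that held group[2], then rebuild its symbol by
# XORing the group parity against the r - 1 surviving data symbols.
lost = 2
survivors = [sym for idx, sym in enumerate(group) if idx != lost]
recovered = xor_symbols(group_parity, *survivors)
assert recovered == group[lost]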
The following describes how global parities can be obtained for maximally recoverable cloud codes, given the data symbols. Data and parity symbols {X_{i,j}} and {Y_p} are treated as elements of a certain finite Galois field of characteristic two. For p=0, . . . , y−1, the p-th global parity is defined by:
Y_p = Σ_{j=1}^{g} Σ_{i=1}^{x} (ω^i λ_j)^{2^p} X_{i,j},
where X denotes a data symbol, Y denotes a global parity symbol, ω denotes a proper element of the field, λ_j denotes a field element associated with group j, x=r denotes the number of data symbols per group, y denotes the number of global parities, and g denotes the number of groups.
It can be verified that the equations above guarantee maximal recoverability. The equations above also allow one to obtain explicit maximally recoverable cloud codes over small finite fields. For example, for k=60, r=4, d=6, one can set a to “4,” b to “4,” and choose the elements {λ_j} from a suitably small binary field.
In some implementations, recovering from failures involves solving y+g equations. The first g equations are, with j=1 to g:
Σ_{i=1}^{x} X_{i,j} = Z_j,
where x denotes the number of data symbols per group, X denotes a data symbol, and Z denotes a group parity symbol. The next y equations are, with p=0 to y−1:
Σ_{j=1}^{g} Σ_{i=1}^{x} (ω^i λ_j)^{2^p} X_{i,j} = Y_p,
where x denotes the number of data symbols per group, X denotes a data symbol, Y denotes a global parity symbol, ω denotes a proper element, λ_j denotes a field element associated with group j, y denotes the number of global parities, and g denotes the number of groups. In some implementations, the group parity symbol for a group is created or generated by performing an exclusive-or of the data symbols of the group.
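As a concrete illustration of how equations of this form can be evaluated in a field of characteristic two, the following sketch computes global parities over the toy field GF(2^4). The field, its primitive polynomial, the choices of ω and λ_j, and the parameter values are all illustrative assumptions for demonstration, not the parameters required by the construction.

import random

POLY = 0b10011  # x^4 + x + 1, a primitive polynomial for the toy field GF(2^4)

def gf_mul(a, b):
    # Carry-less multiplication reduced modulo POLY (addition in the field is XOR).
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= POLY
    return result

def gf_pow(a, e):
    result = 1
    for _ in range(e):
        result = gf_mul(result, a)
    return result

g, x, y = 3, 4, 2            # groups, data symbols per group, global parities (illustrative)
omega = 0b0010               # stand-in for the "proper element" ω
lams = [3, 7, 11]            # one field element per group, assumed distinct

random.seed(0)
X = [[random.randrange(16) for _ in range(x)] for _ in range(g)]  # data symbols as field elements

# Y_p = sum over j and i of (ω^i λ_j)^(2^p) · X[j][i], with the sum taken as XOR.
Y = []
for p in range(y):
    acc = 0
    for j in range(g):
        for i in range(1, x + 1):
            coeff = gf_pow(gf_mul(gf_pow(omega, i), lams[j]), 2 ** p)
            acc ^= gf_mul(coeff, X[j][i - 1])
    Y.append(acc)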
Resilient Cloud Codes
In some implementations, a “resilient cloud code” is similar to the maximally recoverable cloud codes above, but provides additional fault tolerance by using additional parities and specifying a placement strategy for data symbols and parity symbols. In some implementations, k data symbols of the same size or approximately the same size are stored in machines across s racks. In some implementations, the resilient cloud code is based on the maximally recoverable cloud code described above, with r=s−1 and d=r+2=s+1. Data symbols are arranged into g=k/r groups G1 through Gg, and for each group, a parity of the group is also stored with the group. There are also d−2=y=r global parity symbols that form the last group Gg+1. Each global parity symbol depends on all k data symbols. In some embodiments, each global parity symbol depends on one or more of the k data symbols. For resilient cloud codes, the parity of all of the global parities is also stored, which increases by one the number of simultaneous failures that can be tolerated. Furthermore, in some embodiments, the data symbols and parity symbols are stored in rows and columns of the machines. Each data symbol and parity symbol is stored on a different machine (e.g., a different failure domain).
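To make the parameter relationships concrete, the following sketch derives the shape of a resilient cloud code from the number of racks s and the number of data symbols k, using the relations stated above; the function name and the example values are illustrative assumptions.

def resilient_code_shape(s, k):
    # r = s - 1 (locality), d = r + 2 = s + 1, y = r global parities, g = k / r groups.
    r = s - 1
    d = r + 2
    y = r
    assert k % r == 0, "k must split evenly into groups of size r"
    g = k // r
    # k data symbols + g group parities + y global parities + the parity of the global parities.
    total_symbols = k + g + y + 1
    return {"r": r, "d": d, "y": y, "groups": g, "total_symbols": total_symbols}

# Example: five racks and 60 data symbols.
print(resilient_code_shape(s=5, k=60))
# {'r': 4, 'd': 6, 'y': 4, 'groups': 15, 'total_symbols': 80}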
In some implementations, global parities can be obtained in a similar way as for maximally recoverable cloud codes, described above. For p∈{0, . . . , r−1}, the p-th equation is:
Σ_{j=1}^{g} Σ_{i=1}^{x} (ω^i λ_j)^{2^p} X_{i,j} = Y_p,
where X denotes a data symbol, Y denotes a global parity symbol, ω denotes a proper element as described above, λ_j denotes a field element associated with group j, x=r denotes the number of data symbols per group, and g denotes the number of data groups. In some implementations, x=r denotes the number of the columns. In some implementations, g denotes the number of rows.
In some implementations, recovering from failures involves solving g+x+1 equations. The first g equations are, with j=1 to g:
Σ_{i=1}^{x} X_{i,j} = Z_j,
where x denotes the number of columns, X denotes a data symbol, and Z denotes a group parity symbol, where a group parity symbol for a group can be obtained by performing an exclusive-or of the data symbols for the group. The g+1 equation is:
Σ_{p=0}^{x−1} Y_p = Z_{g+1},
where Y denotes a global parity symbol and Z_{g+1} is the parity of the global parity symbols, which can be obtained by performing an exclusive-or of the global parity symbols. The next x equations are, with p=0 to x−1:
Σ_{j=1}^{g} Σ_{i=1}^{x} (ω^i λ_j)^{2^p} X_{i,j} = Y_p,
where X denotes a data symbol, Y denotes a global parity symbol, ω denotes a proper element, λ_j denotes a field element associated with group j, x denotes the number of columns, and g denotes the number of rows. In some implementations, the group parity symbol for a group is created or generated by performing an exclusive-or of the data symbols of the group.
In some implementations, a placement strategy is also followed. The placement strategy specifies how to optimally place data symbols and parity symbols across machines and racks. In some implementations, the placement strategy should satisfy the following two constraints: 1) no two symbols reside on the same machine; and 2) for each group Gi, i∈{1, . . . , g+1}, the r symbols in the group and the group parity should reside on different racks.
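One placement that satisfies both constraints is sketched below, assuming that racks are indexed 0 through s−1, that each rack contains at least as many machines as there are groups, and that each group (including the global parity group) is assigned its own machine position within every rack; the function and the example sizes are illustrative and not necessarily the placement used in practice.

def place_symbols(num_groups, symbols_per_group):
    # Maps (group, position-within-group) -> (rack, machine-within-rack).
    # Group j always uses machine slot j, and its t-th symbol goes to rack t,
    # so a group's symbols land on distinct racks and distinct machines.
    return {(j, t): (t, j)
            for j in range(num_groups)
            for t in range(symbols_per_group)}

s = 5                          # racks; each group holds s = r + 1 symbols
num_groups = 16                # g data groups plus the global parity group (illustrative)
placement = place_symbols(num_groups, s)

# Constraint 1: no two symbols share a (rack, machine) pair.
assert len(set(placement.values())) == len(placement)
# Constraint 2: within every group, all symbols sit on different racks.
for j in range(num_groups):
    assert len({placement[(j, t)][0] for t in range(s)}) == s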
The choice of coefficients as discussed above for the maximally recoverable cloud codes, as well as the above placement strategy, yields explicit codes over small fields that exhibit optimal tradeoffs between locality, reliability, and redundancy, even in the scenario of losing an entire rack. In particular, the following three guarantees are obtained: 1) if any one machine is unavailable, the data on it can be reconstructed by accessing r other machines; 2) the code tolerates any d simultaneous machine failures; and 3) the code tolerates any y simultaneous machine failures after losing an entire rack.
In some implementations, encoder module 110 generates a number of global parity symbols that is equal to the number of the columns, in order to form the global parity group. In some implementations, encoder module 110 generates a group parity symbol by performing an exclusive-or of the data symbols of the corresponding data group. In some implementations, encoder module 110 generates the parity of all of the global parities by performing an exclusive-or of the global parity symbols.
In some implementations, resilient cloud codes allow recovery from failures that comprise up to one failure in each of the data groups and in the global parity group. Furthermore, recovering from failures can comprise performing an exclusive-or of a group parity symbol of a data group against surviving data symbols of the data group to recover a failed data symbol of the group. In some implementations, recovering from failures may comprise performing an exclusive-or of the second global parity symbol against surviving first global parity symbols of the global parity group to recover a failed first global parity symbol of the global parity group.
Robust Product Codes
Product encoding is a type of erasure encoding that can provide good locality and low redundancy. For a basic product code, data symbols (also referred to as “data chunks”) from a single stripe are arranged in an A-by-B grid of data symbols. A stripe can be a sequence of data, such as a file, that is logically segmented such that consecutive segments are stored on different physical storage devices. R parity symbols (also referred to as “parity chunks”) are generated for every row using a predefined erasure code (e.g., a Reed-Solomon code or other erasure code). A parity symbol is generated for every column. The resulting (A+1)*(B+R) symbols are distributed across (A+1)*(B+R) different machines. Data symbols and parity symbols can have internal checksums to detect corruption, so upon decoding, bad symbols can be identified. A missing data or parity symbol can be recovered from the symbols in the same row or the symbols in the same column, assuming a sufficient number of the symbols are available. Different values of A, B and R can provide tradeoffs between reliability, availability, and space overhead.
In some implementations, a “robust product code” provides a more reliable product code by using different codes to encode different rows. With an appropriate choice of row-codes and column-codes, robust product codes correct all patterns of failures that are correctable by basic product codes, and many more.
A data storage system that uses a robust product code partitions stripes into chunks of data of roughly equal size (data symbols), encodes each stripe separately using the robust product code, and stores the data symbols of a single stripe in different failure domains (e.g., different machines, storage devices, etc.) to ensure independence of failures. Thus, each data symbol is stored on a different failure domain than each of the other data symbols.
At the encoding stage, data symbols {X_{i,j}} are arranged in an A-by-B grid (where A and B are fixed integers). The data symbols can be considered elements of a finite Galois field. Then, R parity symbols are generated for every row. Each row parity symbol is a linear combination of the corresponding data symbols with coefficients. Thus, the j-th parity symbol for the i-th row is given by P_{i,B+j} = Σ_{s=1}^{B} α_{i,j,s} X_{i,s}. Different coefficients {α_{i,j,s}} are used to define parities in different rows. Finally, one parity chunk is generated for each of the (B+R) columns. Each column parity chunk is a simple sum of the corresponding column chunks.
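The following sketch illustrates this encoding step over the toy field GF(2^8); the polynomial, the randomly drawn per-row coefficients, and the grid dimensions are illustrative assumptions, and a real deployment would choose the coefficients as described below rather than at random.

import random

def gf256_mul(a, b):
    # Multiplication in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B).
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return result

A, B, R = 3, 5, 2
random.seed(1)
X = [[random.randrange(256) for _ in range(B)] for _ in range(A)]   # A-by-B data grid

# Row parities: P[i][j] = sum over s of alpha[i][j][s] * X[i][s], with a different
# set of coefficients for every row, so different rows are protected by different codes.
alpha = [[[random.randrange(1, 256) for _ in range(B)] for _ in range(R)] for _ in range(A)]
P = [[0] * R for _ in range(A)]
for i in range(A):
    for j in range(R):
        for s in range(B):
            P[i][j] ^= gf256_mul(alpha[i][j][s], X[i][s])

# Column parities: one per column of the extended A-by-(B+R) grid. In a field of
# characteristic two, the "simple sum" of a column is the XOR of its entries.
extended = [X[i] + P[i] for i in range(A)]
column_parity = [0] * (B + R)
for col in range(B + R):
    for i in range(A):
        column_parity[col] ^= extended[i][col]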
Properties of the robust product code are governed by the choice of coefficients {αi,j,s} used to define the parity symbols. If one chooses these coefficients generically from a sufficiently large finite field, the following properties hold: 1) the robust product code corrects all patterns of simultaneous failures that are correctable by the basic product code with the same parameters (A, B, and R); 2) unlike the basic product code, the robust product code corrects all 2-by-(R+1) patterns of failures that do not involve the parity row; and 3) the robust product code corrects many other patterns of simultaneous failures that are uncorrectable by the basic product code.
There are many ways to fix coefficients {α_{i,j,s}} explicitly in a small field to get a subset of these three properties. The following is an example of choosing coefficients, assuming R=2. Let F be a finite field of characteristic two whose size is 2A+B or more. Let {a_j}1≤j≤B+2, {b_i}1≤i≤A and {c_i}1≤i≤A be some arbitrary 2A+B+2 distinct elements of F.
Every column j≤B+2 is protected by a single parity symbol X_{A+1,j} = Σ_{i=1}^{A} X_{i,j}. The code specified above has the following four properties: 1) each column j≤B+2 corrects a single erasure; 2) each row i≤A corrects up to two erasures (this does not apply to the last row); 3) every 2×4 pattern of erasures that does not involve the bottom row is correctable; and 4) every 2×3 pattern of erasures that does not involve the bottom row is correctable. Thus, in some implementations, the robust product code corrects a failure pattern with a width of up to a number of the row parity symbols per row plus one and a height of two. In some implementations, the failure pattern does not involve the column parity row (e.g., the bottom row in the above example). The improvement in reliability provided by robust product codes can be particularly important when the number of failure domains in a data center cluster is smaller than the number of chunks per stripe, where a failure domain is a group of machines, since losing multiple chunks per stripe becomes much more probable.
In some implementations, R>2. Thus, in some implementations, generating row parity symbols comprises obtaining coefficients of the row parity symbols by solving A·R equations, one for each pair (i, k) with i from 1 to A and k from 1 to R, wherein X_{i,j} is a data symbol, A is the number of rows, B is the number of columns, R is the number of parities per row, {a_j}1≤j≤B+R is a set of distinct elements of the Galois field F, {b_{i,k}}1≤i≤A,1≤k≤R is another set of distinct elements of the Galois field F, and the a_j and b_{i,k} are distinct so that no two elements are equal.
Example Processes
In the following flow diagrams, each block represents one or more operations that can be implemented in hardware, software, or a combination thereof. In the context of software, the blocks represent computer-executable instructions that, when executed by one or more processors, cause the processors to perform the recited operations. Generally, computer-executable instructions include routines, programs, objects, components, data structures, and the like that perform particular functions or implement particular abstract data types. The order in which the blocks are described is not intended to be construed as a limitation, and any number of the described operations can be combined in any order and/or in parallel to implement the process. While several examples are described herein for explanation purposes, the disclosure is not limited to the specific examples, and can be extended to additional devices, environments, applications and settings. For discussion purposes, the processes below are described with reference to the environment 100 described above.
At step 606, the encoder module 110 determines if there is another data group to process. If the encoder module 110 determines that there is another data group to process, then at step 608, the encoder module 110 generates a group parity symbol. At step 610, the encoder module 110 includes the group parity symbol in the data group. The process then returns to step 606. At step 606, if the encoder module 110 determines that there is not another data group to process, then at step 612, the encoder module 110 stores each of the data groups in a column of the first plurality of machines. In some embodiments, no two symbols are stored on a same machine.
At step 614, the recovery module 114 generates one or more first global parity symbols. In some embodiments, each first global parity symbol is based on the data symbols. In some embodiments, each first global parity symbol is based on all of the data symbols. In other embodiments, each first global parity symbol is based on one or more of the data symbols. At step 616, the recovery module 114 generates a second global parity symbol. In some embodiments, the second global parity symbol is based on all of the first global parity symbols. In other embodiments, the second global parity symbol is based on one or more of the first global parity symbols. At step 616, the recovery module 114 stores the first global parity symbols and the second global parity symbol in a second plurality of machines. In some embodiments, each of the second plurality of machines belongs to one of the plurality of racks. In some embodiments, no two symbols are stored on a same machine.
At step 708, for each row of the grid, the encoder module 110 generates one or more row parity symbols. In some embodiments, each row parity symbol is generated using a code that is different than each code used to generate row parity symbols for other rows of the grid. In some embodiments, each row parity symbol is based on data symbols of the corresponding row. In some embodiments, each row parity symbol is based on all data symbols of the corresponding row. In other embodiments, each row parity symbol is based on one or more data symbols of the corresponding row. At step 710, the encoder module 110 associates each row parity symbol with each corresponding row to form one or more row parity columns of the grid that comprise each row parity symbol. At step 712, for each column of the grid, the encoder module 110 generates a column parity symbol. In some embodiments, the column parity symbol is based on data symbols of the column. At step 714, the encoder module 110 associates each column parity symbol with each corresponding column to form a column parity row of the grid that comprises each column parity symbol. In some implementations, the recovery module 114 corrects a number of failures that is at least equal to a number of the global parity symbols.
Example Computing System
In the illustrated example, the computing device 800 includes one or more processors 106, one or more computer-readable media 108 that includes the encoder module 110 and the recovery module 114, one or more input devices 802, one or more output devices 804, storage 806 and one or more communication connections 808, all able to communicate through a system bus 810 or other suitable connection.
In some implementations, the processor 106 is a microprocessing unit (MPU), a central processing unit (CPU), or other processing unit or component known in the art. Among other capabilities, the processor 106 can be configured to fetch and execute computer-readable processor-accessible instructions stored in the computer-readable media 108 or other computer-readable storage media. Communication connections 808 allow the device to communicate with other computing devices, such as over a network 118. These networks can include wired networks as well as wireless networks.
As used herein, “computer-readable media” includes computer storage media and communication media. Computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer-readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, random access memory (RAM), read only memory (ROM), electrically erasable programmable ROM (EEPROM), flash memory or other memory technology, compact disk ROM (CD-ROM), digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store information for access by a computing device.
In contrast, communication media can embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave. As defined herein, computer storage media does not include communication media.
Computer-readable media 108 can include various modules and functional components for enabling the computing device 800 to perform the functions described herein. In some implementations, computer-readable media 108 can include the encoder module 110 for performing erasure coded data storage and operations related to erasure coded data storage. For example, the encoder module 110 can perform erasure coded data storage for data store 102 using a maximally recoverable cloud code, a resilient cloud code or a robust product code. In response to detecting a failure of one or more machines in the data store 102, the recovery module 114 can reconstruct data that resided on the failed one or more machines in the data store 102. The encoder module 110 and/or the recovery module 114 can include a plurality of processor-executable instructions, which can comprise a single module of instructions or which can be divided into any number of modules of instructions. Such instructions can further include, for example, drivers for hardware components of the computing device 800.
The encoder module 110 and/or the recovery module 114 can be entirely or partially implemented on the computing device 800.
Computer-readable media 108 or other machine-readable storage media stores one or more sets of instructions (e.g., software) embodying any one or more of the methodologies or functions described herein. The instructions can also reside, completely or at least partially, within the computer-readable media 108 and within processor 106 during execution thereof by the computing device 800. The program code can be stored in one or more computer-readable memory devices or other computer-readable storage devices, such as computer-readable media 108. Further, while an example device configuration and architecture has been described, other implementations are not limited to the particular configuration and architecture described herein. Thus, this disclosure can extend to other implementations, as would be known or as would become known to those skilled in the art.
The example environments, systems and computing devices described herein are merely examples suitable for some implementations and are not intended to suggest any limitation as to the scope of use or functionality of the environments, architectures and frameworks that can implement the processes, components and features described herein. Thus, implementations herein are operational with numerous environments or architectures, and can be implemented in general purpose and special-purpose computing systems, or other devices having processing capability. Generally, any of the functions described with reference to the figures can be implemented using software, hardware (e.g., fixed logic circuitry) or a combination of these implementations. Thus, the processes, components and modules described herein can be implemented by a computer program product.
Furthermore, this disclosure provides various example implementations, as described and as illustrated in the drawings. However, this disclosure is not limited to the implementations described and illustrated herein, but can extend to other implementations, as would be known or as would become known to those skilled in the art. Reference in the specification to “one example,” “some examples,” “some implementations,” or similar phrases means that a particular feature, structure, or characteristic described is included in at least one implementation, and the appearances of these phrases in various places in the specification are not necessarily all referring to the same implementation.
Although the subject matter has been described in language specific to structural features and/or methodological acts, the subject matter defined in the appended claims is not limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims. This disclosure is intended to cover any and all adaptations or variations of the disclosed implementations, and the following claims should not be construed to be limited to the specific implementations disclosed in the specification. Instead, the scope of this document is to be determined entirely by the following claims, along with the full range of equivalents to which such claims are entitled.