This application claims priority from Singapore Patent Application No. 10201406331V filed on Oct. 3, 2014.
The present invention relates to a storage system and, more specifically, relates to data reconstruction within such a storage system.
Ideally, reconstruction of data from a failed data storage device in a data storage system occurs offline: the storage system stops replying to any client/application server so that the data reconstruction process can run at full speed. However, this scenario is not practical in most production environments, as most storage systems are required to provide uninterrupted data services even while they are recovering from disk failures.
Thus, what is needed is a method and device for data reconstruction which at least partially overcomes the drawbacks of present approaches by providing uninterrupted data services while recovering from disk failures. Furthermore, other desirable features and characteristics will become apparent from the subsequent detailed description and the appended claims, taken in conjunction with the accompanying drawings and this background of the disclosure.
In one aspect of the invention, a method for data reconstruction in a data storage system comprising a plurality of storage devices is provided. The method includes receiving one of a read request and a write request from a server to access data from a failed one of the plurality of storage devices and reconstructing the requested data stored in the failed one of the plurality of storage devices from portions of data stored in one or more available ones of the plurality of storage devices. The method further includes sending the requested data from the reconstructed data back to the server and sending the reconstructed data to a replacement one of the plurality of storage devices. Finally, the method includes updating a reconstruction list to indicate the replacement one of the plurality of storage devices and completion of data reconstruction.
In an additional aspect of the invention, a method for data reconstruction in a cluster of Hybrid Object Storage Devices (HOSDs) when one HOSD has failed wherein the cluster of HOSDs includes a primary HOSD is provided. The method includes identifying data in the failed HOSD which is available in non-volatile memory of the primary HOSD, copying the identified data available in the non-volatile memory of the primary HOSD to a replacement HOSD, and updating a reconstruction list in the primary HOSD to indicate the replacement HOSD and completion of data reconstruction.
In yet an additional aspect of the invention, a method for data reconstruction in a cluster of Hybrid Object Storage Devices (HOSDs) when one HOSD has failed is provided. The method includes computing data in the failed HOSD based on data available in a non-volatile memory of a primary HOSD, writing the computed data to a replacement HOSD, and updating a reconstruction list to indicate the replacement HOSD and completion of data reconstruction.
In a further aspect of the present invention, a data storage system including an Erasure Code Group (ECG) cluster of Hybrid Object Storage Devices (HOSDs) is disclosed. One of the ECG cluster of HOSDs is assigned as a primary HOSD. The primary HOSD includes a non-volatile (NV) cache, a reconstruction list, a reconstruction processor and one or more communication interfaces. The NV cache includes a local cache which stores object data from the primary HOSD. The reconstruction list indicates a status of failed HOSD reconstruction. The reconstruction processor is coupled to the NV cache and the reconstruction list, the reconstruction processor reconstructing failed HOSD data and updating the status of the failed HOSD reconstruction in the reconstruction list. The one or more communication interfaces are coupled to the reconstruction processor for communicating with a client/application server and for communicating with other HOSDs in the cluster of HOSDs.
The accompanying figures, where like reference numerals refer to identical or functionally similar elements throughout the separate views and which together with the detailed description below are incorporated in and form part of the specification, serve to illustrate various embodiments and to explain various principles and advantages in accordance with the present invention, by way of non-limiting example only.
Skilled artisans will appreciate that elements in the figures are illustrated for simplicity and clarity and have not necessarily been depicted to scale.
The following detailed description is merely exemplary in nature and is not intended to limit the invention or the application and uses of the invention. Furthermore, there is no intention to be bound by any theory presented in the preceding background of the invention or the following detailed description. It is the intent of this invention to present a system and methods for data reconstruction which provide uninterrupted data services while recovering from disk failures.
Referring to
Once a HOSD failure is identified, the primary HOSD 106 begins the reconstruction process. If there is a read request or a write request from the client/application server 108 to access data from the failed HOSD 110 during reconstruction, the data will be reconstructed by the primary HOSD 106 by computing data read out from other available HOSDs 104. The reconstructed data is then sent back to the client/application server 108 from the primary HOSD 106. The primary HOSD 106 can also send the data to a replacement HOSD 112 and update a reconstruction list maintained by the primary HOSD 106 to indicate that the data has been reconstructed.
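By way of non-limiting illustration, the computation of failed data from data read out of the other available HOSDs can be sketched as follows. The description does not specify the erasure code used, so this sketch assumes a single-parity XOR stripe as a stand-in; the names `xor_blocks` and `reconstruct_missing` and the sample stripe are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct_missing(surviving_blocks):
    """Recover the single missing block of an XOR-parity stripe.

    Assumption: exactly one block of the stripe is lost and the stripe
    carries one XOR parity block. A real erasure code may differ.
    """
    return xor_blocks(surviving_blocks)


# Hypothetical stripe: three data blocks plus one parity block.
data = [b"abcd", b"efgh", b"ijkl"]
parity = xor_blocks(data)

# Suppose the device holding data[1] has failed: rebuild it from the rest.
recovered = reconstruct_missing([data[0], data[2], parity])
```

In this sketch the primary HOSD would return `recovered` to the client/application server and forward the same bytes to the replacement HOSD.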
Referring to
Referring to
In accordance with a first optimized reconstruction process, data in the ECG cache 304 is reconstructed when one of the HOSDs 204 in the ECG fails. The primary HOSD 206 reconstructs the data of the failed HOSD 204 in the ECG cache 304 with a high priority. The data reconstruction can be done either by directly copying the data available in the ECG cache 304 to a replacement HOSD or by computing the data based on available data in the ECG cache 304 and then writing the computed data to the replacement HOSD. The primary HOSD 206 can then update the reconstruction list.
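The two cache-based paths of the first optimized reconstruction process (direct copy, or computation from other cached data) can be sketched as below. This is a minimal sketch under stated assumptions: the ECG cache is modeled as a dictionary keyed by (device, object), the erasure code is assumed to be single-parity XOR, and the function name `rebuild_from_cache` and the device names are hypothetical.

```python
from functools import reduce
from operator import xor


def rebuild_from_cache(object_id, failed_id, ecg_cache, stripe_members):
    """Return (data, method) for the failed device's piece of an object,
    or None when the cache alone cannot supply it.

    ecg_cache maps (device_id, object_id) -> bytes (hypothetical layout).
    """
    key = (failed_id, object_id)
    if key in ecg_cache:
        # Path 1: the failed device's data is already cached; copy it.
        return ecg_cache[key], "copied"
    peers = [(dev, object_id) for dev in stripe_members if dev != failed_id]
    if all(peer in ecg_cache for peer in peers):
        # Path 2: compute it from the cached pieces of the other members
        # (assumed XOR parity, not necessarily the code actually used).
        blocks = [ecg_cache[peer] for peer in peers]
        return bytes(reduce(xor, col) for col in zip(*blocks)), "computed"
    return None


members = ["hosd1", "hosd2", "parity"]
cache = {
    ("hosd1", "obj"): b"\x01\x02",
    ("hosd2", "obj"): b"\x04\x08",
    ("parity", "obj"): b"\x05\x0a",  # XOR of the two data pieces
}
copied = rebuild_from_cache("obj", "hosd1", cache, members)

del cache[("hosd1", "obj")]
computed = rebuild_from_cache("obj", "hosd1", cache, members)
```

A `None` result corresponds to the case in which the data must instead be rebuilt by reading from the other available HOSDs.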
In accordance with a second optimized reconstruction process, data requested by the client/application server 108 includes data from a failed HOSD 204 in the ECG. If the read/write request from client/application server 108 to access the data from the failed HOSD is received during reconstruction, the data being accessed will be reconstructed on the fly with a high priority by computing data read out from other available HOSDs 204, and then sending the computed data back to the client/application server 108. In the meantime, the primary HOSD 206 will also send the data to a replacement HOSD and update the reconstruction list in the primary HOSD 206 to indicate that the object data has been reconstructed.
In accordance with a normal reconstruction process, the primary HOSD 206 reconstructs the data by reading data from other available HOSDs 204 and recomputing the read data to recover the data. Once completed, the primary HOSD 206 will write the recomputed data to a replacement HOSD and update the reconstruction list.
Referring to
A reconstruction list 404 indicates a status of failed HOSD reconstruction. A reconstruction processor 406 is coupled to the NV cache 402 and the reconstruction list 404 and reconstructs failed HOSD data as well as updates the status of the failed HOSD reconstruction in the reconstruction list 404. A first communication interface 408 couples the reconstruction processor 406 to the client/application server 108 for communication therewith and a second communication interface 408 couples the reconstruction processor 406 to the other HOSDs 204 in the ECG 202 for writing data to or reading data from the HOSDs 204 and for retrieving local cache data from the HOSDs 204 for storing into the ECG cache 304. The reconstruction processor 406 also communicates with the HOSDs 204 via the second communication interface to detect when one of the HOSDs 204 fails and to assign an available HOSD 204 as a replacement HOSD.
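One possible in-memory shape for a reconstruction list of this kind is sketched below. The description states only that the list indicates reconstruction status and the replacement device; the per-object granularity, the class and field names, and the two status values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    RECONSTRUCTED = "reconstructed"


@dataclass
class ReconstructionList:
    """Hypothetical per-object status tracker kept by the primary HOSD."""
    failed_device: str
    replacement_device: str
    entries: dict = field(default_factory=dict)  # object_id -> Status

    def add(self, object_id):
        # Register an object that still has to be rebuilt.
        self.entries.setdefault(object_id, Status.PENDING)

    def mark_reconstructed(self, object_id):
        self.entries[object_id] = Status.RECONSTRUCTED

    def pending(self):
        return [o for o, s in self.entries.items() if s is Status.PENDING]

    def is_complete(self):
        return not self.pending()


rlist = ReconstructionList(failed_device="hosd3", replacement_device="hosd7")
for obj in ("a", "b"):
    rlist.add(obj)
rlist.mark_reconstructed("a")
```

Here `rlist.pending()` lists the objects still awaiting reconstruction, and `is_complete()` becomes true once every registered object has been marked.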
Referring to
When the reconstruction processor 406 determines 506 that the read/write request is requesting failed data, reconstruction of the requested data is prioritized so that the requested data is immediately reconstructed 508 and, once reconstructed 508, is sent 510 to the client/application server 108. In this manner, uninterrupted data services with the client/application server 108 can be conducted by the primary HOSD 206 even while the ECG 202 is recovering from a disk failure. As discussed above, the requested data can be reconstructed from object data in the ECG cache 304 or from data in the HOSDs 204.
After the requested data is sent 510 to the client/application server 108, it is then sent 512 to a replacement storage device, the replacement storage device being one of the HOSDs 204 assigned as a replacement storage device by the reconstruction processor 406. The reconstruction processor 406 then updates 514 the reconstruction list 404 to indicate the replacement one of the HOSDs 204. Normal reconstruction processing continues until either another read/write request is received 504 or processing is completed. When all reconstruction is complete, the reconstruction processor 406 updates the reconstruction list 404 to indicate the completion of data reconstruction.
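The prioritization described above, in which requested data is rebuilt ahead of the normal reconstruction order, can be sketched as a simple loop. This is an illustrative simulation only: all client requests are queued up front rather than arriving over time, and the names `run_reconstruction`, `pending`, `requests` and `rebuild` are hypothetical.

```python
from collections import deque


def run_reconstruction(pending, requests, rebuild):
    """Rebuild every object in `pending`, serving queued requests first.

    pending:  object ids on the failed device, in normal rebuild order
    requests: deque of object ids requested by the client
    rebuild:  callable mapping an object id to its reconstructed bytes
    Returns (order, served, replacement).
    """
    todo = deque(pending)
    replacement = {}  # stands in for the replacement storage device
    order, served = [], []
    while todo or requests:
        if requests:
            # A requested object jumps the queue.
            obj = requests.popleft()
            if obj not in replacement:
                replacement[obj] = rebuild(obj)
                order.append(obj)
                if obj in todo:
                    todo.remove(obj)
            served.append((obj, replacement[obj]))
        else:
            # No outstanding request: continue normal reconstruction.
            obj = todo.popleft()
            replacement[obj] = rebuild(obj)
            order.append(obj)
    return order, served, replacement


order, served, replacement = run_reconstruction(
    ["a", "b", "c", "d"],
    deque(["c"]),
    lambda obj: obj.upper().encode(),  # placeholder for real reconstruction
)
```

With these inputs the requested object `"c"` is rebuilt and served before the remaining objects are processed in their normal order.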
Thus, it can be seen that the present embodiment can provide optimized uninterrupted data services even while recovering from disk failures. In addition, it provides advantageous methods for reconstruction of failed disks from either an Erasure Code Group (ECG) cache in a primary Hybrid Object Storage Device (HOSD) within the ECG or from one or more other HOSDs in the ECG. While exemplary embodiments have been presented in the foregoing detailed description of the invention, it should be appreciated that a vast number of variations exist.
It should further be appreciated that the exemplary embodiments are only examples, and are not intended to limit the scope, applicability, operation, or configuration of the invention in any way. Rather, the foregoing detailed description will provide those skilled in the art with a convenient road map for implementing an exemplary embodiment of the invention, it being understood that various changes may be made in the function and arrangement of elements and method of operation described in an exemplary embodiment without departing from the scope of the invention as set forth in the appended claims.
| Number | Date | Country | Kind |
|---|---|---|---|
| 10201406331V | Oct 2014 | SG | national |

| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/SG2015/050355 | 9/30/2015 | WO | 00 |