The present invention is generally directed to a data recovery system and method, and in particular, to a technique for implementing a parity-based recovery system that is transparent to, or otherwise not apparent or visible to, the host system. This transparency allows for implementation on virtually any operating system or machine, including but not limited to Windows-based machines, and allows for the use of completely independent, swappable hard drives or storage devices that do not share data, do not impact the data on other drives, and do not impact the implementation of the data recovery system and method of the present invention.
As more and more computer users, individuals and companies rely on the electronic storage of data for day-to-day operations, the need for data security in the event of a minimal or catastrophic data failure grows. Accordingly, in order to meet the demand and/or need for data security and recovery, many people resort to disk arrays, oftentimes referred to as a Redundant Array of Independent or Inexpensive Disks (RAID), which utilize redundancy to provide protection for data stored on the array.
Furthermore, while RAID implementations do generally provide redundancy or data security, they also distribute input and output workload across the multiple data drives associated with the single RAID array. For instance, the device driver or interface sends communications to the RAID system, which distributes the tasks among the various disk drives. In this regard, each of the disk drives associated with a common RAID array is in fact dependent on the others in that the drives do not include independent file systems or independent data. For instance, data corresponding to a single data file (e.g., a single image, photo, video, document, operating system file, etc.) is spread throughout the entire array and across multiple data disks, such that some pieces may reside on one drive and other pieces may reside on a different drive. For this reason, the drives associated with a common RAID array do not contain full and independent file systems, and therefore cannot be removed from the RAID array and be usable in another computer system without reformatting or destroying all of the data contained thereon.
Furthermore, it has often been a challenge for users of RAID arrays to add new data drives to an existing RAID array without losing data or having to reformat the entire array. Traditionally, in order to add a data drive to a RAID array, the entire RAID array must be backed up and reformatted, the new drive added, and the data then recovered. The steps involved are tedious, time consuming, and, most importantly, require the reformatting of a data drive and the loss of existing data.
There is thus a need in the art for a new system and method for providing parity or other recovery protection while maintaining truly independent disk drives, such that the system and method is transparent to the host system, allowing for greater flexibility, modification, and implementation. In particular, the proposed system and method would function in a manner that is unknown to the host system, meaning that the host system communicates with each independent disk drive as if it were separate from the others and not disposed within an array. Accordingly, each of the drives would function independently of one another, could be removed, replaced, and added at will, and would contain full and complete file systems, rather than being disposed in a common RAID array.
The proposed system and method would function to intercept or process the input/output or read/write operations between the host computer and the disk drives in order to compute recovery or parity data. The recovery or parity data would then be stored in one or more virtual or physical drives for use if any data is lost, corrupted, or failed. Such a system and method would provide the advantages of data security without the downfall of requiring combined disks in a single array.
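The interception described above can be sketched in a few lines. The following is a minimal, in-memory illustration only; the names (`DataDrive`, `intercepted_write`) are hypothetical and do not appear in the specification, and a real implementation would operate at the device-driver level rather than on Python objects.

```python
# Hypothetical sketch of a transparent write path: parity is updated before the
# host's original write completes, and the host never sees the parity step.

class DataDrive:
    """Stands in for one independent storage device with its own block space."""
    def __init__(self):
        self.blocks = {}                    # logical block address -> bytes

    def read(self, lba, default):
        return self.blocks.get(lba, default)

    def write(self, lba, data):
        self.blocks[lba] = data

def intercepted_write(drive, parity_drive, lba, new_data, block_size=4):
    """Intercept a host write: update the parity block incrementally
    (new_parity = old_parity XOR old_data XOR new_data), then complete
    the original write on the target device unchanged."""
    zeros = bytes(block_size)
    old_data = drive.read(lba, zeros)
    parity = bytearray(parity_drive.read(lba, zeros))
    for i in range(block_size):
        parity[i] ^= old_data[i] ^ new_data[i]
    parity_drive.write(lba, bytes(parity))
    drive.write(lba, new_data)              # the host observes only this write
```

Because the parity update is read-modify-write on a single parity block, each data drive keeps its own complete file system while the parity drive accumulates the XOR of the corresponding blocks across drives.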
The present invention, as described herein, is generally directed to a system and method for implementing parity-based data recovery in a host computer system and/or implementing parity-based RAID recovery schemes without the inherent downfalls associated with common or traditional RAID systems. Specifically, as will become apparent from the description herein, the system and method of the various embodiments of the present invention is structured and configured to be transparent to the host system in that the host system performs normal, standard or traditional read/write operations on attached or accessible data storage devices or hard disk drives, while the present invention performs the various tasks to implement a data recovery system and method unbeknownst to the host system.
Certain advantages of the transparency of the system and method of the present invention include, but are not limited to, that the system and method may be implemented on virtually any computer or operating system. Particularly, the system and method of the present invention will not impact or interrupt the generation and/or communication of certain input/output commands to and from the host system. This allows the present invention to be implemented without having to modify or reconfigure the operating system kernel or controller operating the IO commands with the data storage devices, as is common with implementing traditional RAID levels.
It should also be noted that, with the implementation of the system and/or method of the present invention, the data storage drives may remain independent of one another, unlike traditional RAID configurations. In particular, each drive or data storage device may be removed, replaced, or added at will without impacting the data stored on other drives, and without having to format, reformat or modify any of the drives involved. Accordingly, each of the various hard drives or data storage devices may, but need not necessarily, include different or independent file systems, e.g., NTFS, FAT32, etc.
Accordingly, the various embodiments of the present invention include a data recovery management controller, implemented via software, hardware or any combination thereof, which is structured and configured to intercept, receive, and/or otherwise process input/output or read/write commands associated with a target storage device. The data recovery management controller can be added to virtually any machine without having to modify the host system, the kernel or input/output controllers. Accordingly, as mentioned above, the host system does not see a RAID implementation and instead sees independent hard drives or storage devices.
The data recovery management controller is configured to calculate parity or other recovery information or data and store the recovery data in a virtual or physical drive allocated for the same. When or if needed, the recovery data can be accessed to reconstruct failed, corrupt or lost data.
These and other objects, features and advantages of the present invention will become more apparent when the drawings as well as the detailed description are taken into consideration.
Like reference numerals refer to like parts throughout the several views of the drawings provided herein.
As shown in the accompanying drawings, the present invention is generally directed to a system 10 and method 100 for implementing parity-based data recovery in a host computer system 20. In particular, and as will become apparent from the description provided herein, the system 10 and method 100 of the various embodiments of the present invention are implemented on or otherwise accessible by one or more host computer systems 20. The host computer system 20 is cooperatively structured and configured to perform various input and output control sequences, operations or commands on one or more attached or accessible data storage device(s) 25 (e.g., virtual or physical hard disk drives) as if the system 10 and method 100 were not present. Specifically, as will become apparent from the description herein, the system 10 and method 100 of the various embodiments is structured and configured to be transparent to the host system 20 in that the host system 20 performs normal, standard or traditional read/write operations on the attached or accessible data storage devices 25, while the present invention performs the various tasks to implement a data recovery system 10 and method 100 unbeknownst to the host system 20.
Certain advantages of the transparency of the system 10 and method 100 of the present invention include, but are not limited to, that the system 10 and method 100 may be implemented on virtually any computer or operating system, such as a WINDOWS system, LINUX system, APPLE OS system, etc. Particularly, the system 10 and method 100 of the present invention will not impact or interrupt the generation and/or communication of certain input/output (IO) commands to and from the host system 20. This allows the present invention to be implemented without having to modify or reconfigure the operating system kernel or controller operating the IO commands with the data storage devices 25, as is common with implementing traditional RAID levels.
Other advantages, which will be described hereinafter, include the ability to remove, replace, or add one or more data storage devices 25 to and from the host system 20 and the system 10 and method 100 of the present invention without impacting the data stored on other data storage devices 25, and without having to format, reformat or modify any of the data storage devices 25 involved. Accordingly, for exemplary purposes only, a hard drive containing already stored data (e.g., removed from another machine) can be added to the system 10 and method 100 without having to reformat the drive and without losing the preexisting data stored thereon. Similarly, a hard drive or data storage device may be removed from the system 10 and method 100 of the present invention and connected to a different machine or system without having to reformat or lose the data stored thereon. Accordingly, it should also be noted that each of the various hard drives or data storage devices 25 may, but need not necessarily, include different and/or independent file systems (e.g., NTFS, FAT32, etc.) in that the data storage devices 25 of the various embodiments of the present invention are independent of one another, are not spliced or combined like traditional RAID formats, and can be independently accessed, written to, read from, removed, added, etc. without impacting the system 10 or method 100 of the present invention. Particularly, the data storage devices are not dependent on one another in that each comprises a complete and independent file system, meaning that a single file system is not shared or installed on multiple storage drives. Rather, a single storage drive contains a full, complete and independent file system.
Referring now to
Particularly, still referring to
Moreover, the various data storage devices 25, as described herein, may include any virtual or physical disk or device that can be used or is used to store and/or retrieve various data and/or media in online and/or offline modes. Accordingly, the data storage device 25 may include, but is certainly not limited to, an internal or external hard disk drive, solid state drive, USB drive, flash drive, virtual drive, network drive, etc. Similarly, the one or more recovery storage devices 35 may include any virtual or physical device or structure capable of storing and/or retrieving recovery data for subsequent use in the event of a data or disk failure. Particularly, the various recovery storage devices 35 may include, but are certainly not limited to, an internal or external hard disk drive, solid state drive, USB drive, flash drive, virtual drive, network drive, etc. In certain embodiments, the recovery storage device(s) 35 may be separate, independent or dedicated drives or storage units; however, in other embodiments, the recovery storage device(s) 35 may include a shared, partitioned, or dedicated space on one or more of the data storage devices 25.
In any event, the various data storage devices 25 of the present invention are considered independent storage devices in that each one can be separately formatted and/or partitioned as desired without any impact or disruption to the other storage devices 25 of the system 10 and without any impact or disruption to the data stored on the other storage devices 25 of the system 10. Furthermore, each of the independent storage devices 25 of the present invention can host one or more unique volumes, each with a different (although not required) file system, and each volume being accessible via a separate interface the device 25 presents to the host system 20. Finally, an independent data storage device 25, as used herein, can be removed from or added to the system 20, as desired, with its volumes remaining readable and/or writable in another system outside of and different from the host system 20 and recovery system 10 of the present invention. It should also be noted that the data storage devices 25 may comprise a one-to-one mapping with the host system 20, meaning that the data does not overlap from one storage device 25 to another. In contrast, traditional RAID systems will map multiple hard drives to a single controller, allowing the data contained in a single read or write operation to span across multiple drives.
Furthermore, the data recovery management controller 30 of at least one embodiment of the present invention is structured to intercept, obtain, or otherwise read and process certain input and/or output commands between the host system 20 and the data storage devices 25. Specifically, in certain embodiments, the data recovery management controller 30 can obtain or process the commands by pulling them from the host computer 20, driver, etc. In other embodiments, however, the device driver may be configured to push or forward the commands to the data recovery management controller 30 of the present invention for processing thereby.
In particular, the host system 20 is structured to send read/write (and other) commands to and from a target data storage device 25, either directly or via a device driver, controller, object, etc. The target data storage device is the storage device 25 with which the host computer 20 is communicating, either directly or indirectly. Particularly, oftentimes, the host computer 20 communicates with or otherwise comprises one or more device drivers or objects 22 conceptually disposed between the host computer 20 and the data storage device 25 that controls, sends, and receives commands to and from the host computer 20 and the target data storage device 25. Specifically, and for exemplary purposes only, in a WINDOWS® based machine running on a MICROSOFT WINDOWS® operating system, the device driver(s) 22 may include or otherwise implement a Windows Physical Device Object (PDO) for a disk device.
As shown in
In certain embodiments, the host system 20 may communicate with a single device driver 22 that is common to all of the storage devices 25. In that case, each storage device 25 includes or is otherwise communicatively disposed with an abstracted representation within the device driver 22. It should be noted that each storage device 25, in at least one embodiment, may be communicative with separate and distinct device drivers 22.
For example,
In certain embodiments, the data recovery management controller 30 will determine whether the command is within a predetermined tolerance level, as shown at 60 in
Still referring to
In particular, the recovery data, as used herein, may include parity data or other type of data that is configured to be used in a manner to assist with the recovery of lost or failed data, for example, due to a failed or corrupt data drive. In general, and for exemplary purposes only, parity data can include, but is not necessarily limited to, an XOR computation of all of the data blocks. The parity data of at least one embodiment is then stored in a parity or recovery drive 35. When the target drive 25 fails or becomes corrupted, the parity data can be used in combination with data that did not fail or become corrupt in an attempt to recover or reconstruct the failed data.
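The XOR computation mentioned above can be illustrated concretely. The following is a short sketch only; the function name `compute_parity` is an assumption for illustration and not drawn from the specification.

```python
from functools import reduce

def compute_parity(blocks):
    """XOR the corresponding bytes of equal-length data blocks
    to produce a single parity block of the same length."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Example: parity over three 4-byte blocks, one from each of three drives.
blocks = [b"\x0f\x0f\x0f\x0f", b"\xf0\xf0\xf0\xf0", b"\x55\x55\x55\x55"]
parity = compute_parity(blocks)   # b"\xaa\xaa\xaa\xaa"
```

Because XOR is associative and self-inverse, any one of the three blocks can later be regenerated by XOR-ing the parity block with the two blocks that survive.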
Furthermore, if the intercepted or forwarded command from the device driver 22 to the data recovery management controller 30 includes a read recovery command 42, such as in the case where the host system 20 attempted to read from a corrupt or failed data storage device 25, the data recovery management controller 30 is configured to determine whether the read command or read recovery command is within a certain tolerance level that can still recover or reconstruct at least some of the failed, lost or corrupt data. If so, as referenced at 54, the data recovery management controller 30 is configured to read the surviving data, or otherwise the data that is capable of being read or interpreted (if any), from the target data storage device(s) 25. The data recovery management controller 30 will further conduct a read command or otherwise obtain the recovery or parity data previously computed and stored. As shown at block 55 of
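The reconstruction step can be sketched as follows. This is an illustrative fragment only; the function name `reconstruct` is an assumption, and the sample blocks are hypothetical values chosen for the example.

```python
def reconstruct(surviving_blocks, parity_block):
    """Recover a lost block by XOR-ing the parity block with
    every surviving block of the same stripe."""
    recovered = bytearray(parity_block)
    for block in surviving_blocks:
        for i, byte in enumerate(block):
            recovered[i] ^= byte
    return bytes(recovered)

# Suppose the drive holding b"\xf0\xf0\xf0\xf0" failed. The stored parity,
# b"\xaa\xaa\xaa\xaa", is the XOR of all three original blocks.
surviving = [b"\x0f\x0f\x0f\x0f", b"\x55\x55\x55\x55"]
lost = reconstruct(surviving, b"\xaa\xaa\xaa\xaa")   # b"\xf0\xf0\xf0\xf0"
```

This mirrors the tolerance notion above: if more blocks are unreadable than the parity scheme covers, the XOR no longer resolves to the missing data, and the recovery attempt fails.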
Referring now to
In certain embodiments, the data recovery management controller 30 is transparent to the host system 20 in that the host system 20 operates without regard to the data recovery management controller 30. Specifically, the host system 20 is structured to perform standard or traditional read and write operations to the data storage device(s) 25, for example, via device driver(s) 22. In this manner, the kernel or operative system of the host system 20 need not be modified or specifically configured to conform to the system 10 or method 100 of the present invention. This allows the system and method 100 of the various embodiments disclosed herein to be installed, retrofitted, or implemented on virtually any already running, operating and configured host system 20.
Furthermore, the data storage devices are defined as being independent and capable of being removed from the host system without impacting data stored on the data storage device itself or any remaining data storage devices 25 still connected to or communicable with the host system 20. In particular, the data storage devices of at least one embodiment each comprise separate and independent file systems and full data schemes, meaning that the data written to each data storage device is complete and not shared amongst other data storage devices. Particularly, in a traditional RAID configuration, multiple drives are mapped together as one drive such that the host system will write data to the drives and the data will span across or be shared by multiple drives. This does not allow the drives to be separated from one another without losing data, reformatting, etc., and consequently, the drives in traditional RAID implementations are not independent.
Still referring to
Once computed, the parity or other recovery data is then stored in the parity or recovery data storage device(s) 35, as generally illustrated as reference character 106. In certain embodiments, the parity or recovery data is calculated and/or stored independently for each of the plurality of data storage devices 25. In other words, the parity data can be grouped or saved in a manner such that it is recoverable for each data storage device independently. For example, if one data storage device 25 fails, the parity data for the failed drive can be easily located or retrieved for data recovery.
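One way the per-device grouping above might be organized is an index keyed by device identifier, so the recovery data for a single failed drive can be located without touching the groupings for other drives. This is a hypothetical sketch only; the names (`store_parity`, `parity_for_device`) are illustrative and not from the specification.

```python
# Illustrative parity index: device identifier -> {logical block address: parity block}.
parity_index = {}

def store_parity(device_id, lba, parity_block):
    """File a parity block under the device it protects."""
    parity_index.setdefault(device_id, {})[lba] = parity_block

def parity_for_device(device_id):
    """Retrieve every parity block for one drive, e.g. after that drive fails,
    leaving the parity for all other drives untouched."""
    return parity_index.get(device_id, {})
```

Keyed this way, a failed drive's recovery data is retrievable in one lookup, consistent with the goal that each data storage device remain independently recoverable.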
Particularly, the recovery data can be used at a later time, if necessary, in order to recover or reconstruct failed or lost data, for example, in the event of a failed or corrupt drive. For instance, surviving data (e.g., data that is not corrupt, lost, or failed) can be read by the method 100′ and compared or computed with the parity or other recovery data to reconstruct, regenerate, or recover the failed or lost data. As shown at 108, the original write command may then be completed on the target device, for example, by saving the data to the target data storage device 25.
Referring now to
This written description provides an illustrative explanation and/or account of the present invention. Equivalent benefits and insights may be delivered using variations of the sequence, steps, specific embodiments and methods without departing from the inventive concept. This description and these drawings, therefore, are to be regarded as illustrative and not restrictive.
Now that the invention has been described,