The present invention relates generally to data storage, and, more specifically, to systems, methods, computer program products, and apparatuses for enhancing the performance of data duplication.
Generally, massive storage systems are used to store large quantities of objects in a network environment (e.g., a cloud). These storage systems are typically designed to handle many billions of objects and tens to hundreds of petabytes of data. These storage systems may be implemented in datacenters, storage pools, or storage clusters. As time passes and storage hardware ages, the integrity of stored objects may degrade, and the objects may become corrupted.
In order to combat this data corruption, a storage system may store redundant copies of an object in the same or redundant datacenters. When the storage system detects a corrupted object, it may repair the object by, for example, replacing the corrupted object with an uncorrupted copy. As redundancy increases, the data durability promise (i.e., the reliability) of the storage system increases. Furthermore, increased redundancy may also improve read performance by allowing parallel reads of multiple copies of an object.
In a typical storage system, a certain requisite number of copies (e.g., three) of each stored object is maintained at all times, for example, to ensure an acceptable reliability level. When a disk failure causes certain objects to have fewer than the requisite number of copies, a series of data duplication activities is triggered to replace the missing copies. In order to maintain the system's reliability level, these data duplication activities should be performed immediately. However, as storage system disk sizes increase, these data duplication activities can be time intensive and require a high percentage of system resources. For example, if a 3 TB disk fails, replicating the data on the disk may take thirty minutes or more. Furthermore, the storage system's performance during this replication timeframe may suffer due to the replication process's intensive use of system resources (e.g., processing power).
These and other problems are generally solved or circumvented, and technical advantages are generally achieved, by preferred embodiments of the present invention which provide enhanced performance for data duplication.
In accordance with an example embodiment, enhanced performance for data duplication may be achieved by creating additional copies of objects in a storage system. For example, a mechanism for data duplication receives an object for data storage and creates a requisite number of copies of the object. The requisite number of copies may be based on a minimum number of data duplication sets defined by a system policy. The mechanism stores the object and the requisite number of copies in the storage system. Furthermore, the mechanism creates and stores an additional copy of the object over the requisite number of copies in accordance with the occurrence of one or more events monitored against predetermined data duplication criteria defined by the system policy.
In accordance with another example embodiment, a mechanism for data duplication stores a plurality of objects in a storage system. A requisite number of copies, based on a minimum number of data duplication sets as defined by a system policy, is maintained for the plurality of objects. The mechanism further stores one or more additional copies over the requisite number of copies for at least some of the plurality of objects in the storage system. The storage of one or more additional copies is in accordance with the occurrence of one or more events monitored against predetermined data duplication criteria also defined by the system policy.
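For illustrative purposes only, the mechanism described in the two example embodiments above may be sketched as follows. The names used here (SystemPolicy, StorageSystem, and the idle-CPU criterion) are hypothetical assumptions for illustration and form no part of the disclosed system.

```python
# Hypothetical sketch: store an object with its requisite copies, then
# add an extra copy when a monitored event satisfies the policy criteria.
class SystemPolicy:
    def __init__(self, requisite_copies=3, idle_cpu_threshold=0.25):
        self.requisite_copies = requisite_copies      # minimum data duplication sets
        self.idle_cpu_threshold = idle_cpu_threshold  # example duplication criterion

class StorageSystem:
    def __init__(self, policy):
        self.policy = policy
        self.copies = {}  # object_id -> list of stored replicas

    def put(self, object_id, data):
        # Create and store the requisite number of copies immediately.
        self.copies[object_id] = [data for _ in range(self.policy.requisite_copies)]

    def on_monitored_event(self, cpu_utilization):
        # Compare the monitored event against the predetermined criteria;
        # if the system is idle enough, create one additional copy.
        if cpu_utilization < self.policy.idle_cpu_threshold:
            for replicas in self.copies.values():
                if len(replicas) == self.policy.requisite_copies:
                    replicas.append(replicas[0])  # copy beyond the requisite number
```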
Other embodiments are also disclosed.
For a more complete understanding of the present invention, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings.
Example embodiments covering various aspects of the encompassed innovation are discussed in greater detail below. It should be appreciated, however, that the present invention provides many applicable unique and novel concepts that can be embodied in a wide variety of specific contexts. Accordingly, the specific embodiments discussed herein are merely illustrative of specific ways to make, use, and implement various aspects of the present invention, and do not necessarily limit the scope thereof unless otherwise claimed.
The following example embodiments are described in a specific context, namely a data storage system. As will be appreciated, however, such example embodiments may also be applied to other applications which use redundancy as a mechanism to achieve high reliability.
As shown in FIG. 1, storage system 100 includes a controller 104 coupled to a disk array 106 and a metadata server 108. Storage system 100 further includes a monitoring device 114, which monitors events in storage system 100 against predetermined data duplication criteria 112 defined within a system policy 110. Controller 104 manages the storage of received objects and their copies in disk array 106 in accordance with system policy 110.
Typically, a storage system's capacity (i.e., the total amount of storage space) may be significantly greater than the capacity needed to store existing objects and their requisite copies. For example, users often configure storage systems with extra capacity in anticipation of future storage needs. Therefore, a portion of disk array 106 may remain unused and available. In various example embodiments, as part of managing the storage of the object and its copies, additional copies of the objects in storage system 100 (i.e., more than the requisite number of copies) may be stored in these available portions of disk array 106 in accordance with an occurrence of one or more events in the storage system monitored by monitoring device 114 against predetermined data duplication criteria 112 defined within the system policy 110. Unlike the requisite copies, additional copies need not be created immediately (i.e., when storage system 100 receives the object) and may be created at a lower priority level (e.g., when storage system 100 has excess processing resources, storage space, and the like). For example, if three copies of each object in storage system 100 must be maintained for data resiliency, a fourth copy of each object may be created and stored in available portions of disk array 106 when the system has extra resources (e.g., when the system is otherwise idle). Storage system 100 may make one, two, three, or more copies of each object beyond the requisite number depending on, for example, the use of system resources detected by monitoring device 114 (e.g., storage space in disk array 106, system bandwidth, processing power, and the like) or other events monitored against predetermined data duplication criteria 112 defined within system policy 110, as will be discussed in greater detail below.
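As a concrete illustration of this lower-priority duplication, the following sketch assumes hypothetical monitoring_device.system_idle(), disk_array.free_fraction(), and add_copy() helpers; none of these names come from the disclosure.

```python
import time

MAX_EXTRA_COPIES = 3  # one, two, or three copies beyond the requisite number

def background_duplicator(system, poll_seconds=60):
    # Low-priority loop: create additional copies only when the system
    # is otherwise idle and the disk array has spare capacity.
    while True:
        if (system.monitoring_device.system_idle()
                and system.disk_array.free_fraction() > 0.20):
            for object_id, replicas in system.copies.items():
                extra = len(replicas) - system.policy.requisite_copies
                if extra < MAX_EXTRA_COPIES:
                    system.add_copy(object_id)  # additional, lower-priority copy
                    break  # one copy per pass keeps the resource cost low
        time.sleep(poll_seconds)
```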
In accordance with various example embodiments, predetermined data duplication criteria 112 further include events for replacing corrupted data. For example, when data is corrupted in disk array 106, storage system 100 may immediately restore the requisite number of copies (e.g., three) for each object. However, if an object still has the requisite number of copies in storage system 100 even after data loss (e.g., because an additional fourth copy was stored), immediate duplication may not be necessary. Storage system 100 may create any additional copies over the requisite number of copies based on the occurrence of events monitored by monitoring device 114 against predetermined data duplication criteria 112, for example, when monitoring device 114 detects that the system has extra resources (e.g., excess system bandwidth, storage space, processing power, or the like). In an alternative example, storage system 100 may create any additional copies during off-peak hours (e.g., nighttime) because storage system 100 assumes excess system resources may be available during such hours. Therefore, in such embodiments, if an object has an additional copy stored in disk array 106 and one copy is lost (e.g., due to disk failure), no immediate operations may be required by storage system 100 to restore the lost object. Thus, various example embodiments allow for the expenditure of fewer system resources immediately after data loss and allow for the amortization of creating backup copies over time.
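The repair decision above may be summarized, for illustration only, in the following sketch, where replicate_now() and defer_until_off_peak() are hypothetical placeholders for the immediate and amortized duplication paths.

```python
def on_copy_lost(system, object_id):
    # Immediate re-replication is needed only when the object drops
    # below the requisite count; otherwise the replacement can wait.
    remaining = len(system.copies[object_id])
    if remaining < system.policy.requisite_copies:
        system.replicate_now(object_id)          # restore reliability immediately
    else:
        system.defer_until_off_peak(object_id)   # amortize the cost over time
```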
Furthermore, as disk array 106 becomes full, storage system 100 may stop making additional backup copies and/or delete any additional copies already on disk array 106 in accordance with events monitored by monitoring device 114 against predetermined data duplication criteria 112 defined by system policy 110. For example, if disk array 106's available storage space falls below a first configurable system storage space threshold defined by system policy 110 (e.g., 20% or 10% of the total storage space), then controller 104 may stop creating additional copies of objects. As even more storage space in disk array 106 is used to store objects and requisite copies, controller 104 may start deleting additional copies. For example, if the amount of available storage space falls below a second configurable threshold (e.g., 5%) as defined by system policy 110, then existing additional copies may be deleted.
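A sketch of this two-threshold behavior follows, using the example figures from the text; in practice, both thresholds would be configured by system policy 110 rather than hard-coded.

```python
PAUSE_THRESHOLD = 0.20   # stop creating additional copies below this free fraction
DELETE_THRESHOLD = 0.05  # start deleting additional copies below this free fraction

def capacity_action(free_fraction):
    # Map the disk array's available-space fraction to a policy action.
    if free_fraction < DELETE_THRESHOLD:
        return "delete_additional_copies"
    if free_fraction < PAUSE_THRESHOLD:
        return "pause_additional_copies"
    return "create_additional_copies"
```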
Furthermore, in such embodiments, the predetermined data duplication criteria defined within the system policy may prioritize the making of certain additional copies. For example, the predetermined data duplication criteria may direct the controller to prioritize making at least one additional copy over the requisite number for each object in a storage system before making further additional copies of any particular object. That is, assuming the requisite number of copies in a system is three, the controller may attempt to make a fourth copy of each object in the storage system before making fifth and sixth copies of any one object.
As another example, the predetermined data duplication criteria defined within the system policy may also prioritize making additional copies of objects that have not been updated recently; such objects are referred to as objects in cold storage. In such scenarios, the system may only make additional copies of objects that have not been modified for at least a certain time period (e.g., 1 hour or 1 day), to avoid making additional copies of objects that are likely to be frequently modified. This prevents the expenditure of unnecessary resources on additional copies that are likely to become obsolete in the near future. Generally, the predetermined data duplication criteria defined within the system policy may prioritize making additional copies of objects that are less frequently modified over objects that are frequently updated.
As another example, the predetermined data duplication criteria defined within the system policy may prioritize making additional copies of objects that are frequently queried. Generally, when an object has more copies in the system, each query related to the object may be answered from more sources. Thus, having additional copies of an object may improve query efficiency. Of course, one of ordinary skill in the art would recognize that other criteria may be used to prioritize making certain additional copies, and the examples described here are non-limiting and used for illustrative purposes only, unless otherwise explicitly claimed.
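The three prioritization rules above (breadth-first extra copies, cold objects only, and frequently queried objects first) might be combined as in the following illustrative sketch. The record attributes copy_count, last_modified, and query_rate are assumptions made for illustration only.

```python
import time

COLD_AGE_SECONDS = 3600  # e.g., 1 hour without modification counts as "cold"

def next_object_to_copy(objects):
    # Only cold objects are eligible, to avoid copying data that is
    # likely to be modified (and re-copied) again soon.
    now = time.time()
    cold = [o for o in objects if now - o.last_modified >= COLD_AGE_SECONDS]
    if not cold:
        return None
    # Breadth first: fewest total copies wins, so every object gets a
    # fourth copy before any object gets a fifth or sixth; among ties,
    # prefer frequently queried objects.
    return min(cold, key=lambda o: (o.copy_count, -o.query_rate))
```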
In various example embodiments, this priority defined by the system policy, as described above, may be maintained when the controller deletes additional copies, for example, due to the storage system running out of storage space (e.g., when the storage system is more than 95% full). For example, the controller may try to maintain at least one extra copy of each object (i.e., the controller will delete fifth and sixth copies of all the objects before deleting any fourth copy).
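The matching deletion order might look like the following sketch, again using an assumed copy_count attribute: the objects with the most copies lose an extra copy first, so fourth copies survive the longest.

```python
def next_copy_to_delete(objects, requisite=3):
    # Reclaim space from the object with the most copies first,
    # preserving at least one extra (fourth) copy per object as long
    # as possible; requisite copies are never deleted.
    candidates = [o for o in objects if o.copy_count > requisite]
    if not candidates:
        return None  # only requisite copies remain; delete nothing
    return max(candidates, key=lambda o: o.copy_count)
```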
In step 202, the controller identifies an object for which an additional copy is to be made, for example, in accordance with the predetermined data duplication criteria. In step 204, the controller copies the identified object and manages storage of the additional copy, for example, in a disk array. In step 206, the controller updates any applicable metadata servers (e.g., server 108) with information about the new copy.
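Steps 204 and 206 might be sketched as follows, purely for illustration; read(), write_replica(), and register_copy() are hypothetical helper calls rather than elements of the disclosure.

```python
def make_additional_copy(controller, object_id):
    data = controller.disk_array.read(object_id)  # read an existing copy
    # Step 204: duplicate the object and store the additional copy.
    location = controller.disk_array.write_replica(object_id, data)
    # Step 206: record the new copy's location on the metadata server.
    controller.metadata_server.register_copy(object_id, location)
```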
In such embodiments, additional copies may not be immediately updated when a user requests an update. This is due to the likelihood that the object may be modified again in the near future; such objects may be referred to as objects in hot storage. For example, a user requesting an update may be working on a file and may request that the file be updated again in the near future. If all additional copies were updated in real time, an unnecessary amount of system resources would be expended making additional duplications over the requisite number. Therefore, various embodiments may only create additional copies of objects in cold storage (i.e., objects that have not been updated within a given time period (e.g., 1 hour or 1 day)).
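One hypothetical way to realize this deferral is sketched below: requisite copies are updated synchronously, while additional copies are merely marked stale and refreshed only after the object goes cold. All helper names here are assumptions for illustration.

```python
import time

COLD_AGE_SECONDS = 3600  # e.g., 1 hour; the system policy would set this

def on_user_update(system, object_id, data):
    # Update the requisite copies synchronously; additional copies are
    # only marked stale, since a hot object may be updated again soon.
    system.update_requisite_copies(object_id, data)
    system.mark_additional_copies_stale(object_id)
    system.last_modified[object_id] = time.time()

def refresh_stale_copies(system):
    # Background pass: refresh additional copies only once the object
    # has gone cold (unmodified for the configured period).
    now = time.time()
    for object_id in system.stale_objects():
        if now - system.last_modified[object_id] >= COLD_AGE_SECONDS:
            system.refresh_additional_copies(object_id)
```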
The bus may be one or more of any type of several bus architectures, including a memory bus or memory controller, a peripheral bus, a video bus, or the like. The CPU may comprise any type of electronic data processor. The memory may comprise any type of system memory, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous DRAM (SDRAM), read-only memory (ROM), a combination thereof, or the like. In an embodiment, the memory may include ROM for use at boot-up and DRAM for program and data storage for use while executing programs.
The mass storage device may comprise any type of storage device configured to store data, programs, and other information and to make the data, programs, and other information accessible via the bus. The mass storage device may comprise, for example, one or more of a solid-state drive, a hard disk drive, a magnetic disk drive, an optical disk drive, or the like.
The video adapter and the I/O interface provide interfaces to couple external input and output devices to the processing unit. As illustrated, examples of input and output devices include the display coupled to the video adapter and the mouse/keyboard/printer coupled to the I/O interface. Other devices may be coupled to the processing unit, and additional or fewer interface cards may be utilized. For example, a serial interface card (not shown) may be used to provide a serial interface for a printer.
The processing unit also includes one or more network interfaces, which may comprise wired links, such as an Ethernet cable or the like, and/or wireless links to access nodes or different networks. The network interface allows the processing unit to communicate with remote units via the networks. For example, the network interface may provide wireless communication via one or more transmitters/transmit antennas and one or more receivers/receive antennas. In an embodiment, the processing unit is coupled to a local-area network or a wide-area network for data processing and communications with remote devices, such as other processing units, the Internet, remote storage facilities, or the like.
While this invention has been described with reference to illustrative embodiments, this description is not intended to be construed in a limiting sense. Various modifications and combinations of the illustrative embodiments, as well as other embodiments of the invention, will be apparent to persons skilled in the art upon reference to the description. It is therefore intended that the appended claims encompass any such modifications or embodiments.