A client computing device, such as a host server or the like, may store data in a primary storage array, and may execute workloads against the data stored in the primary storage array. In some examples, the data stored in the primary storage array may be backed up in a backup appliance, separate from the client computing device and the primary storage array, for redundancy and data protection purposes, or the like. In some examples, the backup appliance may store data in a deduplicated form such that the data is stored more compactly than on the primary storage array.
The following detailed description references the drawings, wherein:
A client computing device, such as a host server or the like, may access a data volume on a primary storage array when performing workloads associated with application(s) on the client computing device. The client computing device may also communicate with a backup computing device to perform backup related tasks, such as creating snapshots of the data volume on the primary storage array. The backup computing device may also act as an interface between the client computing device and a deduplication backup appliance that stores data backups in a deduplicated form.
For example, the client computing device may be able to instruct the backup computing device to create, on the deduplication backup appliance, a deduplicated backup copy of a data volume or snapshot stored on the primary array. The backup computing device may also enable the client computing device to recover data from the deduplicated backups. In some examples, the backup computing device may present the client computing device with a mountable block device presentation of a set of backup object(s) representing a data volume or snapshot that has been backed up to the deduplication backup appliance. In such examples, the client computing device may request portions of data from the block device presentation, and the backup computing device may be able to return those portions of data from the corresponding backup objects stored in the deduplication backup appliance. The backup computing device may also receive writes to the block device presentation from the client computing device, and may store the data of those writes in a cache at the backup computing device.
However, the changes included in those writes may not be applied to the backup objects behind the block device presentation. For example, backup objects may be held immutable for several reasons, such as compliance with legal regulations and maintaining the fidelity of the original backup objects. As such, the writes to the block device presentation may not be persistently maintained, and may be lost upon a restart or power cycle of the backup computing device.
To address these issues, examples described herein may make the cached writes to the block device presentation persistent by creating a new set of backup objects on the deduplication backup appliance that include representations of the cached writes to the block device presentation. In such examples, the new set of backup objects may be created from the first set of backup objects representing the data of the block device presentation before the writes, but with changes to reflect the changes in the cached data writes (i.e., with the data of the writes applied to the new backup objects).
In such examples, making writes to a block device presentation persistent may make the backup computing device more useful for live activities performed against block device presentations. For example, workloads may be performed against the block device presentation at the backup computing device, and the results of those workloads on the data of the block device presentation may be stored persistently. This may reduce the load on the primary storage array, because workloads can be performed against a block device presentation, which does not utilize resources of the primary storage array, rather than against a volume or snapshot of the primary storage array.
For example, in examples described herein, a backup agent may receive, from a client computing device, at least one transient write including data to write to a block device presentation of data represented by first backup objects that include data representations and are stored in a deduplication backup appliance, and the backup agent may store, in a cache, the data received in each transient write to the block device presentation, wherein the block device presentation is presented to the client computing device by the backup agent. In some examples, the backup agent may receive, from the client computing device, a request to persistently store the data of the at least one transient write stored in the cache for the block device presentation, the request specifying how to persistently store the data. In some examples, when the request specifies to persistently store the data in new backup objects, the backup agent may cause the deduplication backup appliance to store second backup objects representing the data stored in the cache for each transient write, such that the second backup objects contain the same data representations as the first backup objects except where replaced by at least one data representation of data stored in the cache for a transient write.
Referring now to the drawings,
Client computing device 150 may also communicate with a backup computing device 100 to perform backup related tasks, such as creating snapshots of base virtual volume 162 on storage array 160. Backup computing device 100 may also act as an interface between client computing device 150 and a deduplication backup appliance 170 that stores data backups in a deduplicated form. In the example of
In the example of
In some examples, client computing device 150 may instruct backup agent 121 to create, on deduplication backup appliance 170, a deduplicated backup copy of a data volume or snapshot stored on the primary array. For example, client computing device 150 may instruct backup agent 121 to create a backup copy of snapshot virtual volume 164 on deduplication backup appliance 170. In such examples, instructions 122, when executed, may read snapshot virtual volume 164 from storage array 160, and cause deduplication backup appliance 170 to store first backup objects 200, representing snapshot virtual volume 164, on deduplication backup appliance 170.
In examples described herein, a “deduplication backup appliance” may be a computing device, such as a storage array or the like, that stores data in a deduplicated form. In the example of
If a hash of a chunk has not already been encountered for the given store on the deduplication backup appliance 170, then the chunk will be stored on the deduplication backup appliance 170 and the hash will be placed in a backup object at a location representing where the corresponding chunk is located in the given set of data. If a hash of a chunk has already been encountered for the given store, then the chunk is considered a duplicate and is not stored again on the deduplication backup appliance 170, as it would be duplicative of a prior copy of the chunk that is stored on the deduplication backup appliance 170, but the hash of the chunk will still be placed in a backup object at a location representing where the corresponding chunk is located in the given set of data. In examples described herein, a hash or hash value is a value resulting from applying a suitable hash function to a chunk of data. Although examples are described herein in relation to the use of hashes as the data representations making up backup objects, any other suitable data representation may be used. For example, the data representations may be any suitable type of data fingerprints derived using any suitable type of data fingerprint function. For example, the data fingerprints may be hashes derived using a hash function, digital signatures derived using a digital signature function, or the like.
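The store-or-skip behavior described above can be illustrated with a minimal sketch. It is not taken from the examples herein: the class name `DedupStore`, the fixed chunk size, and the choice of SHA-256 as the hash function are all assumptions made for illustration only.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4  # hypothetical; deliberately tiny so the example is easy to follow


class DedupStore:
    """Illustrative store: each distinct chunk is kept once, keyed by its hash."""

    def __init__(self) -> None:
        self.chunks: Dict[str, bytes] = {}  # hash -> chunk data

    def put_chunk(self, chunk: bytes) -> str:
        digest = hashlib.sha256(chunk).hexdigest()  # assumed hash function
        if digest not in self.chunks:
            # Hash not yet encountered for this store: keep the chunk.
            self.chunks[digest] = chunk
        # In either case the hash is returned so it can be placed in the backup object.
        return digest

    def backup(self, data: bytes) -> List[str]:
        """Return a backup object: an ordered list of hashes, one per chunk position."""
        return [
            self.put_chunk(data[i:i + CHUNK_SIZE])
            for i in range(0, len(data), CHUNK_SIZE)
        ]

    def restore(self, backup_object: List[str]) -> bytes:
        """Rebuild the original data by looking up each hash in order."""
        return b"".join(self.chunks[digest] for digest in backup_object)
```

Under these assumptions, backing up data whose first and third chunks are identical yields a backup object with three hashes while only two chunks are physically stored.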
As described above, in the example of
In the example of
In examples described herein, a “block device presentation” may be an emulation of a block device including data represented by backup object(s). In some examples, a driver or other executable instructions of backup agent 121 may implement a block device presentation of backup objects by, for example, receiving communications from a client computing device targeting a block device, and providing responses emulating the corresponding responses of a block device. In some examples, the block device presentation (e.g., emulated block device implemented by a driver) may be mountable by a client computing device as if it were an actual block device. In such examples, the client computing device may request portions of data from the block device presentation as it would from an actual block device, and backup agent 121 may be able to return those portions of data from the corresponding backup objects (and chunks) stored in deduplication backup appliance 170.
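As a rough sketch of how such an emulation might serve reads, the following assumes fixed-size chunks (only the last may be shorter), SHA-256 fingerprints, and a plain dictionary standing in for the chunk storage of the deduplication backup appliance. The class `BlockDevicePresentation` and its `read` method are hypothetical names, not an interface defined by the examples herein.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4  # hypothetical fixed chunk size


def fingerprint(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()  # assumed fingerprint function


class BlockDevicePresentation:
    """Read-only emulation of a block device backed by a backup object."""

    def __init__(self, backup_object: List[str], chunk_store: Dict[str, bytes]) -> None:
        self.backup_object = backup_object  # ordered fingerprints
        self.chunk_store = chunk_store      # fingerprint -> chunk data
        self.size = sum(len(chunk_store[fp]) for fp in backup_object)

    def read(self, offset: int, length: int) -> bytes:
        """Return up to `length` bytes starting at `offset`, as a block device would."""
        if offset < 0 or length < 0:
            raise ValueError("offset and length must be non-negative")
        end = min(offset + length, self.size)
        out = bytearray()
        pos = offset
        while pos < end:
            index, within = divmod(pos, CHUNK_SIZE)
            chunk = self.chunk_store[self.backup_object[index]]
            take = min(len(chunk) - within, end - pos)
            out += chunk[within:within + take]
            pos += take
        return bytes(out)
```

A read that spans several chunks is assembled from the corresponding chunk lookups, and a read past the end of the presented data returns only the bytes that exist.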
In the example of
In the example of
In the example of
In some examples, backup agent 121 may receive writes to block device presentation 130 from client computing device 150, and may store the data of those writes in a write cache 105 of backup computing device 100. However, as noted above, the changes included in those writes may not be applied to backup objects (e.g., first backup objects 200) representing the data of block device presentation 130. As noted above, backup objects may be held immutable for several reasons, such as compliance with legal regulations and maintaining the fidelity of the original backup objects. As such, the writes to block device presentation 130 held in write cache 105 may not be persistently maintained, and may be lost upon a restart or power cycle of backup computing device 100. To address these issues, examples described herein may make cached write(s) to block device presentation 130 persistent by creating a new set of backup objects on deduplication backup appliance 170 that include representations of the cached writes to the block device presentation.
Referring again to
In examples described herein, a “transient” write is a request to write data to a block device presentation, wherein the data of the request is not committed or otherwise applied to the backup objects representing the data presented in the block device presentation. As such, those writes only remain as long as the write cache 105 does not lose its data (e.g., by backup computing device 100 losing power, restarting, or the like), so those writes may be referred to herein as “transient” writes, with respect to the block device presentation. In some examples, instructions 122, when executed, do not apply (or cause the deduplication backup appliance 170 to apply) the data of any transient write to the first backup objects 200 at any time. As noted above, in some examples, backup objects may be held immutable for various reasons. In such examples, instructions 122, when executed, may never apply (or cause the deduplication backup appliance 170 to apply) the data of any transient write to the first backup objects 200 at any time.
In examples described herein, the write cache 105 may be implemented by any suitable hardware cache device(s), such as one or more volatile memory device(s), such as one or more volatile random-access memory (RAM) device(s) (e.g., dynamic random access memory (DRAM) device(s)), or the like.
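The relationship between transient writes, the volatile write cache, and the unmodified backup data can be sketched as follows. This is an illustrative model only: `CachedPresentation`, the byte-granular cache, and the `restart` method (standing in for a power cycle) are assumptions, and the backed-up data is modeled as an immutable `bytes` value rather than real backup objects.

```python
from typing import Dict


class CachedPresentation:
    """Overlay of transient writes on top of immutable backed-up data."""

    def __init__(self, backed_up_data: bytes) -> None:
        self._base = backed_up_data          # never modified by any write
        self._cache: Dict[int, int] = {}     # byte offset -> written byte value

    def write(self, offset: int, data: bytes) -> None:
        """Transient write: the data lands only in the volatile cache."""
        if offset < 0 or offset + len(data) > len(self._base):
            raise ValueError("write falls outside the presented device")
        for i, value in enumerate(data):
            self._cache[offset + i] = value

    def read(self, offset: int, length: int) -> bytes:
        """Reads see cached data where present, backed-up data elsewhere."""
        end = min(offset + length, len(self._base))
        return bytes(self._cache.get(i, self._base[i]) for i in range(offset, end))

    def restart(self) -> None:
        """Model a restart or power cycle: volatile cache contents are lost."""
        self._cache.clear()
```

In this sketch, a read after a write returns the written data, but after `restart()` the same read returns the original backed-up data again, which is the loss that persisting the cached writes is meant to avoid.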
In the example of
In some examples, after storing data of at least one transient write in write cache 105, as illustrated in the example of
For example, in an illustrative example in
Although an example is described above in which data of one transient write is represented in the second backup objects, instructions 122 may similarly cause deduplication backup appliance 170 to store second backup objects representing data of multiple transient writes stored in the cache.
For example, write cache 105 may contain data from a plurality of prior transient writes 180 when instructions 122 receive, from client computing device 150, the request 184 to persistently store the data of transient write(s) stored in write cache 105 for block device presentation 130 in new backup objects. In such examples, in response to the request, instructions 122, when executed, may cause deduplication backup appliance 170 to store second backup objects 250 as substantial copies of the first backup objects, having the same data representations (e.g., data fingerprints) as the first backup objects, except where the data representations are replaced in the second backup objects with representations of data of the transient writes stored in write cache 105.
In some examples, instructions 122, when executed, may cause the deduplication backup appliance 170 to store the second backup objects 250 representing the data stored in write cache 105 for the transient writes in the following manner. Instructions 122, when executed, may cause deduplication backup appliance 170 to copy, to second backup objects 250, the data representation(s) (e.g., data fingerprint(s)) of each portion of the first backup objects 200 that represent data of block device presentation 130 that is not written by any transient write whose data is stored in write cache 105. In such examples, instructions 122 may further cause the deduplication backup appliance 170 to store data representation(s) (e.g., data fingerprint(s)) of the data stored in write cache 105 for each transient write in the portion(s) of second backup objects 250 representing data of block device presentation 130 that are written by any of the transient write(s).
In such examples, to store the second backup objects 250, instructions 122, when executed, may read 186 the data representations (e.g., data fingerprints) of each of first backup objects 200 for block device presentation 130 from deduplication backup appliance 170, and may determine, based on the data stored in write cache 105 for transient write(s), which data representations (e.g., data fingerprints) from first backup objects 200 to replace in second backup objects 250 with data representations (e.g., data fingerprints) of the data stored in write cache 105 for the transient write(s) (as described above), and which data representations (e.g., data fingerprints) to copy from the first backup objects 200 to the second backup objects 250 (as described above). For example, for each data representation read from deduplication backup appliance 170, instructions 122 may determine, based on the data stored in write cache 105 for transient write(s), whether it is a data representation of a portion of the block device presentation 130 that has been written by a transient write. If so, then instructions 122 may cause deduplication backup appliance 170 to replace the data representation in the second backup objects 250 with a data representation of the data stored in write cache 105 for the transient write to the corresponding portion of the block device presentation 130. If not, then instructions 122 may cause the deduplication backup appliance 170 to copy the data representation from the first backup objects 200 to the second backup objects 250. While an individual determination may be made for each read data representation, the copy and replacement operations may be performed in groups.
In some examples, to cause the deduplication backup appliance to store the second backup objects, as described above, instructions 122, when executed, may read the data representations of each of the first backup objects 200 for the block device presentation from deduplication backup appliance 170, and determine, based on the data stored in write cache 105 for the transient write(s), which data representation(s) from first backup objects 200 to replace in second backup objects 250 with data fingerprint(s) of data stored in write cache 105, as described above. Based on the determinations, instructions 122 may cause deduplication backup appliance 170 to create and store second backup objects 250 such that they include copies of each of the data representations of the first backup objects 200 determined not to be replaced based on the data stored in write cache 105, and data representation(s) of data stored in write cache 105 to replace respective data representation(s) from the first backup objects 200.
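The copy-or-replace determination described above can be sketched in a few lines. The sketch makes simplifying assumptions that are not stated in the examples herein: each cached transient write covers exactly one whole chunk and is keyed by chunk index, fingerprints are SHA-256 hashes, and a dictionary stands in for chunk storage on the deduplication backup appliance. The function name `build_second_backup_object` is hypothetical.

```python
import hashlib
from typing import Dict, List


def fingerprint(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()  # assumed fingerprint function


def build_second_backup_object(
    first_backup_object: List[str],
    cached_writes: Dict[int, bytes],
    chunk_store: Dict[str, bytes],
) -> List[str]:
    """Return a new fingerprint list; the first backup object is left untouched."""
    second_backup_object: List[str] = []
    for index, existing_fp in enumerate(first_backup_object):
        if index in cached_writes:
            # Portion written by a transient write: replace the fingerprint.
            new_chunk = cached_writes[index]
            new_fp = fingerprint(new_chunk)
            chunk_store.setdefault(new_fp, new_chunk)  # store only if not a duplicate
            second_backup_object.append(new_fp)
        else:
            # Portion not written: copy the fingerprint; no chunk data is moved.
            second_backup_object.append(existing_fp)
    return second_backup_object
```

Because unwritten portions are carried over by copying fingerprints only, the second backup object shares the existing chunks with the first, and only the chunks introduced by the transient writes add data to the store.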
As noted above, instructions 122 may receive, from client computing device 150, a request 184 to persistently store the data of transient write(s) 180 stored in write cache 105 for block device presentation 130.
In some examples, the request 184 may specify to persistently store the data in a new snapshot. In such examples, in response to the request 184, instructions 122, when executed, may instruct storage array 160 to create a child snapshot of an existing snapshot on storage array 160 that is represented by block device presentation 130. In the example of
In other examples, the request 184 may specify to persistently store the data in persistent storage associated with the backup agent. In such examples, in response to the request 184, instructions 122, when executed, may copy the data of each transient write for block device presentation 130 from write cache 105 to persistent storage 107 of a computing device implementing backup agent 121, such as persistent storage 107 of backup computing device 100. In some examples, the persistent storage 107 may be implemented by any non-volatile storage device(s) (e.g., flash device(s), solid state drive(s), or the like), or disk-based storage (e.g., one or more hard disk drives (HDDs)), or the like, or a combination thereof.
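Since the request may specify one of several ways to persist the cached data (new backup objects, a new snapshot, or persistent storage associated with the backup agent), handling it amounts to a dispatch on the requested target. The following is a minimal sketch of that dispatch only; the enum `PersistTarget`, the function `handle_persist_request`, and the handler table are hypothetical and do not describe an actual request format.

```python
from enum import Enum
from typing import Any, Callable, Dict


class PersistTarget(Enum):
    """Hypothetical labels for the ways a request may ask to persist cached writes."""
    NEW_BACKUP_OBJECTS = "new_backup_objects"
    NEW_SNAPSHOT = "new_snapshot"
    AGENT_PERSISTENT_STORAGE = "agent_persistent_storage"


Handler = Callable[[Dict[int, bytes]], Any]


def handle_persist_request(
    target: PersistTarget,
    write_cache: Dict[int, bytes],
    handlers: Dict[PersistTarget, Handler],
) -> Any:
    """Route the cached transient-write data to the handler for the requested target."""
    try:
        handler = handlers[target]
    except KeyError:
        raise ValueError(f"unsupported persist target: {target}") from None
    # Pass a copy so a handler cannot alter the cache while it is being persisted.
    return handler(dict(write_cache))
```

Each handler would wrap one of the behaviors described above, such as causing the deduplication backup appliance to store second backup objects or copying the cached data to persistent storage; those bodies are intentionally left out of this sketch.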
In the example of
In the example of
In the example of
In the example of
In some examples, backup computing device 100 is separate from the deduplication backup appliance, the client computing device, and storage array 160, as illustrated in
In other examples, backup agent 121, write cache 105, and persistent storage 107 may be implemented on deduplication backup appliance 170 (rather than on a computing device separate from deduplication backup appliance 170). In such examples, deduplication backup appliance 170 may comprise processing resource 110 and machine-readable storage medium 120 comprising instructions 122 to (at least partially) implement backup agent 121. In such examples, the computing device that implements the backup agent 121 may be the deduplication backup appliance 170.
As used herein, a “computing device” may be a server, storage device, storage array, desktop or laptop computer, switch, router, or any other processing device or equipment including a processing resource. In examples described herein, a processing resource may include, for example, one processor or multiple processors included in a single computing device or distributed across multiple computing devices. As used herein, a “processor” may be at least one of a central processing unit (CPU), a semiconductor-based microprocessor, a graphics processing unit (GPU), a field-programmable gate array (FPGA) configured to retrieve and execute instructions, other electronic circuitry suitable for the retrieval and execution of instructions stored on a machine-readable storage medium, or a combination thereof. In examples described herein, the at least one processing resource 110 may fetch, decode, and execute instructions stored on storage medium 120 to perform the functionalities described above in relation to instructions stored on storage medium 120. In other examples, the functionalities of any of the instructions of storage medium 120 may be implemented in the form of electronic circuitry, in the form of executable instructions encoded on a machine-readable storage medium, or a combination thereof. The storage medium may be located either in the computing device executing the machine-readable instructions, or remote from but accessible to the computing device (e.g., via a computer network) for execution. In the example of
In other examples, the functionalities described above in relation to instructions of medium 120 may be implemented by one or more engines which may be any combination of hardware and programming to implement the functionalities of the engine(s). In examples described herein, such combinations of hardware and programming may be implemented in a number of different ways. For example, the programming for the engines may be processor executable instructions stored on at least one non-transitory machine-readable storage medium and the hardware for the engines may include at least one processing resource to execute those instructions. In some examples, the hardware may also include other electronic circuitry to at least partially implement at least one of the engine(s). In some examples, the at least one machine-readable storage medium may store instructions that, when executed by the at least one processing resource, at least partially implement some or all of the engine(s). In such examples, a computing device at least partially implementing backup computing device 100 may include the at least one machine-readable storage medium storing the instructions and the at least one processing resource to execute the instructions. In other examples, the engine(s) may be implemented by electronic circuitry.
As used herein, a “machine-readable storage medium” may be any electronic, magnetic, optical, or other physical storage apparatus to contain or store information such as executable instructions, data, and the like. For example, any machine-readable storage medium described herein may be any of Random Access Memory (RAM), volatile memory, non-volatile memory, flash memory, a storage drive (e.g., a hard disk drive (HDD)), a solid state drive, any type of storage disc (e.g., a compact disc, a DVD, etc.), and the like, or a combination thereof. Further, any machine-readable storage medium described herein may be non-transitory. In examples described herein, a machine-readable storage medium or media may be part of an article (or article of manufacture). An article or article of manufacture may refer to any manufactured single component or multiple components.
In some examples, instructions of medium 120 may be part of an installation package that, when installed, may be executed by processing resource 110 to implement the functionalities described above. In such examples, storage medium 120 may be a portable medium, such as a CD, DVD, or flash drive, or a memory maintained by a server from which the installation package can be downloaded and installed. In other examples, instructions of medium 120 may be part of an application, applications, or component(s) already installed on a computing device of computing environment 101 including processing resource 110. In such examples, the storage medium 120 may include memory such as a solid state drive, non-volatile memory device, or the like. In some examples, functionalities described herein in relation to
In the example of
In the example of
Instructions 128, when executed, may receive, from client computing device 150, a request 184 to persistently store the data of the at least one transient write stored in write cache 105 for the block device presentation. In some examples, the request may specify how to persistently store the data stored in write cache 105. In examples described herein, data is stored “persistently” when it is stored in a non-volatile storage medium.
When the request 184 specifies to persistently store the data in new backup objects, instructions 129, when executed, may cause deduplication backup appliance 170 to store second backup objects 250 representing the data stored in write cache 105 for each transient write, such that second backup objects 250 contain the same data representations as first backup objects 200 except where replaced by at least one data representation of data stored in write cache 105 for a transient write (e.g., data 24), as described above in relation to
In the example of
At 315, instructions 122, when executed, may receive, from the client computing device 150, a request to persistently store, in new backup objects, the data stored in the write cache 105 for the transient writes to the block device presentation 130. At 320, in response to the request and based on the data stored in the write cache 105, instructions 122 of backup agent 121, when executed, may cause the deduplication backup appliance 170 to store second backup objects 250 representing the data stored in the write cache 105 for the transient writes, such that the second backup objects 250 contain the same data representations as the first backup objects 200 except where replaced by data representations of the data stored in the write cache 105 for the transient writes, as described above in relation to
Although the flowchart of
In the example of
At 415, instructions 122, when executed, may receive, from the client computing device 150, a request to persistently store, in new backup objects, the data stored in the write cache 105 for the transient writes to the block device presentation 130. In such examples, instructions 122 of backup agent 121, when executed, do not apply, or cause the deduplication backup appliance to apply, the data of any transient write to the first backup objects 200 stored in deduplication backup appliance 170 at any time.
At 420, instructions 122, when executed, may read data fingerprints of each of the first backup objects 200 for the block device presentation from the deduplication backup appliance. At 425, instructions 122, when executed, may determine, based on the data stored in the write cache 105 for the transient writes, which data fingerprints from the first backup objects 200 to replace in the second backup objects 250 with data fingerprints of data stored in the write cache 105, and which data fingerprints to copy from the first backup objects 200 to the second backup objects 250.
At 430, based on the determining at 425, instructions 122, when executed, may copy, to the second backup objects 250, the data fingerprints of the first backup objects 200 that represent data of the block device presentation 130 that is not written by any of the received transient writes. At 435, and also based on the determining at 425, instructions 122, when executed, may store a data fingerprint of data stored in the write cache 105 for one of the received transient writes in each portion of the second backup objects 250 that represents data of the block device presentation 130 that is written by one of the received transient writes.
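The sequence at 420 through 435 can be sketched end to end for the case where a transient write covers only part of a chunk. How partially written chunks are handled is not specified above, so the merge step here (read the existing chunk, overlay the cached bytes, then fingerprint the result) is an assumption, as are the byte-granular cache, the fixed chunk size, SHA-256 fingerprints, and the name `persist_as_new_objects`.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4  # hypothetical fixed chunk size


def fingerprint(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()  # assumed fingerprint function


def persist_as_new_objects(
    first_fingerprints: List[str],
    chunk_store: Dict[str, bytes],
    cached_bytes: Dict[int, int],
) -> List[str]:
    # 420: read the data fingerprints of the first backup objects.
    fingerprints = list(first_fingerprints)

    # 425: determine which chunk positions are touched by any cached write.
    written = {offset // CHUNK_SIZE for offset in cached_bytes}

    second: List[str] = []
    for index, existing_fp in enumerate(fingerprints):
        if index not in written:
            # 430: copy the fingerprint of data not written by any transient write.
            second.append(existing_fp)
            continue
        # Assumed merge step: overlay the cached bytes on the existing chunk.
        merged = bytearray(chunk_store[existing_fp])
        start = index * CHUNK_SIZE
        for i in range(len(merged)):
            if start + i in cached_bytes:
                merged[i] = cached_bytes[start + i]
        new_chunk = bytes(merged)
        new_fp = fingerprint(new_chunk)
        chunk_store.setdefault(new_fp, new_chunk)
        # 435: store the fingerprint of the written data in the written portion.
        second.append(new_fp)
    return second
```

In this sketch the first fingerprint list and its chunks are never modified, consistent with the first backup objects never having transient-write data applied to them.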
Although the flowchart of