A data center is a facility that houses servers, data storage devices, and/or other associated components such as backup power supplies, redundant data communications connections, environmental controls such as air conditioning and/or fire suppression, and/or various security systems. A data center may be maintained by an information technology (IT) service provider. An enterprise may purchase data storage and/or data processing services from the provider in order to run applications that handle the enterprise's core business and operational data. The applications may be proprietary and used exclusively by the enterprise or made available through a network for anyone to access and use.
Virtual computing instances (VCIs) have been introduced to lower data center capital investment in facilities and operational expenses and to reduce energy consumption. A VCI is a software implementation of a computer that executes application software analogously to a physical computer. VCIs have the advantage of not being bound to physical resources, which allows VCIs to be moved around and scaled to meet changing demands of an enterprise without affecting the use of the enterprise's applications. In a software defined data center, storage resources may be allocated to VCIs in various ways, such as through network attached storage (NAS), a storage area network (SAN) such as Fibre Channel and/or Internet Small Computer System Interface (iSCSI), a virtual SAN, and/or raw device mappings, among others.
Snapshots and clones may be utilized in a software defined data center to provide backups and/or disaster recovery. In some instances, as clones are created over time, a chain of clones may deepen to such an extent that read performance degrades.
The term “virtual computing instance” (VCI) refers generally to an isolated user space instance, which can be executed within a virtualized environment. Other technologies aside from hardware virtualization can provide isolated user space instances, also referred to as data compute nodes. Data compute nodes may include non-virtualized physical hosts, VCIs, containers that run on top of a host operating system without a hypervisor or separate operating system, and/or hypervisor kernel network interface modules, among others. Hypervisor kernel network interface modules are non-VCI data compute nodes that include a network stack with a hypervisor kernel network interface and receive/transmit threads.
VCIs, in some embodiments, operate with their own guest operating systems on a host using resources of the host virtualized by virtualization software (e.g., a hypervisor, virtual machine monitor, etc.). The tenant (i.e., the owner of the VCI) can choose which applications to operate on top of the guest operating system. Some containers, on the other hand, are constructs that run on top of a host operating system without the need for a hypervisor or separate guest operating system. The host operating system can use name spaces to isolate the containers from each other and therefore can provide operating-system level segregation of the different groups of applications that operate within different containers. This segregation is akin to the VCI segregation that may be offered in hypervisor-virtualized environments that virtualize system hardware, and thus can be viewed as a form of virtualization that isolates different groups of applications that operate in different containers. Such containers may be more lightweight than VCIs.
While the specification refers generally to VCIs, the examples given could be any type of data compute node, including physical hosts, VCIs, non-VCI containers, and hypervisor kernel network interface modules. Embodiments of the present disclosure can include combinations of different types of data compute nodes.
As used herein with respect to VCIs, a “disk” is a representation of memory resources (e.g., memory resources 110 illustrated in
A VCI snapshot (referred to herein simply as “snapshot”) can preserve the state of a VCI so that it can be reverted to at a later point in time. The snapshot can include memory as well. In some embodiments, a snapshot includes secondary storage, while primary storage is optionally included with the snapshot. A snapshot can store changes from a parent snapshot (e.g., without storing an entire copy of the parent snapshot). A clone VCI (referred to herein simply as “clone”) is a copy of an existing VCI. A clone can be created from a snapshot. A clone can start a chain of snapshots.
As referred to herein, a snapshot tree can be represented as a tree of disks and can include both snapshots and clones. The snapshot tree can become complex as the clone levels increase (e.g., the quantity of clones increases), and supporting clones may become computationally expensive as the snapshot tree grows. As previously discussed, a snapshot stores only changes from a previous snapshot rather than an entire copy of the previous snapshot. Thus, each snapshot may have its own unique logical map that includes tuples mapping logical block addresses to physical block addresses. An example tuple may be “L10→P100, N20,” where a logical block address, L10, maps to a physical block address, P100, with a total number of blocks, N20. Because snapshots store only changes, if a logical address was not written in a given snapshot, its logical map may make no reference to it.
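By way of illustration only, the per-snapshot logical map described above might be sketched in Python as follows. The names Extent and LogicalMap are hypothetical and do not appear in the disclosure, and the sketch assumes that a tuple such as “L10→P100, N20” covers N contiguous blocks so that logical block L10+k maps to physical block P100+k.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Extent:
    """One tuple of a logical map, e.g. "L10 -> P100, N20" (hypothetical name)."""
    lba: int      # first logical block address (L10)
    pba: int      # first physical block address (P100)
    nblocks: int  # number of blocks covered (N20)


class LogicalMap:
    """Logical map of a single snapshot; holds only what that snapshot wrote."""

    def __init__(self, extents: Iterable[Extent] = ()) -> None:
        self.extents: List[Extent] = list(extents)

    def lookup(self, lba: int) -> Optional[int]:
        """Return the physical block for lba, or None if never written here."""
        for e in self.extents:
            if e.lba <= lba < e.lba + e.nblocks:
                # Assumption: blocks inside an extent are contiguous.
                return e.pba + (lba - e.lba)
        return None


# A snapshot that wrote only logical blocks 10..29.
snap = LogicalMap([Extent(lba=10, pba=100, nblocks=20)])
```

In this sketch, a lookup that returns None corresponds to a logical address that was not written in the snapshot, in which case the logical map makes no reference to it.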
In operation, then, when reading a particular logical address, the current snapshot is initially consulted. If, however, the particular address to be read was never written in the current snapshot, the chain of the current snapshot is consulted in reverse chronological order beginning with the snapshot previous to the current snapshot. If the particular address to be read was never written in any snapshots of the current snapshot chain, a snapshot (or snapshot chain) at a higher level in the snapshot tree would be consulted next. This path from the current snapshot, through its previous snapshots, to a root node (snapshot) of the tree can be referred to as a “diskchain.” In an example snapshot tree with 10 levels of clones, it may take up to 10 lookups to find a desired tuple. The performance of deeper chains such as this may be undesirably slow.
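The read path just described can be sketched, under simplifying assumptions, as a walk over the diskchain. In the hypothetical Python below, each snapshot is modeled as a dictionary from logical block address to physical block address, and the diskchain is a list ordered from the current snapshot to the root; neither representation is taken from the disclosure.

```python
from typing import Dict, List, Optional, Tuple

# Assumption: a snapshot is a dict {logical block: physical block} containing
# only the blocks written in that snapshot.
Snapshot = Dict[int, int]


def read_through_diskchain(
    diskchain: List[Snapshot], lba: int
) -> Tuple[Optional[int], int]:
    """Walk the diskchain from the current snapshot toward the root.

    Returns (physical block or None, number of snapshots consulted).
    """
    lookups = 0
    for snap in diskchain:  # newest first, root last
        lookups += 1
        if lba in snap:
            return snap[lba], lookups
    return None, lookups


# Four-deep chain: current snapshot first, root snapshot last.
chain = [{5: 500}, {}, {7: 700}, {5: 50, 9: 900}]
```

With this chain, a read of block 5 is satisfied by the current snapshot after one lookup, whereas a read of block 9 consults all four snapshots before reaching the root. The number of lookups therefore grows with the depth of the diskchain.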
Embodiments of the present disclosure can address these issues through the creation of a particular kind of snapshot, which is referred to herein as a “consolidated snapshot,” for certain clones. A consolidated snapshot is placed at the beginning of a snapshot chain and includes tuples (e.g., all tuples) in its diskchain. In some embodiments, any read will complete at a consolidated snapshot without requiring consultation of snapshots or snapshot chains at higher levels in the diskchain because the consolidated snapshot includes all tuples from all previous snapshots in the diskchain.
A consolidated snapshot can bound the number of lookups used to find a desired tuple, even in deep chains. As a result, performance degradation resulting from deep diskchain lookups can be diminished. Consolidated snapshots can be created in the background without affecting regular input/output and can be put into effect when their creation is complete. Consolidated snapshots may be created in multiple circumstances. In some embodiments, a consolidated snapshot is created when a threshold clone depth (e.g., 10 clones) is reached. In some embodiments, a consolidated snapshot is created when read requests associated with a clone generate a threshold quantity, or proportion, of reads to previous clones in the diskchain (e.g., reads to “old” data). In such embodiments, a consolidated snapshot may be created even if a threshold clone depth has not yet been reached. The trigger(s) for the creation of consolidated snapshots may be configurable and may depend on system operating parameters, such as central processing unit (CPU) speed, frequency of read requests, locality of read data, and/or performance requirements, for instance.
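As one possible illustration of configurable triggers, the two conditions described above (a threshold clone depth and a threshold proportion of reads to previous clones) might be combined as in the following Python sketch. The class name, field names, and default values are hypothetical, and the min_reads guard is an added assumption intended to avoid acting on a very small sample of reads.

```python
from dataclasses import dataclass


@dataclass
class ConsolidationPolicy:
    """Hypothetical, configurable triggers for creating a consolidated snapshot."""
    max_clone_depth: int = 10             # e.g., 10 clones
    max_ancestor_read_ratio: float = 0.5  # proportion of reads to "old" data
    min_reads: int = 100                  # assumption: ignore tiny samples

    def should_consolidate(
        self, clone_depth: int, total_reads: int, ancestor_reads: int
    ) -> bool:
        # Trigger 1: the threshold clone depth has been reached.
        if clone_depth >= self.max_clone_depth:
            return True
        # Trigger 2: enough reads ascend to previous clones, even though
        # the threshold clone depth has not yet been reached.
        if total_reads >= self.min_reads:
            ratio = ancestor_reads / total_reads
            return ratio >= self.max_ancestor_read_ratio
        return False


policy = ConsolidationPolicy()
```

In such a sketch, the threshold values could be tuned according to operating parameters such as CPU speed or performance requirements, rather than fixed as shown.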
The figures herein follow a numbering convention in which the first digit or digits correspond to the drawing figure number and the remaining digits identify an element or component in the drawing. Similar elements or components between different figures may be identified by the use of similar digits. For example, 114 may reference element “14” in
The host 102 can incorporate a hypervisor 104 that can execute a number of virtual computing instances 106-1, 106-2, . . . , 106-N (referred to generally herein as “VCIs 106”). The VCIs can be provisioned with processing resources 108 and/or memory resources 110 and can communicate via the network interface 112. The processing resources 108 and the memory resources 110 provisioned to the VCIs can be local and/or remote to the host 102. For example, in a software defined data center, the VCIs 106 can be provisioned with resources that are generally available to the software defined data center and not tied to any particular hardware device. By way of example, the memory resources 110 can include volatile and/or non-volatile memory available to the VCIs 106. The VCIs 106 can be moved to different hosts (not specifically illustrated), such that a different hypervisor manages the VCIs 106. The host 102 can be in communication with a consolidated snapshot system 114. An example of the consolidated snapshot system is illustrated and described in more detail with respect to
As shown in the example illustrated in
When the tree 216 exceeds a threshold, a consolidated snapshot 230 (labeled “s0” in
As shown in
The consolidated snapshot can be stored as metadata associated with the storage of other data corresponding to the clone for which the consolidated snapshot is created. The consolidated snapshot 230 can include information of each logical map from the snapshots in the diskchain of clone J. The consolidated snapshot 230 can include tuples from the diskchain of the clone J. In some embodiments, the consolidated snapshot 230 includes all tuples from the diskchain of the clone J. Stated differently, the consolidated snapshot 230 can include tuples from each of snapshots 12, 8, 7, 3, 2, 1. The consolidated snapshot 230 can be created in a background process without affecting input/output operations of the software defined data center and/or the corresponding VCI.
The number of engines can include a combination of hardware and program instructions that is configured to perform a number of functions described herein. The program instructions (e.g., software, firmware, etc.) can be stored in a memory resource (e.g., machine-readable medium) and/or can be provided as hard-wired program instructions (e.g., logic). Hard-wired program instructions can be considered as both program instructions and hardware.
In some embodiments, the clone engine 356 can include a combination of hardware and program instructions that is configured to create a clone of a VCI in a snapshot tree provided by a software defined data center. In some embodiments, the snapshot engine 358 can include a combination of hardware and program instructions that is configured to take a snapshot of the clone. In some embodiments, the consolidated snapshot engine 360 can include a combination of hardware and program instructions that is configured to create a consolidated snapshot including tuples from a diskchain of the clone in the snapshot tree responsive to a determination that the snapshot tree exceeds a threshold.
The creation of the consolidated snapshot can be carried out by issuing a read request (e.g., a “dummy” read request) in order to receive a logical-to-physical address mapping of all tuples from the diskchain. For instance, the consolidated snapshot engine 360 can be configured to issue a read request for a current logical address and a number of addresses (e.g., a number of blocks) corresponding to the VCI and receive, responsive to the read request, a logical-to-physical address mapping associated with the VCI, wherein the logical-to-physical address mapping includes all tuples from the diskchain of the clone. In some embodiments, the consolidated snapshot engine 360 can include a combination of hardware and program instructions that is configured to add the consolidated snapshot to the snapshot tree in a snapshot chain of the clone ahead of the snapshot of the clone.
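The dummy-read approach described above can be sketched, in simplified form, as resolving every logical address in a range through the diskchain and recording the resulting logical-to-physical mapping. The Python below is a hypothetical sketch only: snapshots are modeled as dictionaries, the diskchain is ordered from newest to root, and the function names are not taken from the disclosure.

```python
from typing import Dict, List, Optional

# Assumption: a snapshot is a dict {logical block: physical block}.
Snapshot = Dict[int, int]


def resolve(diskchain: List[Snapshot], lba: int) -> Optional[int]:
    """Resolve one logical block as an ordinary read would (newest wins)."""
    for snap in diskchain:
        if lba in snap:
            return snap[lba]
    return None


def build_consolidated_snapshot(
    diskchain: List[Snapshot], start_lba: int, nblocks: int
) -> Snapshot:
    """Issue a "dummy" read over [start_lba, start_lba + nblocks) and keep
    the resulting logical-to-physical mapping as the consolidated snapshot."""
    consolidated: Snapshot = {}
    for lba in range(start_lba, start_lba + nblocks):
        pba = resolve(diskchain, lba)
        if pba is not None:  # never-written blocks stay unmapped
            consolidated[lba] = pba
    return consolidated


# Diskchain ordered newest first; block 1 was overwritten in the newest snapshot.
diskchain = [{1: 12}, {2: 8, 1: 80}, {3: 3}]
consolidated = build_consolidated_snapshot(diskchain, start_lba=0, nblocks=4)
```

Because each address is resolved in the same order as a regular read, the most recent write to an address is the one preserved in the consolidated mapping.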
In some embodiments, the snapshot engine 358 can include a combination of hardware and program instructions that is configured to subsequently take a plurality of snapshots, wherein each of the plurality of snapshots is a snapshot of the clone at a respective time instance, and the consolidated snapshot engine 360 can include a combination of hardware and program instructions that is configured to add the consolidated snapshot to a beginning of a snapshot chain of the clone that includes the snapshot of the clone and the plurality of snapshots.
Though not specifically shown in
Memory resources 410 can be non-transitory and can include volatile and/or non-volatile memory. Volatile memory can include memory that depends upon power to store information, such as various types of dynamic random access memory (DRAM) among others. Non-volatile memory can include memory that does not depend upon power to store information. Examples of non-volatile memory can include solid state media such as flash memory, electrically erasable programmable read-only memory (EEPROM), phase change memory (PCM), 3D cross-point, ferroelectric transistor random access memory (FeTRAM), ferroelectric random access memory (FeRAM), magnetoresistive random access memory (MRAM), Spin Transfer Torque (STT)-MRAM, conductive bridging RAM (CBRAM), resistive random access memory (RRAM), oxide based RRAM (OxRAM), negative-or (NOR) flash memory, magnetic memory, optical memory, a solid state drive (SSD), and/or a Non-Volatile Memory Express (NVMe) device, as well as other types of machine-readable media.
The processing resources 408 can be coupled to the memory resources 410 via a communication path 464. The communication path 464 can be local or remote to the machine 462. Examples of a local communication path 464 can include an electronic bus internal to a machine, where the memory resources 410 are in communication with the processing resources 408 via the electronic bus. Examples of such electronic buses can include Industry Standard Architecture (ISA), Peripheral Component Interconnect (PCI), Advanced Technology Attachment (ATA), Small Computer System Interface (SCSI), Universal Serial Bus (USB), among other types of electronic buses and variants thereof. The communication path 464 can be such that the memory resources 410 are remote from the processing resources 408, such as in a network connection between the memory resources 410 and the processing resources 408. That is, the communication path 464 can be a network connection. Examples of such a network connection can include a local area network (LAN), wide area network (WAN), personal area network (PAN), and the Internet, among others.
As shown in
Each of the number of modules 456, 460 can include program instructions and/or a combination of hardware and program instructions that, when executed by a processing resource 408, can function as a corresponding engine as described with respect to
The machine 462 can include a clone module 456, which can include instructions to create a clone of a VCI in a snapshot tree provided by a software defined data center. The machine 462 can include a consolidated snapshot module 460, which can include instructions to create the consolidated snapshot including all tuples from the diskchain of the clone in the snapshot tree.
An example of executable instructions to create the consolidated snapshot is:
The consolidated snapshot module 460 can include instructions to determine that the snapshot tree exceeds the threshold, including instructions to determine that the creation of the clone caused the snapshot tree to have a clone depth that exceeds the threshold. The consolidated snapshot module 460 can also include instructions to determine that the snapshot tree exceeds the threshold, including instructions to count read requests associated with the clone that cause ascension through the diskchain in order to satisfy the read requests and determine that the count exceeds the threshold.
In some embodiments, a threshold, as described herein, can be set and/or determined based on a speed of a processing resource configured to execute the VCI. In some embodiments, a threshold, as described herein, can be set and/or determined based on a quantity of reads of the tuples from the diskchain of the clone. In some embodiments, a threshold, as described herein, can be set and/or determined based on a performance target associated with the software defined data center. In some embodiments, a threshold, as described herein, can be set and/or determined based on a locality of the tuples from the diskchain of the clone.
The consolidated snapshot module 460 can include instructions to receive a read request for a logical address corresponding to the VCI after the creation of the consolidated snapshot and search the consolidated snapshot for a tuple corresponding to the logical address responsive to a determination that the logical address is not in any snapshots of a snapshot chain of the clone.
In some embodiments, a method for supporting clones with consolidated snapshots can include adding the consolidated snapshot to a beginning of a snapshot chain of the clone that includes the snapshot of the clone. As shown in the example previously discussed in connection with
In some embodiments, a method for supporting clones with consolidated snapshots can include maintaining a clone table that associates each of the plurality of clones in the snapshot tree with a respective clone depth identifier. In some embodiments, such an identifier may be represented by the levels illustrated in
In some embodiments, a method for supporting clones with consolidated snapshots can include determining that the snapshot tree exceeds the threshold based on a value of the respective clone depth identifier associated with the clone exceeding the threshold. In some embodiments, the clone depth identifier can exceed the threshold if a numerical value of the clone depth identifier exceeds a numerical threshold. In some embodiments, the clone depth identifier can exceed the threshold if the clone depth identifier is determined to be a particular identifier (e.g., “J” in the example illustrated in
In some embodiments, a method for supporting clones with consolidated snapshots can include using a counter to determine a quantity of read requests for a logical address corresponding to the VCI that cause ascension through the diskchain of the clone and determining that the snapshot tree exceeds the threshold based on the quantity of read requests exceeding the threshold. As previously discussed, exceeding the threshold may include read requests associated with a clone generating a threshold quantity of reads to previous clones in the diskchain (e.g., reads to “old” data).
In some embodiments, a method for supporting clones with consolidated snapshots can include creating a child clone of the clone before creating the consolidated snapshot, adding an indication of the child clone and a corresponding identifier of the child clone to the clone table before creating the consolidated snapshot, and modifying the identifier of the child clone in the clone table responsive to the creation of the consolidated snapshot. A child clone of a clone for which a consolidated snapshot was created can be updated in the clone table such that the threshold is “reset.” For example, if a consolidated snapshot is created for a clone at a tenth level, an identifier associated with its child clone that indicates an eleventh level can be modified to indicate a first level. Therefore, the count to a next threshold level can begin at the child clone.
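The clone table and the “reset” of a child clone's identifier described above might be sketched as follows. The Python below is a hypothetical illustration: the class and method names are not from the disclosure, depth identifiers are modeled as integers, and the threshold is treated as reached when a clone's depth equals or exceeds it.

```python
from typing import Dict, Optional


class CloneTable:
    """Hypothetical clone table mapping each clone to a clone depth identifier."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.depth: Dict[str, int] = {}
        self.parent: Dict[str, Optional[str]] = {}

    def add_clone(self, clone_id: str, parent_id: Optional[str] = None) -> None:
        """Record a clone one level below its parent (level 1 if no parent)."""
        self.parent[clone_id] = parent_id
        self.depth[clone_id] = 1 if parent_id is None else self.depth[parent_id] + 1

    def reaches_threshold(self, clone_id: str) -> bool:
        return self.depth[clone_id] >= self.threshold

    def on_consolidated(self, clone_id: str) -> None:
        """After a consolidated snapshot is created for clone_id, reset its
        existing child clones to the first level so counting starts over."""
        for child, parent in self.parent.items():
            if parent == clone_id:
                self.depth[child] = 1


table = CloneTable(threshold=10)
table.add_clone("c1")
for i in range(2, 12):  # c2 .. c11, each a child of the previous clone
    table.add_clone(f"c{i}", parent_id=f"c{i - 1}")
```

In this sketch, the clone at the tenth level reaches the threshold. Once its consolidated snapshot exists, calling on_consolidated resets the child at the eleventh level to the first level, and clones subsequently created beneath that child are counted from the reset value. Deeper descendants that already exist at the time of the reset are not renumbered here, which is a simplification.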
In some embodiments, a method for supporting clones with consolidated snapshots can include receiving a read request for a logical address corresponding to the VCI subsequent to creating the clone and searching the consolidated snapshot for a tuple corresponding to the logical address if the logical address was not written in any snapshots of the snapshot chain of the clone. When a read request is received, embodiments herein search the current snapshot first. If the logical address was not written in the current snapshot, the previous snapshot can be searched. That process can continue towards the base of the tree until the consolidated snapshot is reached. Because the consolidated snapshot includes tuples from the diskchain (e.g., from the entire diskchain), the logical address can be read from the consolidated snapshot.
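The effect of the consolidated snapshot on the read path can be illustrated with the following hypothetical Python sketch. It assumes that snapshots are dictionaries, that the clone's own snapshot chain is ordered newest first, and that the consolidated snapshot, when present, holds all tuples of the ancestor snapshots; the function name and parameters are illustrative only.

```python
from typing import Dict, List, Optional, Tuple

# Assumption: a snapshot is a dict {logical block: physical block}.
Snapshot = Dict[int, int]


def read(
    clone_chain: List[Snapshot],
    consolidated: Optional[Snapshot],
    ancestors: List[Snapshot],
    lba: int,
) -> Tuple[Optional[int], int]:
    """Return (physical block or None, number of snapshots consulted)."""
    lookups = 0
    # 1. Search the clone's own snapshots, current snapshot first.
    for snap in clone_chain:
        lookups += 1
        if lba in snap:
            return snap[lba], lookups
    # 2. With a consolidated snapshot, the read completes here; the
    #    ancestor snapshots are never consulted.
    if consolidated is not None:
        lookups += 1
        return consolidated.get(lba), lookups
    # 3. Without one, ascend through the rest of the diskchain.
    for snap in ancestors:
        lookups += 1
        if lba in snap:
            return snap[lba], lookups
    return None, lookups


ancestors = [{3: 30}, {2: 20}, {1: 10}]  # older levels, root last
consolidated = {1: 10, 2: 20, 3: 30}     # all tuples of the ancestors
clone_chain = [{4: 40}, {}]              # the clone's own snapshots
```

With these inputs, a read of block 1 consults three snapshots when the consolidated snapshot is present and five when it is absent. The number of lookups is bounded by the length of the clone's own snapshot chain plus one, regardless of how many ancestor levels exist.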
At 676, the method can include creating a consolidated snapshot including tuples from a diskchain of the clone in the snapshot tree responsive to a determination that the snapshot tree exceeds a threshold.
The present disclosure is not limited to particular devices or methods, which may vary. The terminology used herein is for the purpose of describing particular embodiments, and is not intended to be limiting. As used herein, the singular forms “a”, “an”, and “the” include singular and plural referents unless the content clearly dictates otherwise. Furthermore, the words “can” and “may” are used throughout this application in a permissive sense (i.e., having the potential to, being able to), not in a mandatory sense (i.e., must). The term “include,” and derivations thereof, mean “including, but not limited to.”
Although specific embodiments have been described above, these embodiments are not intended to limit the scope of the present disclosure, even where only a single embodiment is described with respect to a particular feature. Examples of features provided in the disclosure are intended to be illustrative rather than restrictive unless stated otherwise. The above description is intended to cover such alternatives, modifications, and equivalents as would be apparent to a person skilled in the art having the benefit of this disclosure.
The scope of the present disclosure includes any feature or combination of features disclosed herein (either explicitly or implicitly), or any generalization thereof, whether or not it mitigates any or all of the problems addressed herein. Various advantages of the present disclosure have been described herein, but embodiments may provide some, all, or none of such advantages, or may provide other advantages.
In the foregoing Detailed Description, some features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the disclosed embodiments of the present disclosure have to use more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment.