Benefit is claimed under 35 U.S.C. 119(a)-(d) to Foreign Application Serial No. 202241002867 filed in India entitled “CONTENT BASED READ CACHE ASSISTED WORKLOAD DEPLOYMENT”, on Jan. 18, 2022, by VMware, Inc., which is herein incorporated in its entirety by reference for all purposes.
The present disclosure relates to computing environments, and more particularly to methods, techniques, and systems for deploying workloads based on a digest file and a content based read cache (CBRC).
An OVA (open virtual appliance) template file is a single archive file in TAR file format that contains files for portability and deployment of virtualization appliances (e.g., virtual machines (VMs), virtual applications, or the like). An OVA template file may include an OVF (open virtualization format) descriptor file, optional manifest and certificate files, optional disk images (such as VMware vmdk files), optional resource files (such as ISOs) and other supporting files, such as a message bundle file. The OVF lies in a repository/storage device in a compressed and sparse format, allowing for faster downloads. The OVF format makes it possible to exchange virtualization appliances across products and platforms.
Deployment of VMs using the OVF is similar to deploying VMs from a template; however, the OVF can be deployed from any filesystem accessible from a client device (e.g., vSphere Client machine), such as CDs, Universal Serial Bus (USB) drives, shared network drives, or remote web servers. In order to deploy a VM from an OVA template file, the files of the OVA template file, such as OVF descriptor, vmdk, manifest, certificate, and message bundle files, must be retrieved from the OVA template file at different stages of a deployment process, so that the OVA template file can be validated, and any needed file is transferred to the destination host computer on which the VM is to be deployed. Thus, if the OVA template file is located at a remote storage location, the entire OVA template file may first have to be downloaded and the files in the OVA template extracted before the extracted files can be used for deployment. Since the OVA template file is typically a large file, this approach would require a significant amount of network transfer time and thus makes the OVF deployment a time-consuming process.
The drawings described herein are for illustration purposes and are not intended to limit the scope of the present subject matter in any way.
Examples described herein may provide an enhanced computer-based and/or network-based method, technique, and system to deploy a workload on a destination host computing system in a computing environment based on a content based read cache (CBRC). Computing environment may be a virtual computing environment (e.g., a cloud computing environment, a virtualized environment, and the like). The virtual computing environment may be a pool or collection of cloud infrastructure resources designed for enterprise needs. The resources may be a processor (e.g., central processing unit (CPU)), memory (e.g., random-access memory (RAM)), storage (e.g., disk space), and networking (e.g., bandwidth). Further, the virtual computing environment may be a virtual representation of the physical data center, complete with servers, storage clusters, and networking components, all of which may reside in virtual space being hosted by one or more physical data centers. The virtual computing environment may include multiple physical computers executing different computing-instances or workloads (e.g., virtual machines (VMs), virtual appliances, template, and the like). The workloads may execute different types of applications.
In such a virtualized environment, virtual desktops may be provided as part of a virtual desktop infrastructure (VDI) or desktop-as-a-service (DAAS) offerings. A virtual desktop may be an interface available to an individual user in the virtualized environment. The virtual desktop is provided using a workload. Further, an experience of using desktop virtualization may be interpreted by users based on the responsiveness of the virtual desktop. In some examples, the responsiveness of the virtual desktops may be affected by multiple factors such as downloading an OVA template file from a remote storage device to a destination host computing system on which the workload is to be deployed. Since the OVA template file is typically a large file, this approach would require a significant amount of network transfer time and thus the OVF deployment is a time-consuming process. Also, a boot-up time of the workload may be increased.
Examples described herein provide a management node to receive a request to deploy a workload on a destination host computing system. In response to receiving the request, the management node may retrieve a digest file associated with a virtualized computing instance file corresponding to a virtual disk of the workload. The digest file may include a plurality of hash values with each hash value corresponding to a data block of a plurality of data blocks in the virtualized computing instance file. Further, the management node may determine whether the plurality of hash values in the digest file match data in a CBRC of the destination host computing system. Furthermore, the management node may request, from the storage device, data blocks corresponding to hash values that do not exist in the CBRC. Also, the management node may deploy the workload on the destination host computing system upon receiving the data blocks corresponding to the hash values that do not exist in the CBRC.
Thus, examples described herein may utilize the digest file and the CBRC at the destination host computing system to reduce the delay in deployment of the workloads using OVF. When the digest file is associated with an OVF file, before the OVF file is deployed/transferred across the wire, the digest file, which is small in size, is transferred first and a check is made to determine whether the content referenced by the digest file is already present in the CBRC. Further, logical block numbers (LBNs) which are not present in the CBRC at the destination host computing system, where the workload is deployed, may be requested from the OVF file. Thus, examples described herein may reduce the network transfer time and also reduce or avoid delay in booting-up the deployed workload.
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present techniques. It will be apparent, however, to one skilled in the art that the present apparatus, devices, and systems may be practiced without these specific details. Reference in the specification to “an example” or similar language means that a particular feature, structure, or characteristic described is included in at least that one example, but not necessarily in other examples.
System Overview and Examples of Operation
As shown in
As shown in
Each host computing system 122A-122N is configured to support a number of workloads, which are VMs (e.g., VM 1-VM N) in this example. The VMs share at least some of the hardware resources of host computing systems 122A-122N, which include system memory, processors, a storage interface, and a network interface. An example system memory may be random access memory (RAM), which is the primary memory of host computing systems 122A-122N. The processor can be any type of a processor, such as a central processing unit (CPU) commonly found in a server. The storage interface may be an interface that allows host computing systems 122A-122N to communicate with storage 126 that is accessible by host computing systems 122A-122N. As an example, the storage interface may be a host bus adapter or a network file system interface. The network interface may be an interface that allows host computing systems 122A-122N to communicate with other devices connected to the network. As an example, the network interface may be a network adapter.
In the example shown in
Similar to any other computing system connected to the network, VM 1-VM N are able to communicate with other computer systems connected to the network using the network interface of respective host computing systems 122A-122N. In addition, VM 1-VM N are able to access storage 126 accessible by host computing systems 122A-122N using the storage interface of respective host computing systems 122A-122N.
Further, management node 102 may operate to monitor and manage host computing systems 122A-122N. In an example, management node 102 may monitor the current configurations of host computing systems 122A-122N and the workloads, e.g., VMs, running on host computing systems 122A-122N. The monitored configurations may include hardware configuration of each of host computing systems 122A-122N, such as CPU type and memory size, and/or software configurations of each of host computing systems 122A-122N, such as operating system (OS) type and installed applications or software programs. The monitored configurations may also include workload hosting information, i.e., which workloads are hosted or running on which host computing systems 122A-122N. The monitored configurations may also include workload information. The workload information may include size of each of workloads, virtualized hardware configuration of each of the workloads, such as virtual CPU type and virtual memory size, software configuration of each of the workloads, such as OS type and installed applications or software programs running on each of the workloads, and virtual storage size for each of the workloads. The workload information may also include resource parameter settings, such as demand, limit, reservation and share values for various resources, e.g., CPU, memory, network bandwidth and storage, which are consumed by the workloads. The “demand,” or current usage, of the workloads for the consumable resources, such as CPU, memory, network, and storage, are measured by host computing systems 122A-122N hosting the workloads and provided to management node 102.
Management node 102 may also perform operations to manage the workloads VM 1 to VM N and host computing systems 122A-122N. Management node 102 may perform various resource management operations for the cluster, including migration of workloads between host computing systems 122A-122N in the cluster for load balancing. Management node 102 may also be configured to manage deployment of workloads in any of host computing systems 122A-122N using templates, such as OVA files, as explained below.
In an example, management node 102 includes a processor 104 and a memory 106 coupled to processor 104. In other examples, management node 102 may be implemented as a software program running on a physical computer, such as host computing system 122A, or a virtual computer, such as VM 1. Example management node 102 may be a VMware® vCenter™ server with at least some of the features available for such server. Memory 106 may include a deployment module 108 and a digest file generation module 128. Deployment module 108 may perform various deployment-related operations so that workloads can be deployed on host computing systems 122A-122N using OVA template files on web servers (e.g., storage device 110), without having to download and store the entire OVA file from storage device 110.
During operation, digest file generation module 128 may divide a virtual disk into a plurality of data blocks having an n-word block size. An example virtual disk may be a virtual machine disk file (VMDK). The VMDK may store contents of the workload's (e.g., VM 1's) hard disk drive. Further, digest file generation module 128 may determine a hash value corresponding to each data block using a secure hash algorithm (SHA). Furthermore, digest file generation module 128 may generate a digest file 112 including a mapping of each data block to the corresponding hash value. The digest file can be generated for the workload's VMDK. The digest file may include hash values for blocks in the VMDK and a key (e.g., name) corresponding to each hash value.
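The digest generation described above can be illustrated with a minimal sketch. It assumes a 4096-byte block size and SHA-1 as the secure hash algorithm; the function name `build_digest` and the entry layout (`lbn`, `hash`) are hypothetical and chosen only for illustration, not taken from any product API.

```python
import hashlib

# Assumed block size in bytes; the description only specifies an "n-word" block size.
BLOCK_SIZE = 4096


def build_digest(disk_bytes: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return one entry per data block: its logical block number and SHA-1 hex digest."""
    digest = []
    for lbn, offset in enumerate(range(0, len(disk_bytes), block_size)):
        block = disk_bytes[offset:offset + block_size]
        # Map the block (identified by its position) to the hash of its content.
        digest.append({"lbn": lbn, "hash": hashlib.sha1(block).hexdigest()})
    return digest
```

In this sketch, two blocks with identical content yield identical hash values, which is what later allows a content based cache to recognize data it already holds.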
During operation, deployment module 108 may receive a request to deploy a workload (e.g., VM 1) on destination host computing system 122A. The workload may include a virtual machine (e.g., VM 1), a virtual appliance, or a virtual application. In another example, the workload may be a virtual desktop infrastructure (VDI). The VDI is a virtualization technology that hosts a desktop operating system on a centralized server in a data center.
In response to receiving the request, deployment module 108 may retrieve from storage device 110 (e.g., a web server), digest file 112 corresponding to the workload VM 1 (e.g., to be deployed). Digest file 112 may include the plurality of hash values with each hash value corresponding to a data block of the plurality of data blocks associated with the virtual disk stored in storage device 110. In an example, storage device 110 is an open virtualization format (OVF) repository locally accessible to destination host computing system 122A or remotely accessible to destination host computing system 122A via a uniform resource locator (URL).
Further, deployment module 108 may determine whether the plurality of hash values in digest file 112 match data in CBRC 124A of destination host computing system 122A. For example, CBRC 124A is a random-access memory based read cache that helps multiple workloads with identical memory contents, based on the granularity chosen while creating digest file 112, to use the physical host RAM effectively, by sharing the identical memory pages (referred to herein as “pages”) across multiple workloads. When CBRC 124A is enabled on the virtual disk, CBRC 124A may build an on-disk hash called a digest file. The digest file may provide a signature of the contents of the memory. For example, CBRC 124A first references the digest file and compares the hash with the in-memory cache. If a page with the content is already present in the memory, CBRC 124A returns the same to the workload.
Furthermore, deployment module 108 may transmit data blocks corresponding to hash values that are not present in CBRC 124A from storage device 110 to destination host computing system 122A. In an example, deployment module 108 may, for each hash value, transmit a data block corresponding to the hash value from storage device 110 to destination host computing system 122A in response to a determination that the hash value in digest file 112 does not exist in CBRC 124A of destination host computing system 122A. In another example, deployment module 108 may, for each hash value, mark the data block corresponding to the hash value as available and refrain from transmitting the data block corresponding to the hash value to destination host computing system 122A in response to a determination that the hash value in digest file 112 exists in CBRC 124A of destination host computing system 122A.
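The per-hash decision described above (transmit the block, or mark it as available) can be sketched as follows. This is an illustrative sketch only: the CBRC is modeled as a plain set of hash strings, and the function name `partition_blocks` is hypothetical.

```python
def partition_blocks(digest_hashes, cbrc_hashes):
    """Split logical block numbers into blocks to transmit and blocks already available.

    digest_hashes: list of hash values, indexed by logical block number.
    cbrc_hashes:   set of hash values currently held in the destination cache (assumed model).
    """
    to_transmit, available = [], []
    for lbn, hash_value in enumerate(digest_hashes):
        if hash_value in cbrc_hashes:
            # Content already cached at the destination: mark as available, do not send.
            available.append(lbn)
        else:
            # Content missing at the destination: this block must be transmitted.
            to_transmit.append(lbn)
    return to_transmit, available
```

For instance, with a digest of `["h0", "h1", "h2", "h0"]` and a cache holding `{"h0", "h2"}`, only block 1 would need to be transmitted.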
Further, deployment module 108 may initiate deployment of workload VM 1 on destination host computing system 122A in response to transmitting the data blocks corresponding to the hash values that are not present in CBRC 124A. Thus, examples described herein may utilize a digest file which is small in size and check if content of the digest file is already present in a CBRC at a destination host computing system, and only request data from the OVF file which is not there in the destination host computing system (e.g., difference of content between the OVA file and the CBRC can be transferred), thereby reducing or avoiding the transfer of the OVA file and also reducing the delay in boot-up of the deployed workload. Further, examples described herein may utilize online caching and offline hashing mechanism to reduce input/output operations per second (IOPS), reduce network latency, and improve performance during booting of the workload.
System 100 may include a user interface 152 for management node 102 to transmit a request for a VM deployment from OVA file to deployment module 108 in response to user inputs to initiate the VM deployment. The VM deployment request may include a Uniform Resource Locator (URL) for the OVA template file and may also include the name or identification of the destination host computing system 122A on which the VM is to be deployed.
Upon receiving the deployment request, deployment module 108 establishes a connection with web server 154 using the URL and reads the digest file from storage device 110 (e.g., at 158). The digest file is a crypto-hash representation of the VMDK. For example, the digest file can be created by dividing the VMDK into 4K word size blocks. For each 4K word size block, a corresponding key of that block may be determined, for instance, using a secure hash algorithm (e.g., SHA-1). Further, the data in the VMDK is mapped to corresponding keys block by block in the digest file. The creation of the digest file may be performed during a power off state of the VM. After the digest file is created successfully in the powered off state, the same VM is then powered on to make use of CBRC 124A. An OVF VM, when deployed, is also deployed with the digest file, or a VM, before being converted to a template/OVF, will be placed with the digest file along with other necessary files (e.g., OVA files) of templates/OVFs.
An example digest file 200 is shown in
Referring to
Further, a bitmap having a bit corresponding to each hash value of the plurality of hash values may be generated. Further, each bit in the bitmap may indicate a validity of the data in a corresponding data block of the virtual disk. Furthermore, the digest file may be stored in the storage device. The storage device can be an open virtualization format (OVF) repository locally accessible to the destination host computing system or remotely accessible to the destination host computing system via a uniform resource locator (URL).
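One possible way to represent such a bitmap is sketched below. The packing order (least significant bit first within each byte) and the helper names `make_bitmap` and `is_valid` are assumptions made for illustration; the description does not specify a layout.

```python
def make_bitmap(valid_flags):
    """Pack one validity bit per data block into a bytearray (LSB first, assumed layout)."""
    bitmap = bytearray((len(valid_flags) + 7) // 8)
    for index, valid in enumerate(valid_flags):
        if valid:
            bitmap[index // 8] |= 1 << (index % 8)
    return bitmap


def is_valid(bitmap, index):
    """Return True when the bit for the given block index is set."""
    return bool(bitmap[index // 8] & (1 << (index % 8)))
```

With one bit per hash value, a reader of the digest file can skip blocks whose data is not valid without inspecting the hash itself.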
At 302, the digest file corresponding to the workload may be retrieved. In an example, the digest file includes a plurality of hash values from a storage device and each hash value corresponds to a data block of a plurality of data blocks associated with the virtual disk stored in the storage device.
At 304, a check may be made to determine whether the plurality of hash values in the digest file match data in a CBRC of a destination host computing system. At 306, data blocks corresponding to hash values that are not present in the CBRC may be obtained from the storage device to store in the destination host computing system (e.g., to store in the CBRC). In an example, for each hash value, a data block corresponding to the hash value is obtained for the destination host computing system in response to a determination that the hash value in the digest file does not exist in the CBRC of the destination host computing system. For example, obtaining the data blocks corresponding to the hash values that are not present in the CBRC includes:
In another example, for each hash value, the data block corresponding to the hash value is marked as available and is not obtained from the storage device, in response to a determination that the hash value in the digest file exists in the CBRC of the destination host computing system.
At 308, the workload may be deployed on the destination host computing system upon obtaining the data blocks corresponding to hash values that are not present in the CBRC. In this example, the workload may be deployed using the obtained data blocks and the data blocks that are marked as available. In an example, deploying the workload in the destination host computing system includes initial placement of the workload on the destination host computing system. In another example, deploying the workload in the destination host computing system includes placing the workload in the destination host computing system while powering on the workload on the destination host computing system. In yet another example, deploying the workload in the destination host computing system includes placing the workload in the destination host computing system during migration of the workload from a source host computing system to the destination host computing system. In another example, deploying the workload in the destination host computing system includes placing the workload on the destination host computing system during cloning of the workload.
At 406, a first hash value of the plurality of hash values may be compared with data in the CBRC. At 408, a check may be made to determine whether the data for the first hash value is available in the CBRC. When the data for the first hash value is not available, a request to send data block corresponding to the first hash value may be sent to the OVF repository, at 410. At 412, the data block corresponding to the first hash value may be received from the OVF repository using a logical block number (LBN) of the first hash value.
When the data for the first hash value is available or upon receiving the data block corresponding to the first hash value, a check is made to determine whether each of the hash values has been compared with data in the CBRC (i.e., end of digest file condition), at 414. At 418, a second hash value of the plurality of hash values may be selected and the processes 406, 408, 410, 412, and 414 are repeated until all the hash values in the digest file are compared with data in the CBRC.
At 416, upon receiving data blocks corresponding to hash values that are not present in the CBRC, the VM may be deployed on the destination host computing system. In this example, the VM is ready for power on or to perform other operations.
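The loop over the digest file described in blocks 406 through 418 can be sketched as follows. The sketch assumes the CBRC is modeled as a dictionary keyed by hash value and that the OVF repository is reached through a caller-supplied function taking a logical block number; `deploy_from_digest` and `fetch_block` are hypothetical names.

```python
def deploy_from_digest(digest, cbrc, fetch_block):
    """Walk the digest, fetching only the blocks whose hash is missing from the cache.

    digest:      list of (lbn, hash_value) pairs read from the digest file.
    cbrc:        dict mapping hash_value -> block data (assumed cache model).
    fetch_block: callable taking an LBN and returning the block data from the repository.
    Returns the list of LBNs that had to be requested from the repository.
    """
    requested = []
    for lbn, hash_value in digest:          # 406/418: take the next hash value
        if hash_value not in cbrc:          # 408: is the data available in the cache?
            cbrc[hash_value] = fetch_block(lbn)  # 410/412: request and receive by LBN
            requested.append(lbn)
    # 414: end of digest reached; 416: the workload can now be deployed.
    return requested
```

A side effect of storing each received block in the cache is that a later block with the same hash value is not requested again within the same pass.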
The processes depicted in
Computer-readable storage medium 504 may store instructions 506, 508, 510, 512, and 514. Instructions 506 may be executed by processor 502 to receive a request to deploy the workload on destination host computing system 500. Instructions 508 may be executed by processor 502 to retrieve a digest file associated with a virtualized computing instance file corresponding to a virtual disk of the workload in response to receiving the request. The digest file may include a plurality of hash values with each hash value corresponding to a data block of a plurality of data blocks in the virtualized computing instance file. Further, the plurality of data blocks relating to the workload may be an open virtualization format (OVF) package or open virtualization appliance (OVA) file that packages the workload for deployment.
Instructions 510 may be executed by processor 502 to determine whether the plurality of hash values in the digest file match data in a CBRC of destination host computing system 500. Instructions 512 may be executed by processor 502 to request, from the storage device, data blocks corresponding to hash values that do not exist in the CBRC. In an example, instructions 512 to request the data blocks corresponding to the hash values that do not exist in the CBRC include instructions to, for each hash value, obtain a data block corresponding to the hash value from the storage device in response to a determination that the hash value in the digest file does not exist in the CBRC at the destination host computing system. In another example, for each hash value, the data block corresponding to the hash value is marked as available and may not be transmitted to destination host computing system 500 in response to a determination that the hash value in the digest file exists in the CBRC of destination host computing system 500.
Instructions 514 may be executed by processor 502 to deploy the workload on destination host computing system 500 upon receiving the data blocks corresponding to the hash values that do not exist in the CBRC. In an example, instructions 514 to deploy the workload on destination host computing system 500 include instructions to migrate or clone the workload running on a source host computing system to destination host computing system 500. In this example, the digest file may include the hash values of the CBRC of the source host computing system when the request is to migrate or clone the workload.
Some or all of the system components and/or data structures may also be stored as contents (e.g., as executable or other computer-readable software instructions or structured data) on a non-transitory computer-readable medium (e.g., as a hard disk; a computer memory; a computer network or cellular wireless network or other data transmission medium; or a portable media article to be read by an appropriate drive or via an appropriate connection, such as a DVD or flash memory device) so as to enable or configure the computer-readable medium and/or one or more host computing systems or devices to execute or otherwise use or provide the contents to perform at least some of the described techniques.
It may be noted that the above-described examples of the present solution are for the purpose of illustration only. Although the solution has been described in conjunction with a specific embodiment thereof, numerous modifications may be possible without materially departing from the teachings and advantages of the subject matter described herein. Other substitutions, modifications and changes may be made without departing from the spirit of the present solution. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive.
The terms “include,” “have,” and variations thereof, as used herein, have the same meaning as the term “comprise” or appropriate variation thereof. Furthermore, the term “based on”, as used herein, means “based at least in part on.” Thus, a feature that is described as based on some stimulus can be based on the stimulus or a combination of stimuli including the stimulus.
The present description has been shown and described with reference to the foregoing examples. It is understood, however, that other forms, details, and examples can be made without departing from the spirit and scope of the present subject matter that is defined in the following claims.
Number | Date | Country | Kind |
---|---|---|---|
202241002867 | Jan 2022 | IN | national |