Benefit is claimed under 35 U.S.C. 119(a)-(d) to Foreign Application Serial No. 202141055478 filed in India entitled “SYSTEM AND METHOD FOR UPGRADING A MANAGEMENT COMPONENT OF A COMPUTING ENVIRONMENT USING HIGH AVAILABILITY FEATURES”, on Nov. 30, 2021, by VMware, Inc., which is herein incorporated in its entirety by reference for all purposes.
Various computing architectures can be deployed in a public cloud as a cloud service. As an example, one or more software-defined data centers (SDDCs) may be deployed in a dedicated private cloud environment of a public cloud for an entity or customer via a cloud service provider, where each SDDC may include one or more clusters of host computers. Such dedicated private cloud environments may be managed by a cloud service provider, which uses a public cloud operated by a public cloud provider.
In a dedicated private cloud environment, there may be multiple management components that support the virtual infrastructure of the environment. For example, a dedicated private cloud environment may include a virtualization manager that manages a cluster of host computers and a software-defined network (SDN) manager that manages SDN components in the dedicated private cloud environment to provide logical networking services. If a management component needs to be upgraded, any service interruption due to the upgrade process should be minimized. In addition, any compute resources needed for the upgrade process should also be minimized.
A system and method for upgrading a source management component of a computing environment uses a target management component that is deployed in a host computer of the computing environment. The source and target management components are set as a primary-secondary management pair for a high availability system such that the source management component is set as a primary protected component and the target management component is set as a secondary unprotected component. After services of the source management component are stopped and the target management component is powered on, the primary-secondary management pair is modified to switch the source management component to the secondary unprotected component and the target management component to the primary protected component. Services of the target management component are then started to take over responsibilities of the source management component.
A computer-implemented method for upgrading a source management component of a computing environment in accordance with an embodiment of the invention includes deploying a target management component in a host computer of the computing environment, setting the source and target management components as a primary-secondary management pair for a high availability system such that the source management component is set as a primary protected component for the high availability system and the target management component is set as a secondary unprotected component for the high availability system, after setting the source and target management components as the primary-secondary management pair, stopping services of the source management component, powering on the target management component, after powering on the target management component, modifying the primary-secondary management pair to switch the source management component to the secondary unprotected component and the target management component to the primary protected component, and after modifying the primary-secondary management pair, starting services of the target management component to take over responsibilities of the source management component. In some embodiments, the steps of this method are performed when program instructions contained in a non-transitory computer-readable storage medium are executed by one or more processors.
A system in accordance with an embodiment of the invention comprises memory and at least one processor configured to deploy a target management component in a host computer of the computing environment, wherein the computing environment includes a source management component, set the source and target management components as a primary-secondary management pair for a high availability system such that the source management component is set as a primary protected component for the high availability system and the target management component is set as a secondary unprotected component for the high availability system, after the source and target management components are set as the primary-secondary management pair, stop services of the source management component, power on the target management component, after the target management component is powered on, modify the primary-secondary management pair to switch the source management component to the secondary unprotected component and the target management component to the primary protected component, and after the primary-secondary management pair is modified, start services of the target management component to take over responsibilities of the source management component.
Other aspects and advantages of embodiments of the present invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrated by way of example of the principles of the invention.
The embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which:
Throughout the description, similar reference numbers may be used to identify similar elements.
It will be readily understood that the components of the embodiments as generally described herein and illustrated in the appended figures could be arranged and designed in a wide variety of different configurations. Thus, the following more detailed description of various embodiments, as represented in the figures, is not intended to limit the scope of the present disclosure, but is merely representative of various embodiments. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The present invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described embodiments are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by this detailed description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.
Reference throughout this specification to features, advantages, or similar language does not imply that all of the features and advantages that may be realized with the present invention should be or are in any single embodiment of the invention. Rather, language referring to the features and advantages is understood to mean that a specific feature, advantage, or characteristic described in connection with an embodiment is included in at least one embodiment of the present invention. Thus, discussions of the features and advantages, and similar language, throughout this specification may, but do not necessarily, refer to the same embodiment.
Furthermore, the described features, advantages, and characteristics of the invention may be combined in any suitable manner in one or more embodiments. One skilled in the relevant art will recognize, in light of the description herein, that the invention can be practiced without one or more of the specific features or advantages of a particular embodiment. In other instances, additional features and advantages may be recognized in certain embodiments that may not be present in all embodiments of the invention.
Reference throughout this specification to “one embodiment,” “an embodiment,” or similar language means that a particular feature, structure, or characteristic described in connection with the indicated embodiment is included in at least one embodiment of the present invention. Thus, the phrases “in one embodiment,” “in an embodiment,” and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.
Turning now to the figures, a computing environment 100 in accordance with an embodiment of the invention is illustrated.
As shown, the computing environment 100 includes a cluster 102 of host computers 104 ("hosts"), each of which includes a hardware platform 106 with physical resources, such as processors, memory, storage and network interfaces.
Each host 104 may be configured to provide a virtualization layer that abstracts processor, memory, storage and networking resources of the hardware platform 106 into virtual computing instances (VCIs), e.g., virtual machines (VMs) 116, that run concurrently on the same host. In the illustrated embodiment, the VMs 116 run on top of a software interface layer, which is referred to herein as a hypervisor 118, that enables sharing of the hardware resources of the host by the VMs. One example of the hypervisor 118 that may be used in an embodiment described herein is a VMware ESXi™ hypervisor provided as part of the VMware vSphere® solution made commercially available from VMware, Inc. The hypervisor 118 may run on top of the operating system of the host or directly on hardware components of the host. For other types of VCIs, the host may include other virtualization software platforms to support those VCIs, such as the Docker virtualization platform to support "containers". Although embodiments of the invention may involve other types of VCIs, various embodiments of the invention are described herein as involving VMs.
In the illustrated embodiment, the hypervisor 118 includes a logical network (LN) agent 120, which operates to provide logical networking capabilities, also referred to as "software-defined networking" (SDN). Each logical network may include software managed and implemented network services, such as bridging, L3 routing, L2 switching, network address translation (NAT), and firewall capabilities, to support one or more logical overlay networks in the computing environment 100. The logical network agent 120 may receive configuration information from a logical network manager 124 (which may include a control plane cluster) and, based on this information, populate forwarding, firewall and/or other action tables for dropping or directing packets between the VMs 116 in the host 104, other VMs on other hosts, and/or other devices outside of the computing environment 100. Collectively, the logical network agent 120 and the logical network agents on other hosts implement, according to their forwarding/routing tables, isolated overlay networks that can connect arbitrarily selected VMs with each other. Each VM may be arbitrarily assigned to a particular logical network in a manner that decouples the overlay network topology from the underlying physical network. Generally, this is achieved by encapsulating packets at a source host and decapsulating packets at a destination host so that VMs on the source and destination can communicate without regard to the underlying physical network topology. In a particular implementation, the logical network agent 120 may include a Virtual Extensible Local Area Network (VXLAN) Tunnel End Point or VTEP that operates to execute operations with respect to encapsulation and decapsulation of packets to support a VXLAN backed overlay network. In alternate implementations, VTEPs support other tunneling protocols, such as stateless transport tunneling (STT), Network Virtualization using Generic Routing Encapsulation (NVGRE), or Geneve, instead of, or in addition to, VXLAN.
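For illustration, the following Python sketch shows the kind of encapsulation a VTEP performs for a VXLAN-backed overlay: it builds the 8-byte VXLAN header defined in RFC 7348 and prepends it to an inner layer-2 frame. This is a minimal sketch of the header format only; the outer UDP/IP transport between VTEPs (UDP destination port 4789) is omitted, and all values are assumed.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the 24-bit VNI field is valid


def vxlan_encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend the 8-byte VXLAN header (RFC 7348) to an inner L2 frame.

    The result is carried as the UDP payload of an outer packet addressed
    between the source and destination VTEPs.
    """
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Byte 0: flags, bytes 1-3: reserved, bytes 4-6: VNI, byte 7: reserved.
    header = struct.pack("!B3xI", VXLAN_FLAG_VNI_VALID, vni << 8)
    return header + inner_frame


def vxlan_decapsulate(payload: bytes) -> tuple[int, bytes]:
    """Strip the VXLAN header at the destination VTEP; return (VNI, frame)."""
    flags, vni_field = struct.unpack("!B3xI", payload[:8])
    if not flags & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag not set")
    return vni_field >> 8, payload[8:]
```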
The hypervisor 118 further includes a local scheduler 126 and a high availability (HA) agent 128. As described in more detail below, the local scheduler 126 operates as a part of a resource scheduling system that provides load balancing among enabled hosts 104 in the cluster 102. The HA agent 128 operates as a part of a high availability system that provides high availability of select VMs running on the hosts 104 by monitoring the hosts 104 in the cluster 102, and in the event of a host failure, the VMs on the failed host are restarted on alternate hosts in the cluster.
The computing environment 100 also includes a virtualization cluster manager (VCM) 130 that communicates with the hosts 104 via a management network 132. In an embodiment, the VCM 130 is a computer program that resides and executes in a computer system, such as one of the hosts 104, or in a virtual computing instance, such as one of the VMs 116 running on the hosts 104. One example of the VCM 130 is the VMware vCenter Server® product made available from VMware, Inc. In an embodiment, the VCM 130 is configured to carry out administrative tasks for the cluster 102 of hosts 104 that forms an SDDC, including managing the hosts in the cluster and managing the virtual machines running within each host in the cluster, as well as other tasks.
In the illustrated embodiment, the VCM 130 includes a cluster service 134, which contains a distributed resource scheduler (DRS) 136 and a high availability (HA) management module 138. One example of the cluster service 134 is the vpxd service found in the VMware vCenter Server® product made available from VMware, Inc. The DRS 136, which is the management component of the resource scheduling system, operates with the local schedulers 126 of the hosts 104 in the cluster 102 to provide resource scheduling and load balancing for the cluster 102. Thus, the DRS 136, with the help of the local schedulers 126, can provide host recommendations to place VMs for initial placement or load balancing. In addition, the DRS 136 can also enforce user-defined resource allocation policies. One example of the resource scheduling system is the VMware vSphere® Distributed Resource Scheduler™ system of the VMware vSphere® product made available from VMware, Inc.
The HA management module 138, which is the management component of the HA system, operates with the HA agents 128 of the hosts 104 in the cluster 102 to provide high availability for VMs running in the cluster 102. In the case of a failover, the HA management module 138, with the help of the HA agents 128, can restart the VMs of a failed host on other hosts in the cluster 102, e.g., on hosts recommended by the resource scheduling system, i.e., the DRS 136 and the local schedulers 126. In some embodiments, when an HA cluster of select hosts 104 is created, a single host is automatically elected as the primary host. The remaining hosts in the HA cluster are referred to herein as secondary hosts. Using its HA agent, the primary host communicates with the VCM 130 and monitors the state of all protected VMs and of the secondary hosts, i.e., the other hosts in the HA cluster. The HA agent of the primary host is referred to herein as the primary HA agent. One example of the HA system is the VMware vSphere® High Availability system of the VMware vSphere® product made available from VMware, Inc. In this example, the HA agents in the hosts are known as fault domain manager (FDM) agents.
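The election of the single primary host can be illustrated with a minimal sketch. The selection rule used below (prefer the host with access to the most datastores, breaking ties by host identifier) is an assumption chosen for illustration; the actual election rule of a given HA system may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    host_id: str
    datastores: set[str] = field(default_factory=set)


def elect_primary(candidates: list[Host]) -> Host:
    """Elect a single primary host for the HA cluster.

    Illustrative rule: prefer the host with access to the most datastores
    (best visibility into protected-VM state), breaking ties by the
    lexically greatest host identifier. The remaining candidates become
    secondary hosts monitored by the primary.
    """
    return max(candidates, key=lambda h: (len(h.datastores), h.host_id))
```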
The VCM 130 also includes a lifecycle management (LCM) service 140, which manages tasks related to installing software in the cluster 102, maintaining it through updates and upgrades, and decommissioning it. As an example, the LCM service 140 may install hypervisors and firmware on new hosts in the cluster, and update or upgrade them when required. In addition, as described in more detail below, the LCM service 140 may orchestrate the process of upgrading the VCM 130.
As noted above, the computing environment 100 also includes the logical network manager 124 (which may include a control plane cluster), which operates with the logical network agents 120 in the hosts 104 to manage and control logical overlay networks in the computing environment. Logical overlay networks comprise logical network devices and connections that are mapped to physical networking resources, e.g., switches and routers, in a manner analogous to the manner in which other physical resources, such as compute and storage, are virtualized. In an embodiment, the logical network manager 124 has access to information regarding physical components and logical overlay network components in the computing environment 100. With the physical and logical overlay network information, the logical network manager 124 is able to map logical network configurations to the physical network components that convey, route, and filter physical traffic in the computing environment 100. In one particular implementation, the logical network manager 124 is a VMware NSX® Manager™ product running on any computer, such as one of the hosts 104 or VMs 116 in the computing environment 100.
The computing environment 100 also includes an edge services gateway 142 to control network traffic into and out of the computing environment 100. One example of the edge services gateway 142 is the VMware NSX® Edge™ product made available from VMware, Inc.
In an embodiment, each of the VCM 130, the logical network manager 124 and the edge services gateway 142 may be implemented in a virtual computing instance, e.g., a VM, running in the computing environment 100. In some embodiments, there may be multiple instances of the logical network manager 124 and the edge services gateway 142 that are deployed in multiple VMs running in the computing environment 100.
The management components in the computing environment 100, such as the VCM 130 and the logical network manager 124, may need to be upgraded periodically. Upgrading these management components can introduce new features, fix bugs and errors, and improve the functionality of the computing environment 100. However, for a management component upgrade, any service interruption due to the upgrade process needs to be minimized. In addition, the upgrade process should not require additional compute resources beyond the resource capacity of the computing environment 100, which can add to the cost of operating the computing environment 100.
As described in more detail below, in the computing environment 100, a management component, e.g., the VCM 130, is upgraded using a new upgraded management component deployed in the computing environment without using any additional resources of the computing environment. In addition, the upgrade process uses an HA-related mechanism to ensure that, in case of a failure, the original management component or the new upgraded management component is failed over depending on the state of the upgrade process when the failure occurred. If the failure occurs before an HA switchover phase of the upgrade process, only the original management component is failed over or restarted on another host in the computing environment 100. However, if the failure occurs after the HA switchover phase of the upgrade process, only the new upgraded management component is failed over or restarted on another host in the computing environment 100. The HA switchover phase of the upgrade process is described in detail below. Although embodiments of the invention may be applied to any management component in a computing environment using any virtual computing instances, the upgrade process is described herein for a virtual cluster manager, such as the VCM 130, which is implemented in a VM. Thus, in this disclosure, the terms "VCM" and "VCM VM" may sometimes be used interchangeably.
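The failure-handling behavior described above reduces to a simple rule, sketched below in Python: which VCM VM is failed over depends only on whether the HA switchover phase has completed. The names are illustrative.

```python
from enum import Enum, auto


class UpgradePhase(Enum):
    BEFORE_HA_SWITCHOVER = auto()
    AFTER_HA_SWITCHOVER = auto()


def vm_to_fail_over(phase: UpgradePhase, source_vm: str, target_vm: str) -> str:
    """Return the one VCM VM the HA system restarts on a host failure.

    Before the switchover, only the source VCM VM is HA-protected;
    afterwards, only the target VCM VM is. The unprotected VM of the
    pair is preemptible and is never failed over.
    """
    if phase is UpgradePhase.BEFORE_HA_SWITCHOVER:
        return source_vm
    return target_vm
```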
In accordance with embodiments of the invention, in the computing environment 100, the VCM 130 is upgraded by first deploying an upgraded version of the VCM, which will take over the responsibilities of the original VCM, e.g., services provided by the original VCM. In this disclosure, the original VCM being upgraded will be referred to herein as the source VCM, while the new upgraded VCM will be referred to herein as the target VCM. The target VCM is deployed in a chunk or slot of resources, e.g., compute, memory and/or storage resources, that is provided by one of the hosts 104 and reserved for failover of management VMs, e.g., the VM with the VCM 130.
The resource chunk is part of failover capacity reserved for management VMs by the resource scheduling system for the HA system in the computing environment 100 through an HA admission control policy, which may be specific to the public cloud in which the computing environment 100 resides. Thus, the resource chunk is provided in the computing environment as part of the resources of the failover capacity. In an embodiment, the size of the resource chunk may be selected to equal the largest management VM (typically, the VCM VM, but in some cases, the logical network manager VM). These resource chunks ensure that, given a certain failure model, capacity is available to restart (or fail over) the management appliances or VMs that are required, by adding a healthy replacement host into the cluster 102. If this set of critical management appliances is available, then the reserved capacity can be used to restart other management VMs. When all management VMs are available, any remaining reserved capacity can be used for customer VMs. The addition of a replacement host or hosts ensures that, in the end, cluster capacity is restored and all VMs can be restarted. In the event of a failure, the HA system in the computing environment 100 fails over all the affected powered-on VMs by default unless it is configured to ignore any. The VMs which the HA system will attempt to fail over are called "protected VMs", which are recorded in a VM protection list on a shared datastore by the HA system. This shared datastore is accessible by all the hosts in the cluster 102 so that the HA system can fail over every protected VM to another host with sufficient capacity.
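A minimal sketch of the sizing rule mentioned above, assuming a simplified resource model in which each management VM is described by a CPU and memory reservation (the example figures are hypothetical):

```python
from dataclasses import dataclass


@dataclass
class ManagementVM:
    name: str
    cpu_mhz: int
    memory_mb: int


def failover_chunk_size(mgmt_vms: list[ManagementVM]) -> tuple[int, int]:
    """Size one reserved failover chunk to fit the largest management VM
    (typically the VCM VM, sometimes the logical network manager VM)."""
    cpu = max(vm.cpu_mhz for vm in mgmt_vms)
    mem = max(vm.memory_mb for vm in mgmt_vms)
    return cpu, mem


# Hypothetical example: the chunk must fit whichever appliance is largest
# in each resource dimension.
vms = [ManagementVM("vcm", 8000, 28672),
       ManagementVM("ln-manager", 6000, 24576)]
print(failover_chunk_size(vms))  # (8000, 28672)
```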
The VCM upgrade process in accordance with embodiments of the invention also includes steps to set the target VCM VM as an HA unprotected VM once the target VCM VM is deployed. However, the source VCM VM is left unchanged as an HA protected VM so that only the source VCM VM is failed over if there is a host failure. In an embodiment, the source and target VCM VMs are set as an initial primary-secondary management VM preemptive pair, where the source VCM VM is set as a primary protected VM and the target VCM VM is set as a secondary unprotected preemptible VM.
The VCM upgrade process in accordance with embodiments of the invention further includes steps to switch the target VCM VM to an HA protected VM and the source VCM VM to an HA unprotected VM, just before the target VCM takes over the services of the source VCM. Thus, after this point in the VCM upgrade process, only the target VCM VM is failed over if there is a host failure. In an embodiment, the source and target VCM VMs are set as a switched primary-secondary management VM preemptive pair, where the target VCM VM is set as the primary protected VM and the source VCM VM is set as the secondary unprotected preemptible VM.
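The two pair configurations described in the preceding paragraphs can be modeled with a small data structure and a single switch operation that inverts both roles, as in the following sketch. This is an illustrative model, not the actual representation used by the HA agents.

```python
from dataclasses import dataclass


@dataclass
class PreemptivePair:
    """Primary-secondary management VM preemptive pair for the HA system."""
    primary_protected: str       # failed over on host failure
    secondary_unprotected: str   # preemptible; never failed over


def initial_pair(source_vm: str, target_vm: str) -> PreemptivePair:
    # Before the switchover: only the source VCM VM is HA-protected.
    return PreemptivePair(primary_protected=source_vm,
                          secondary_unprotected=target_vm)


def switch(pair: PreemptivePair) -> PreemptivePair:
    # HA switchover: roles are inverted in a single step so that exactly
    # one VCM VM of the pair is protected at any point in the upgrade.
    return PreemptivePair(primary_protected=pair.secondary_unprotected,
                          secondary_unprotected=pair.primary_protected)
```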
The process of upgrading the VCM 130 in the computing environment 100 in accordance with an embodiment of the invention is now described.
Initially, the computing environment 100 has only the source VCM, which has services 1.0, including the LCM service 1.0, and state information 1.0. In this example, the cluster 102 includes hosts H1, H2 and H3, and a resource chunk RC2 in the host H3 is reserved for HA failover of management VMs.
Next, the deploy-the-target-VCM phase of the VCM upgrade process is executed. In this phase, the target VCM is deployed as a VM by the source VCM VM, with help from the resource scheduling system in the computing environment 100, such that the target VCM VM is placed in the resource chunk RC2 reserved for HA failover in the host H3, as indicated by the arrow 2.
Next, the expand phase of the VCM upgrade process is executed. In this phase, the state information 2.0 for the target VCM is added to the source VCM to prepare for the next phase.
Next, the replicate phase of the VCM upgrade process is executed. In this phase, data synchronization from the source VCM to the target VCM is triggered by the target VCM, as indicated by the arrow 4.
Next, the switchover phase of the VCM upgrade process is executed. In this phase, a shutdown of the source VCM is triggered or initiated by the target VCM, as indicated by the arrow 5.
Next, the contract phase of the VCM upgrade process is executed. In this phase, once the source VCM is determined to be down by the target VCM, steps are triggered or initiated by the target VCM to take over as the new VCM in charge of the cluster 102. As an example, these steps may include making final modifications to the database of the target VCM, taking over the internet protocol (IP) address of the source VCM, updating the logical network manager 124 and starting VCM services. In addition, the HA agents 128 in the hosts H1, H2 and H3 in the cluster 102 are upgraded by the target VCM, as indicated by the arrows 8.
The process of upgrading the VCM 130 in the computing environment 100 in accordance with an embodiment of the invention is further described below.
At block 502, the initialize phase is executed by the source LCM service in response to an initialize API call from the requesting entity, which can be a user using a user interface or a software process running in the computing environment or in another computing environment, such as a management computing environment of a cloud service provider. As an input to this API, an initialization specification is passed over to the source LCM service. The initialization specification contains various parameters for the VCM upgrade process, such as whether the source VCM is to be shut down after the upgrade and where the target VCM should be deployed. The following is an example of the initialization specification that may be used:
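The original example specification is not reproduced here. Purely for illustration, a hypothetical specification covering the two parameters mentioned above might look like the following Python mapping, in which every field name is an assumption:

```python
# Hypothetical initialization specification; all field names are assumptions.
init_spec = {
    "auto_shutdown_source_vcm": True,    # shut down the source VCM after upgrade
    "target_placement": {
        "type": "ha_failover_chunk",     # deploy target into a reserved HA slot
        "cluster": "cluster-102",
        "temporary_ip": "192.0.2.10",    # documentation-range example address
    },
}
```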
Blocks 504-514 are part of the stage phase of the VCM upgrade process, which includes an HA initialize process executed by the source LCM service. At block 504, an operation is executed by the source LCM service to deploy the target VCM VM with a preemptible configuration in response to a stage API call from the requesting entity. In an embodiment, the target VCM VM is deployed with a temporary IP address. Other parameters are set according to the initialization specification. However, at this point, the target VCM VM does not contain the data configured in the source VCM by the user, which is stored in the configuration files and the database of the source VCM. As a result, the target VCM VM with the target LCM service is deployed, at block 506. In an embodiment, the target VCM VM is deployed in a resource chunk of one of the hosts 104 in the cluster 102, which is part of the spare HA resource capacity reserved for failover of management VMs, such as the source VCM VM.
Next, at block 508, source-to-target information for the deployed target VCM VM is pushed to the primary HA agent from the source LCM service via the cluster service 134 of the source VCM. This source-to-target information is needed by the HA system to map the target VCM to the source VCM so that the HA system can power off the target VCM when the source VCM is shut down. Designating a VM as a target also ensures that the VM is preemptible and is not protected by the HA system. As a result, the target VCM VM is set as an HA unprotected VM by the primary HA agent, at block 510.
Next, at block 512, an operation to turn on the target VCM VM is executed by the source VCM. As a result, the target VCM VM is powered on, at block 514. After the target VCM VM has been powered on, the VCM upgrade process proceeds when the primary HA agent reports success to the source LCM service with respect to setting the target VCM VM as an HA unprotected VM. In an embodiment, an inquiry may be made to the primary HA agent by the source LCM service to check whether the target VCM VM has been successfully set as an HA unprotected VM. The HA initialize process is completed when the target VCM VM is reported as having been successfully set as an HA unprotected VM. The HA initialize process is described in more detail below.
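The stage-phase sequence of blocks 504-514 can be summarized as a linear orchestration, sketched below. The service and agent objects stand in for the source LCM service, the cluster service 134 and the primary HA agent; their method names are illustrative assumptions, not a real API.

```python
def stage_phase(source_lcm, cluster_service, primary_ha_agent,
                init_spec: dict, source_vm: str) -> str:
    """Illustrative stage phase (blocks 504-514); method names are assumed."""
    # Blocks 504/506: deploy the target VCM VM with a preemptible
    # configuration and a temporary IP address; user-configured data from
    # the source VCM is not yet copied.
    target_vm = source_lcm.deploy_target_vcm(init_spec, preemptible=True)

    # Blocks 508/510: push the source-to-target mapping so the HA system
    # can map (and, if needed, power off) the target when the source is
    # shut down; the target is left out of the VM protection list.
    cluster_service.push_source_to_target_mapping(source_vm, target_vm)
    primary_ha_agent.set_unprotected(target_vm)

    # Blocks 512/514: power the target on, then confirm the unprotected
    # setting took effect before letting the upgrade proceed.
    source_lcm.power_on(target_vm)
    if not primary_ha_agent.is_unprotected(target_vm):
        raise RuntimeError("HA initialize did not complete")
    return target_vm
```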
Next, at block 516, the prepare phase of the VCM upgrade process is executed by the source LCM service in response to a prepare API call from the requesting entity. Execution of the prepare phase expands the data configured in the source VCM to include data needed for the target VCM and starts to synchronize this information between the source VCM and the target VCM. Thus, the prepare phase corresponds to the expand and replicate phases described above.
Next, at block 520, the IP addresses of the hosts 104 in the cluster 102 are fetched from the cluster service 134 of the source VCM by the source LCM service in response to a switchover API call from the requesting entity. These IP addresses are needed by the source LCM service to communicate with the host having the primary HA agent in the cluster 102, since the cluster service, which is a non-lifecycle service, will be shut down soon. Then, at block 522, non-lifecycle services at the source VCM, such as the cluster service (e.g., the vpxd service), appliance management service, authentication service, certificate service, lookup service, security token service, etc., are shut down by the source LCM service.
Next, at block 524, the configuration setting for the HA preemptible option is removed from the database of the target VCM VM by the target LCM service in response to instructions from the source LCM service. In an embodiment, the configuration setting for the HA preemptible option is a flag in the database of the target VCM VM.
Next, at block 526, an atomic HA switchover operation is executed by the source LCM service. The atomic HA switchover operation includes resetting the target VCM VM from the HA unprotected VM status to the HA protected VM status and resetting the source VCM VM from the HA protected VM status to the HA unprotected VM status. As part of this operation, a request to run the HA switchover is transmitted to the primary HA agent from the source LCM service. In response to the request, the source VCM VM is switched to the HA unprotected VM status and the target VCM VM is switched to the HA protected VM status by the primary HA agent, at block 528. In an embodiment, this operation performed by the primary HA agent may involve removing the source VCM VM from the VM protection list and adding the target VCM VM to the VM protection list. The VCM upgrade process proceeds when the primary HA agent reports success to the source LCM service with respect to switching the target VCM VM to the HA protected VM status. In an embodiment, an inquiry may be made to the primary HA agent by the source LCM service to check whether the target VCM VM has been successfully set as an HA protected VM. The HA switchover process is completed when the target VCM VM is reported as having been successfully set as an HA protected VM. The HA switchover process is described in more detail below.
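The atomic HA switchover of blocks 526 and 528 amounts to swapping the two VMs' entries in the VM protection list in one step. A minimal sketch, assuming the list is modeled as a simple set of VM names:

```python
def ha_switchover(protected_vms: set[str], source_vm: str, target_vm: str) -> None:
    """Atomically switch protection: source becomes unprotected, target protected.

    Both updates are applied together so that exactly one of the pair is
    in the VM protection list at every observable point.
    """
    if source_vm not in protected_vms:
        raise RuntimeError("source VCM VM was not protected before switchover")
    protected_vms.discard(source_vm)   # block 528: remove source from the list
    protected_vms.add(target_vm)       # block 528: add target to the list


# Usage example with hypothetical VM names.
protected = {"vcm-source", "ln-manager"}
ha_switchover(protected, "vcm-source", "vcm-target")
assert protected == {"vcm-target", "ln-manager"}
```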
Next, at block 530, the source VCM VM is shut down by the target VCM. In some embodiments, the source VCM VM may be deleted after it has been shut down. At block 532, the non-lifecycle services of the target VCM are started by the target LCM service to take over the services that were previously provided by the source VCM, which means that the VCM of the computing environment has been successfully upgraded. The target LCM service may notify the requesting entity of the successful upgrade. The upgrade process then comes to an end.
The HA initialize workflow of the VCM upgrade process, which involves the source LCM service, the source cluster service, the target LCM service and the primary HA agent, is now described in more detail.
At the start of the HA initialize workflow, an instruction to deploy the target VCM VM with the HA preemptible configuration, which sets the HA preemptive VM flag to true for the target VCM VM, is sent to the source cluster service by the source LCM service, as indicated by the arrow 602. In an embodiment, the HA preemptive VM flag for the target VCM VM may be stored in a shared datastore in the cluster 102 by the HA system. This configuration of the target VCM VM is passed to the DRS 136 in the source cluster service so that the resource scheduling system will treat the target VCM VM as a preemptible VM in case of host failures.
Next, an instruction is sent to the primary HA agent in the cluster 102 from the source LCM service to set the target VCM VM as an HA unprotected VM, as indicated by the arrow 604. In an embodiment, one or more API calls are made from the source LCM service to the primary HA agent to set the source VCM VM as the primary protected VM and the target VCM VM as the secondary unprotected preemptible VM, where the source and target VCM VMs are defined as a primary-secondary management VM preemptive pair. Upon receiving the instruction, a rule is added by the primary HA agent for the HA system to not protect the target VCM VM. In some implementations, the VM protection list maintained by the HA system is updated by the primary HA agent to record the target VCM VM as an unprotected VM.
Next, an instruction is sent from the source LCM service to the target LCM service via the source cluster service to power on the target VCM VM, as indicated by the arrows 606 and 608. The target VCM VM is now ready to take over as the VCM for the cluster 102.
Next, as indicated by the arrow 610, an inquiry to the primary HA agent is made by the source LCM service to check whether the target VCM VM has been properly set as an HA unprotected VM. In an embodiment, an API call to the primary HA agent from the source LCM service is used to check if the primary-secondary management VM preemptive pair for the source VCM VM and the target VCM VM has been set properly. If the response from the primary HA agent states or indicates that the target VCM VM has not been set as an HA unprotected VM, i.e., the setting of the primary-secondary management VM preemptive pair for the source and target VCM VMs has failed, as indicated by the arrow 612, another attempt is made to set the target VCM VM as an HA unprotected VM. After three (3) failure responses, the VCM upgrade process is declared to have failed, and an upgrade failure notification is transmitted to the requesting entity from the source LCM service, as indicated by the arrow 614. These retries provide a fail-safe against a change of the primary HA agent while the source LCM service is communicating with the previous primary HA agent. However, before three (3) failures, if the response from the primary HA agent states that the target VCM VM has been set as an HA unprotected VM, the HA initialize workflow is determined by the source LCM service to have been successfully completed, as indicated by the arrow 616. The VCM upgrade process is then allowed to proceed.
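The verify-and-retry behavior of arrows 610-616, with its limit of three failure responses, follows a common pattern that can be sketched generically; the attempt and check callables stand in for the set-pair request and the status inquiry to whichever HA agent is currently primary.

```python
from typing import Callable

MAX_FAILURES = 3


def verify_with_retries(attempt: Callable[[], None],
                        check: Callable[[], bool]) -> bool:
    """Retry an HA pair operation until verified, tolerating up to three
    failed verifications (e.g., caused by a change of the primary HA agent
    mid-operation). Returns True on success, False if the upgrade fails."""
    failures = 0
    while failures < MAX_FAILURES:
        attempt()            # (re)issue the set-pair request (arrow 604/704)
        if check():          # ask the current primary HA agent for status
            return True      # arrow 616: allow the upgrade to proceed
        failures += 1        # arrow 612: a failure response; retry
    return False             # arrow 614: notify the requesting entity
```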
The HA switchover workflow of the VCM upgrade process is now described in more detail.
At the start of the HA switchover workflow, the target LCM service is called by the source LCM service to trigger the removal of the HA preemptive VM flag from the database of the target VCM, as indicated by the arrow 702. In response, the database of the target VCM is started by the target LCM service and the entry of the HA preemptive VM flag is removed from the database.
Next, the primary HA agent is called by the source LCM service to switch the HA protected statuses of the source and target VCM VMs, as indicated by the arrow 704. Specifically, this step involves an operation for the primary HA agent to update the source VCM VM as the secondary unprotected preemptible VM and to update the target VCM VM as the primary protected VM. In a particular implementation, the operation includes removing the source VCM VM from the protected VM list and adding the target VCM VM to the protected VM list.
Next, the primary HA agent is called by the source LCM service to get the status of the primary-secondary management VM preemptive pair of the source and target VCM VMs and to check whether the pair has been modified properly, as indicated by the arrow 706. If the response from the primary HA agent shows or indicates that the primary-secondary management VM preemptive pair for the source and target VCM VMs has not been modified properly, as indicated by the arrow 708, another attempt is made to modify the primary-secondary management VM preemptive pair for the source and target VCM VMs. After three (3) failure responses, the VCM upgrade process is declared to have failed, and an upgrade failure notification is transmitted to the requesting entity from the source LCM service, as indicated by the arrow 710. However, before three (3) failures, if the response from the primary HA agent shows that the primary-secondary management VM preemptive pair for the source and target VCM VMs has been modified properly, as indicated by the arrow 712, the HA switchover workflow is determined by the source LCM service to have been successfully completed.
A computer-implemented method for upgrading a source management component of a computing environment in accordance with an embodiment of the invention is described with reference to a flow diagram.
The embodiments of the invention described herein are also applicable to hybrid clouds and multi-cloud environments. The optimized deployment, which uses HA slots for the target deployment, can be enabled for the upgrade via a capability API. Once the upgrade is enabled, the source VCM can determine where to deploy the target VCM, based on user parameters passed to the source VCM. In a hybrid cloud, the source VCM can be located on-premises and the target VCM can be deployed in the public cloud of the hybrid cloud, and vice versa, as long as both the on-premises cluster and the public cloud cluster are in the same domain and are managed by the same VCM. In a multi-cloud environment, the source VCM can be located in one cloud and the target VCM can be deployed in another cloud as long as the clusters in both clouds are managed by a single VCM.
Although the operations of the method(s) herein are shown and described in a particular order, the order of the operations of each method may be altered so that certain operations may be performed in an inverse order or so that certain operations may be performed, at least in part, concurrently with other operations. In another embodiment, instructions or sub-operations of distinct operations may be implemented in an intermittent and/or alternating manner.
It should also be noted that at least some of the operations for the methods may be implemented using software instructions stored on a computer useable storage medium for execution by a computer. As an example, an embodiment of a computer program product includes a computer useable storage medium to store a computer readable program that, when executed on a computer, causes the computer to perform operations, as described herein.
Furthermore, embodiments of at least portions of the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The computer-useable or computer-readable medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device), or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid-state memory, magnetic tape, a removable computer diskette, a random-access memory (RAM), a read-only memory (ROM), a rigid magnetic disc, and an optical disc. Current examples of optical discs include a compact disc with read only memory (CD-ROM), a compact disc with read/write (CD-R/W), a digital video disc (DVD), and a Blu-ray disc.
In the above description, specific details of various embodiments are provided. However, some embodiments may be practiced with less than all of these specific details. In other instances, certain methods, procedures, components, structures, and/or functions are described in no more detail than to enable the various embodiments of the invention, for the sake of brevity and clarity.
Although specific embodiments of the invention have been described and illustrated, the invention is not to be limited to the specific forms or arrangements of parts so described and illustrated. The scope of the invention is to be defined by the claims appended hereto and their equivalents.