In a software-defined data center (SDDC), virtual infrastructure, which includes virtual compute, storage, and networking resources, is provisioned from hardware infrastructure that includes a plurality of host computers, storage devices, and networking devices. The provisioning of the virtual infrastructure is carried out by management software that communicates with virtualization software (e.g., a hypervisor) installed in the host computers.
As described in U.S. patent application Ser. No. 17/464,733, filed on Sep. 2, 2021, the entire contents of which are incorporated by reference herein, the desired state of the SDDC, which specifies the configuration of the SDDC (e.g., number of clusters, hosts that each cluster would manage, and whether or not certain features, such as distributed resource scheduling, high availability, and workload control plane, are enabled), may be defined in a declarative document, and the SDDC is deployed or upgraded according to the desired state defined in the declarative document.
The declarative approach has simplified the deployment and upgrading of the SDDC configuration, but may still be insufficient by itself to meet the needs of customers who have multiple SDDCs deployed across different geographical regions, and deployed in a hybrid manner, e.g., on-premise, in a public cloud, or as a service. These customers want to ensure that all of their SDDCs are compliant with company policies, and are looking for an easier way to monitor their SDDCs for compliance with the company policies and manage the upgrade and remediation of such SDDCs.
One or more embodiments provide cloud services for centrally managing the SDDCs. These cloud services rely on agents running in a cloud gateway appliance to deliver the cloud services to customer environments in which their SDDCs are deployed. New cloud services are delivered by installing new agents and existing cloud services are updated by upgrading the agents already installed.
One or more embodiments also provide a method of managing the lifecycle of agents of cloud services that are running in customer environments according to a desired state of the agents. The method includes the steps of: comparing a running state of the agents against the desired state; upon determining that the running state includes a first agent that is not present in the desired state, removing the first agent; and upon determining that the desired state includes a second agent that is not present in the running state, deploying the second agent.
Further embodiments include a non-transitory computer-readable storage medium comprising instructions that cause a computer system to carry out the above method, as well as a computer system configured to carry out the above method.
One or more embodiments provide a method of managing the lifecycle of agents of cloud services running in cloud gateway appliances according to a desired state. The agents work with their associated cloud services to expose the service functionality to virtual infrastructure management servers that manage SDDCs. The lifecycle of these agents is tied to the lifecycle of the cloud services, not the customer's SDDC lifecycle. As a result, these agents isolate the SDDCs from the velocity of the cloud service changes. Managing the lifecycle of these agents according to the desired state is desirable because: (1) it requires no human intervention in the customer environments to keep the agents up to date; (2) the agent update cycle is decoupled from upgrades of the cloud gateway appliances; (3) it requires only one configuration to start new agents, to upgrade, deprecate, or remove an existing agent, and to apply any configuration updates to the agents; and (4) it results in zero drift from the desired state, so that all the latest available cloud services as well as any updates can be delivered to the customer seamlessly.
A plurality of SDDCs is depicted in
The VIM servers in each customer environment communicate with a gateway (GW) appliance, which hosts agents that communicate with cloud control plane 12 to deliver cloud services to the corresponding customer environment. For example, the VIM servers for managing the SDDCs in customer environment 21 communicate with GW appliance 31. Similarly, the VIM servers for managing the SDDCs in customer environment 22 communicate with GW appliance 32, and the VIM servers for managing the SDDCs in customer environment 23 communicate with GW appliance 33. Examples of cloud services that are delivered to the respective customer environments through the agents include SDDC inventory management, SDDC configuration management, and upgrading of the VIM servers with reduced downtime.
As used herein, a “customer environment” means one or more private data centers managed by the customer (commonly referred to as “on-prem”), a private cloud managed by the customer, a public cloud managed for the customer by another organization, or any combination of these. In addition, the SDDCs of any one customer may be deployed in a hybrid manner, e.g., on-premise, in a public cloud, or as a service, and across different geographical regions.
In the embodiments, the lifecycle of agents is managed according to a desired state of the agents. The desired state of the agents may be specified through UI/API 11 and expressed in a desired state specification.
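By way of illustration only, a desired state specification of the agents might be represented as sketched below. The description above does not define a schema, so the field names (`name`, `image`, `config`), the agent names, and the registry address are all assumptions made for this example; the only details taken from the description are that the specification identifies each agent, a location of its container image, and its running configuration.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSpec:
    """Desired state of one agent. All field names are hypothetical."""
    name: str
    image: str                                   # location of the container image
    config: dict = field(default_factory=dict)   # running configuration


def parse_desired_state(document: dict) -> dict:
    """Index the agents of a desired state document by agent name."""
    agents = {}
    for entry in document["agents"]:
        spec = AgentSpec(
            name=entry["name"],
            image=entry["image"],
            config=dict(entry.get("config", {})),
        )
        if spec.name in agents:
            raise ValueError(f"agent {spec.name!r} is listed more than once")
        agents[spec.name] = spec
    return agents


# Hypothetical document, e.g., as it might be submitted through a UI or an API.
desired_document = {
    "agents": [
        {"name": "inventory-agent",
         "image": "registry.example.com/inventory-agent:1.2",
         "config": {"poll_interval_s": 60}},
        {"name": "config-agent",
         "image": "registry.example.com/config-agent:2.0"},
    ]
}

desired_state = parse_desired_state(desired_document)
```

Indexing the agents by name keeps the later comparison against the running state a simple per-key lookup.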
Two cloud services are depicted in
Components of GW appliance 31 depicted in
At step 416, coordinator agent 203 selects one agent to determine if the running state needs to be remediated to match the desired state. If the selected agent is in the desired state but not in the running state (step 418, Yes), coordinator agent 203 determines this agent to be a new agent and at step 419 pulls a container image of the new agent from a location of the container image specified in the desired state specification, and invokes an API of scheduler agent 201 to deploy the new agent with the running configuration defined in the desired state specification. If the selected agent is in the running state but not in the desired state (step 420, Yes), coordinator agent 203 determines this agent needs to be removed and at step 422 invokes an API of scheduler agent 201 to remove this agent. If the configuration of the selected agent defined in the desired state is different from the configuration of the selected agent defined in the running state, coordinator agent 203 determines the configuration of the selected agent to be in drift (step 424, Yes), and at step 426 carries out the update agent process illustrated in
At step 428, coordinator agent 203 determines if there is another agent to select for remediation. If there is (step 428, Yes), the method returns to step 416, where another agent is selected. If there are no more agents to select (step 428, No), coordinator agent 203 at step 430 replaces the running state that is stored in memory with the desired state so that the desired state now becomes the running state, and the method ends.
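The remediation loop of steps 416 to 430 may be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of coordinator agent 203: the states are modeled as dictionaries keyed by agent name, `RecordingScheduler` is a hypothetical stand-in for the API of scheduler agent 201, and the `update_agent` callback stands in for the update agent process. Pulling the container image is folded into the `deploy` call of the stub.

```python
import copy


class RecordingScheduler:
    """Hypothetical stand-in for the scheduler agent API; it only records calls."""

    def __init__(self):
        self.calls = []

    def deploy(self, name, image, config):
        # Assumption: image pull and deployment are combined in one call.
        self.calls.append(("deploy", name))

    def remove(self, name):
        self.calls.append(("remove", name))


def reconcile(desired, running, scheduler, update_agent):
    """Remediate the running state toward the desired state.

    Both states map an agent name to {"image": ..., "config": ...}.
    Returns the new running state, which is a copy of the desired state.
    """
    for name in sorted(set(desired) | set(running)):        # step 416
        if name not in running:                             # step 418: new agent
            spec = desired[name]
            scheduler.deploy(name, spec["image"], spec["config"])   # step 419
        elif name not in desired:                           # step 420: stale agent
            scheduler.remove(name)                          # step 422
        elif desired[name] != running[name]:                # step 424: drift
            update_agent(name, running[name], desired[name])        # step 426
    return copy.deepcopy(desired)                           # step 430


running = {
    "a": {"image": "img-a:1", "config": {}},
    "b": {"image": "img-b:1", "config": {"level": "info"}},
}
desired = {
    "b": {"image": "img-b:1", "config": {"level": "debug"}},
    "c": {"image": "img-c:1", "config": {}},
}

scheduler = RecordingScheduler()
updated = []
new_running = reconcile(desired, running, scheduler,
                        lambda name, old, new: updated.append(name))
```

In this example, agent `a` is removed, agent `b` is handed to the update callback because its configuration has drifted, and agent `c` is deployed.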
The method of
If the ready response is returned by scheduler agent 201 within the timeout period (step 514, Yes), coordinator agent 203 sets a timer at step 518. The time value set in the timer represents a time period for testing the operational health of the updated agent. At step 520, coordinator agent 203 issues APIs (e.g., to health monitoring agents running in the GW appliance) to begin monitoring the operational health of the updated agent. Then, at step 522, coordinator agent 203 instructs proxy server 205 to redirect traffic destined for the old agent to the updated agent.
If the agent update is cancelled or errors are detected in the operational health of the updated agent (step 524, Yes), coordinator agent 203 at step 526 instructs proxy server 205 to route traffic destined for the old agent back to the old agent, and invokes an API of scheduler agent 201 to remove the updated agent. The method ends thereafter.
On the other hand, if the agent update is not cancelled and no errors are detected in the operational health of the updated agent (step 524, No) during the entire time period for testing the operational health of the updated agent (step 530, Yes), coordinator agent 203 at step 532 invokes an API of scheduler agent 201 to remove the old agent. The method ends thereafter.
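The health-test and rollback portion of the update process (steps 518 to 532) may be sketched as follows, again as an illustration under stated assumptions. `FakeProxy` and `FakeScheduler` are hypothetical stand-ins for proxy server 205 and scheduler agent 201, the `is_healthy` and `is_cancelled` callbacks stand in for the health monitoring and cancellation signals, and polling at a fixed interval is a choice made for this example that the description does not prescribe.

```python
import time


class FakeProxy:
    """Hypothetical stand-in for the proxy server: maps a destination to a target."""

    def __init__(self):
        self.routes = {}

    def route(self, destination, target):
        self.routes[destination] = target


class FakeScheduler:
    """Hypothetical stand-in for the scheduler agent API."""

    def __init__(self):
        self.removed = []

    def remove(self, name):
        self.removed.append(name)


def finish_update(old, new, proxy, scheduler, is_healthy, is_cancelled,
                  test_period_s, poll_s=0.01):
    """Shift traffic to the updated agent, then keep it or roll it back.

    Returns True if the updated agent was kept, False if it was rolled back.
    """
    deadline = time.monotonic() + test_period_s       # step 518: set the timer
    proxy.route(old, new)                             # step 522: redirect traffic
    while time.monotonic() < deadline:                # step 530: period elapsed?
        if is_cancelled() or not is_healthy(new):     # step 524
            proxy.route(old, old)                     # step 526: route traffic back
            scheduler.remove(new)                     #           and drop the update
            return False
        time.sleep(poll_s)
    scheduler.remove(old)                             # step 532: retire the old agent
    return True


# Healthy update: the old agent is removed after the test period.
proxy_ok, sched_ok = FakeProxy(), FakeScheduler()
kept = finish_update("agent-v1", "agent-v2", proxy_ok, sched_ok,
                     is_healthy=lambda name: True,
                     is_cancelled=lambda: False,
                     test_period_s=0.03)

# Unhealthy update: traffic is routed back and the updated agent is removed.
proxy_bad, sched_bad = FakeProxy(), FakeScheduler()
rolled_back = not finish_update("agent-v1", "agent-v2", proxy_bad, sched_bad,
                                is_healthy=lambda name: False,
                                is_cancelled=lambda: False,
                                test_period_s=0.03)
```

Keeping the old agent running until the test period ends is what makes the rollback a routing change plus one removal, rather than a redeployment.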
The embodiments described herein may employ various computer-implemented operations involving data stored in computer systems. For example, these operations may require physical manipulation of physical quantities. Usually, though not necessarily, these quantities may take the form of electrical or magnetic signals, where the quantities or representations of the quantities can be stored, transferred, combined, compared, or otherwise manipulated. Such manipulations are often referred to in terms such as producing, identifying, determining, or comparing. Any operations described herein that form part of one or more embodiments may be useful machine operations.
One or more embodiments of the invention also relate to a device or an apparatus for performing these operations. The apparatus may be specially constructed for required purposes, or the apparatus may be a general-purpose computer selectively activated or configured by a computer program stored in the computer. Various general-purpose machines may be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations.
The embodiments described herein may be practiced with other computer system configurations including hand-held devices, microprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, etc.
One or more embodiments of the present invention may be implemented as one or more computer programs or as one or more computer program modules embodied in computer readable media. The term computer readable medium refers to any data storage device that can store data which can thereafter be input to a computer system. Computer readable media may be based on any existing or subsequently developed technology that embodies computer programs in a manner that enables a computer to read the programs. Examples of computer readable media are hard drives, NAS systems, read-only memory (ROM), RAM, compact disks (CDs), digital versatile disks (DVDs), magnetic tapes, and other optical and non-optical data storage devices. A computer readable medium can also be distributed over a network-coupled computer system so that the computer readable code is stored and executed in a distributed fashion.
Although one or more embodiments of the present invention have been described in some detail for clarity of understanding, certain changes may be made within the scope of the claims. Accordingly, the described embodiments are to be considered as illustrative and not restrictive, and the scope of the claims is not to be limited to details given herein but may be modified within the scope and equivalents of the claims. In the claims, elements and/or steps do not imply any particular order of operation unless explicitly stated in the claims.
Virtualization systems in accordance with the various embodiments may be implemented as hosted embodiments, non-hosted embodiments, or as embodiments that blur distinctions between the two. Furthermore, various virtualization operations may be wholly or partially implemented in hardware. For example, a hardware implementation may employ a look-up table for modification of storage access requests to secure non-disk data.
Many variations, additions, and improvements are possible, regardless of the degree of virtualization. The virtualization software can therefore include components of a host, console, or guest OS that perform virtualization functions.
Plural instances may be provided for components, operations, or structures described herein as a single instance. Boundaries between components, operations, and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the invention. In general, structures and functionalities presented as separate components in exemplary configurations may be implemented as a combined structure or component. Similarly, structures and functionalities presented as a single component may be implemented as separate components. These and other variations, additions, and improvements may fall within the scope of the appended claims.