An information technology (IT) infrastructure of an enterprise (e.g., a company, an educational organization, a government agency, etc.) can include a relatively large arrangement of electronic devices, software components, and database components. Often, changes are made to components in the infrastructure, which can be complex to manage.
Some embodiments are described with respect to the following figures:
Changes to an information technology (IT) infrastructure, particularly a relatively large IT infrastructure, can be complex to manage. An IT infrastructure includes hardware components (e.g., computers, storage servers, communications devices, and so forth), software components (e.g., applications, operating systems, drivers, and so forth), database components (e.g., relational database management systems, unstructured database systems, and so forth), and/or other components. In some examples, an IT infrastructure may even include virtualized systems, which include virtual machines. A physical machine can be partitioned into multiple virtual machines, and each virtual machine can appear to be an actual physical machine to a user. More generally, an “IT infrastructure” or “infrastructure” refers to an arrangement of components, such as those noted above.
Often, IT administrators of an enterprise are tasked with implementing changes to an IT infrastructure. Due to the complexity of the IT infrastructure, a manual change process can be time-consuming and can result in errors. Moreover, an IT infrastructure may include automated tools that can request or implement changes, which can lead to increased numbers of changes requested or made in the IT infrastructure. Automated tools are usually unaware of the impact of their changes on various aspects of an enterprise, and in fact, automated tools may even bypass or violate policies of the enterprise.
In accordance with some implementations, policy-based change process management mechanisms or techniques are provided to allow for (largely) automated management of change processes in an IT infrastructure. In some implementations, a workflow engine is provided to implement a change process, where the workflow engine can be associated with other modules for managing the change process. A change process results from a requested change to a part of an infrastructure. In some examples, change processes can be performed in conformance with ITIL (Information Technology Infrastructure Library) guidelines or other types of guidelines. ITIL provides best practices for IT operations.
The change process (104) includes determining (at 106), based on accessing at least one policy, whether or not transitions among the multiple phases are allowed. The determining of whether transitions among the multiple phases are allowed includes invoking a policy rule engine to apply the at least one policy for each transition between successive ones of the phases.
The change process (104) further includes invoking (at 108) exception handling by the policy rule engine in response to determining that violation of the at least one policy would result from a particular one of the transitions. In some implementations, if there are multiple violations of respective policies, then exception handling (108) can be invoked for each of the policy violations.
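The transition check and per-violation exception handling described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the phase names, the `Policy` class, and the `handle_exception` callback are all hypothetical, and policies are modeled as plain Python callables.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical phase names; the description only requires an initial phase,
# optional intermediate phases, and a change closure phase.
PHASES = ["initial", "assessment", "implementation", "closure"]


@dataclass
class Policy:
    name: str
    # Returns True when the transition src -> dst is allowed for this change.
    rule: Callable[[Dict, str, str], bool]


def check_transition(change: Dict, src: str, dst: str,
                     policies: List[Policy]) -> List[str]:
    """Return the names of all policies the transition would violate."""
    return [p.name for p in policies if not p.rule(change, src, dst)]


def run_change_process(change: Dict, policies: List[Policy],
                       handle_exception: Callable[..., bool]) -> str:
    """Step through the phases and return the last phase reached."""
    current = PHASES[0]
    for nxt in PHASES[1:]:
        violations = check_transition(change, current, nxt, policies)
        # Exception handling is invoked once for each violated policy.
        outcomes = [handle_exception(change, v, current, nxt)
                    for v in violations]
        if not all(outcomes):
            return current  # the transition is refused; stay in this phase
        current = nxt
    return current
```

With a single assumed policy that forbids entering the implementation phase during a change freeze, a change that violates the policy stops at the assessment phase unless the exception handler clears the violation.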
Generally, the workflow engine 206 is responsible for managing and executing the change process in response to a change request. The workflow engine steps through the various phases of the change process, starting from an initial phase, through any intermediate phases, and finally to a change closure phase. The workflow engine 206 ensures that each change process executes as a single transaction: either every action or transition of the change process occurs, or none occurs. When the workflow engine 206 starts a change process in response to a change request, an instance 226 of the change process is created uniquely for this change request. The instance 226 of the change process is stored in persistent storage media (228) so that the change process instance can persist even after system shutdown or reset. Upon system reset, the persistent change process instance 226 can continue from the last phase.
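One way to make a change process instance survive a shutdown or reset is to write its current phase to a file after every transition. The sketch below assumes a JSON file per change request and hypothetical phase names; it covers only persistence and resumption, not the all-or-nothing transactional behavior.

```python
import json
import os
import tempfile

# Hypothetical phase names, used for illustration only.
PHASES = ["initial", "assessment", "implementation", "closure"]


def save_instance(path, instance):
    """Write the instance to a temporary file, then rename it into place,
    so that a crash during the write never leaves a partial file at `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(instance, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def load_or_create(path, change_id):
    """Resume a stored instance if one exists, else start at the first phase."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"change_id": change_id, "phase": PHASES[0]}


def advance(path, instance):
    """Move the instance to the next phase and persist the new state."""
    idx = PHASES.index(instance["phase"])
    if idx + 1 < len(PHASES):
        instance["phase"] = PHASES[idx + 1]
        save_instance(path, instance)
    return instance
```

After a restart, calling `load_or_create` with the same path returns the instance at the last persisted phase, so processing can continue from there.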
As depicted in
The policy-based rule engine 210 is able to access policies stored in a policy database 212. A policy is generally a guideline to the change process for indicating terms and conditions for transitioning the change process between successive phases. The policy has an association condition for determining whether or not to apply the policy for a given change process (or change processes). The policy can also identify a policy owner that is to be notified in case a requested change violates the policy. A policy owner can be a human or an automated tool, such as a management application. The policy can also be associated with information to indicate to which of the phases of a change process the policy is to be applied. Such information can be expressed as a type of the policy, where the type would provide the indication of which change process phase(s) the policy is to be applied to. Alternatively, other information associated with a policy can provide the indication of which phase(s) of the change process the policy is to be applied to.
The policy can also be associated with further information that indicates actions to take with the requested change in case of violation of the policy.
Rules of the policy can be represented in expression language that provides a true or false result for a requested transition between phases of a change process. The rules can have various conditions based on change attributes or analysis relating to the impact and risk of a particular change process.
If the policy-based rule engine 210 determines that no violation of a policy would occur for a current transition between phases of the change process, then the policy-based rule engine implements the satisfied action 220, which is an action performed in response to a determination that the transition between the particular pair of successive phases of the change process is allowed. The satisfied action 220 can include an indication provided back to the workflow engine 206 (in result 209) that the transition between the particular phases of the change process is allowed. Additionally, it may be possible for the policy-based rule engine 210 to modify the change request as part of the exception handling 214 or the satisfied action 220. The updated change request can be provided to the change request queue 202 for further processing by the workflow engine 206.
If the policy-based rule engine 210 determines that violation of a policy would occur for a current transition between phases of the change process, then exception handling 214 is performed. Exception handling can involve invoking a policy exception engine 216, which determines how to handle the violation of the policy. The exception handling depends on the current phase of the change process, the type of policy breached, and the configuration of the policy. The policy exception engine 216 checks to ensure that all exception terms are satisfied before allowing the change process to move to the next phase. Exception terms can include, for example, notification of a policy owner, approval of the violation by at least one stakeholder, or some other term.
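The elements of a policy described above (association condition, policy owner, applicable phases, a true/false rule, and an action on violation) can be gathered into one record. The sketch below is an assumption-laden illustration: the field names, the `evaluate` helper, and the example values are hypothetical, and Python callables stand in for the expression language.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class Policy:
    name: str
    owner: str                          # notified if the policy is violated
    applies_to: Callable[[Dict], bool]  # association condition
    phases: FrozenSet[str]              # phases whose entry the policy guards
    rule: Callable[[Dict], bool]        # True means the transition is allowed
    on_violation: str = "notify_owner"  # action to take on violation


def evaluate(policy: Policy, change: Dict, next_phase: str) -> str:
    """Return 'not_applicable', 'satisfied', or 'violated'."""
    if next_phase not in policy.phases or not policy.applies_to(change):
        return "not_applicable"
    return "satisfied" if policy.rule(change) else "violated"


# Assumed example: production changes need a rollback plan before implementation.
rollback_policy = Policy(
    name="rollback_plan_required",
    owner="ops-manager",
    applies_to=lambda c: c.get("environment") == "production",
    phases=frozenset({"implementation"}),
    rule=lambda c: bool(c.get("rollback_plan")),
)
```

Keeping the association condition separate from the rule lets the engine skip policies that do not concern a given change before evaluating any rule.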
If approval of a violation is sought prior to allowing the change process to proceed to the next phase, the policy exception engine 216 can invoke an approval engine 218, as part of the exception handling 214. The approval engine 218 can send notification containing information of the violation to one or multiple stakeholders (which can be humans and/or automated tools). In response to the notification of the violation, the at least one stakeholder can respond with approval or disapproval of the violation. In the case of multiple stakeholders, approval can be based on a predefined combination of positive indications received from the multiple stakeholders approving of the violation. For example, the predefined combination of stakeholders can be a majority of the stakeholders. Alternatively, the predefined combination can be (1) any of the multiple stakeholders, (2) all of the multiple stakeholders, or (3) a majority of a quorum of the multiple stakeholders.
If approval is received from the at least one stakeholder regarding the violation, that indication is provided from the approval engine 218 back to the policy-based rule engine 210, which can implement the satisfied action 220. In case approval from any particular one of multiple stakeholders is no longer relevant (for instance, the majority of stakeholders have already rejected the violation or the majority has already approved), the remaining stakeholder(s) (who have not yet provided their approval or disapproval) can be notified that the remaining stakeholder(s) no longer have to provide their approval.
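The approval combinations and the early notification of remaining stakeholders can be sketched as a small voting function. The names `Mode`, `decide`, and `pending` are hypothetical, and the quorum-based variant is omitted for brevity; the function returns a decision as soon as the outstanding votes can no longer change the outcome.

```python
from enum import Enum
from typing import Dict, List, Optional


class Mode(Enum):
    ANY = "any"            # one approval is enough
    ALL = "all"            # every stakeholder must approve
    MAJORITY = "majority"  # more than half must approve


def decide(votes: Dict[str, Optional[bool]], mode: Mode) -> Optional[bool]:
    """votes maps stakeholder -> True (approve), False (disapprove), or
    None (no response yet). Returns True or False once the outcome is
    fixed, or None while it still depends on outstanding votes."""
    total = len(votes)
    yes = sum(v is True for v in votes.values())
    no = sum(v is False for v in votes.values())
    if mode is Mode.ANY:
        needed = 1
    elif mode is Mode.ALL:
        needed = total
    else:
        needed = total // 2 + 1
    if yes >= needed:
        return True
    if total - no < needed:  # too few possible approvals remain
        return False
    return None


def pending(votes: Dict[str, Optional[bool]]) -> List[str]:
    """Stakeholders who have not yet responded."""
    return [s for s, v in votes.items() if v is None]
```

Once `decide` returns a value other than `None`, the stakeholders listed by `pending` are the ones who can be told that their response is no longer needed.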
As further depicted in
Correlation information can be provided to specify relationships between CI(s). The change analysis engine 224 is able to access the CI that is the subject of the change request, along with any other CI that is related to the CI that is the subject of the change request. The assessment by the change analysis engine 224 identifies the CI(s) that would be affected by the change request, the probability of the impact, and/or the severity of the impact. For example, attribute(s) of a change request can indicate the component(s) of an IT infrastructure requested to be changed. For example, such a component change can include installing a program patch on a server. The CI for the server can indicate what other component(s) (associated with other CIs) would be affected if the server were to go down to install the program patch. Such other component(s) can include application(s), user(s), other server(s), and so forth. CIs can be stored in a database 226.
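Finding the CI(s) affected by a change amounts to following the correlation links outward from the changed CI. The sketch below assumes the correlation information is available as a mapping from each CI to the CIs that depend on it; the CI names and the `affected_cis` function are hypothetical.

```python
from collections import deque
from typing import Dict, List

# Hypothetical correlation data: each CI maps to the CIs that depend on it.
DEPENDENTS: Dict[str, List[str]] = {
    "server-1": ["app-billing", "app-wiki"],
    "app-billing": ["team-finance"],
    "app-wiki": [],
    "team-finance": [],
}


def affected_cis(changed_ci: str, dependents: Dict[str, List[str]]) -> List[str]:
    """Breadth-first walk over dependency links. Returns every CI reachable
    from the changed CI, in visit order, excluding the changed CI itself."""
    seen = {changed_ci}
    order: List[str] = []
    queue = deque([changed_ci])
    while queue:
        ci = queue.popleft()
        for dep in dependents.get(ci, []):
            if dep not in seen:  # guards against cycles in the CI graph
                seen.add(dep)
                order.append(dep)
                queue.append(dep)
    return order
```

For the sample data, patching `server-1` reaches both applications and, through the billing application, the finance users.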
The change analysis engine 224 can produce a data structure that identifies CI(s) to be affected by the change request. The data structure can be in the form of an impact graph (or other structure), for example, which depicts links between the requested change and the respective CI(s). Risk calculation determines the probability of failure and potential damage, which can be based on a predefined risk function that considers various factors. The factors can include the specific CI(s) impacted, relationship of the specific CI(s) to other CI(s), the severity level and the probability of the impact, and other configurable parameters relating to the requested change. The result of the risk calculation is a measurable score level to distinguish between low risk, medium risk, or high risk. For example, a particular server going down to perform installation of a program update can cause a critical application to go down during certain time periods, which would be considered a high risk policy violation.
In some implementations, exception handling (214) may be implemented for change process transitions that are considered to be high risk, with exception handling not triggered for change transitions that are low or medium risk. Thus, in such implementations, the policy-based rule engine 210 would not invoke exception handling 214 for change process transitions that may violate a policy, but where the risk is considered low or medium. By invoking exception handling for just change process transitions that are considered to be high risk, the amount of exception handling performed by the system can be reduced, thereby reducing the overall load on the system in processing change requests. More generally, exception handling can be invoked for change process transitions that are associated with scores that exceed a particular threshold; exception handling is not invoked for change process transitions that do not exceed the particular threshold. A score “exceeding” a threshold refers to the score being greater or less than the threshold, depending on the implementation.
By employing the change process management according to some implementations, change process times can be reduced and be made more reliable. Human intervention can be reduced such that human errors resulting from such human intervention can be reduced. Also, by reducing human intervention, workforce efforts for managing change processes can be reduced, which can result in reduced workforce costs and improved change process throughput.
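The description leaves the risk function open, so the following is only one possible choice, stated as an assumption: each impacted CI carries a probability (0 to 1) and a severity (1 to 5), the score is the largest probability-weighted severity, and the cut-off values separating low, medium, and high risk are hypothetical configuration parameters.

```python
from typing import List, Tuple

# Each impact is (CI name, probability of impact 0..1, severity 1..5).
Impact = Tuple[str, float, int]


def risk_score(impacts: List[Impact]) -> float:
    """Assumed risk function: the highest probability-weighted severity
    among the impacted CIs, or 0.0 when nothing is impacted."""
    return max((prob * sev for _, prob, sev in impacts), default=0.0)


def risk_level(score: float, medium_at: float = 1.5, high_at: float = 3.0) -> str:
    """Map a score onto low / medium / high using configurable cut-offs."""
    if score >= high_at:
        return "high"
    if score >= medium_at:
        return "medium"
    return "low"
```

Under these assumptions, a change that is very likely to take down a critical application rates as high risk, while a change that might briefly affect a low-severity application rates as low risk.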
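Because a score can "exceed" a threshold in either direction, the gating decision can take the comparison direction as a configuration choice. In this sketch, `make_gate` and the `higher_is_riskier` flag are hypothetical names.

```python
import operator
from typing import Callable


def make_gate(threshold: float,
              higher_is_riskier: bool = True) -> Callable[[float, bool], bool]:
    """Build a predicate that decides whether exception handling is invoked.
    The comparison direction depends on how the implementation scores risk."""
    compare = operator.gt if higher_is_riskier else operator.lt

    def needs_exception_handling(score: float, violated: bool) -> bool:
        # Only policy violations whose score exceeds the threshold are escalated.
        return violated and compare(score, threshold)

    return needs_exception_handling
```

A gate built with `make_gate(3.0)` escalates a violation scored 4.5 but lets a violation scored 2.0 proceed without exception handling.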
Mechanisms or techniques according to some implementations can be implemented in a system such as a system 300 depicted in
Machine-readable instructions of various modules described above (including 206, 210, 216, 218, and 224 of
Data and instructions are stored in respective storage devices, which are implemented as one or more computer-readable or machine-readable storage media. The storage media include different forms of memory including semiconductor memory devices such as dynamic or static random access memories (DRAMs or SRAMs), erasable and programmable read-only memories (EPROMs), electrically erasable and programmable read-only memories (EEPROMs) and flash memories; magnetic disks such as fixed, floppy and removable disks; other magnetic media including tape; optical media such as compact disks (CDs) or digital video disks (DVDs); or other types of storage devices. Note that the instructions discussed above can be provided on one computer-readable or machine-readable storage medium, or alternatively, can be provided on multiple computer-readable or machine-readable storage media distributed in a large system having possibly plural nodes. Such computer-readable or machine-readable storage medium or media is (are) considered to be part of an article (or article of manufacture). An article or article of manufacture can refer to any manufactured single component or multiple components. The storage medium or media can be located either in the machine running the machine-readable instructions, or located at a remote site from which machine-readable instructions can be downloaded over a network for execution.
In the foregoing description, numerous details are set forth to provide an understanding of the subject disclosed herein. However, implementations may be practiced without some or all of these details. Other implementations may include modifications and variations from the details discussed above. It is intended that the appended claims cover such modifications and variations.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US2011/027648 | 3/9/2011 | WO | 00 | 8/27/2013 |