The entire contents of India Application No. 3352/MUM/2014, filed on Nov. 20, 2014 are incorporated herein by reference.
This disclosure relates generally to computing devices, and more particularly to methods and systems for transition of information technology (IT) operations from one service provider to another service provider.
Information Technology (IT) of today's businesses is managed by IT service providers. These service providers perform a wide variety of tasks to ensure smooth running of business operations, such as monitoring applications and infrastructure, attending to service requests from users, managing changes, and debugging performance problems, among others. The service providers perform these operations through teams of resolvers. These teams consist of a mix of people of different competencies and different levels of expertise. These teams understand very well the business operations, the dependency of the business on the underlying IT, and the frequently seen problems and their fixes. In the event of a transition, where a new service provider takes over the operations from the existing service provider, capturing all this knowledge is a daunting task. As a result, it is critical to design a crisp and comprehensive plan for transition that ensures a complete, risk-free, and timely transition.
The inventors here have recognized several technical problems with such conventional systems, as explained below. A majority of existing solutions for transition of IT operations rely on a manual, intuition-driven and experience-centric approach. Teams from the two service providers work together on the process of knowledge transfer. The existing team identifies important knowledge items for transition and trains the new team. The new team identifies domain experts that can take the transition.
However, the new team has no or minimal knowledge of activities performed by the existing team before transition. This approach is ad-hoc and entirely relies on the human expertise. It is risky and cannot cater to today's scale and rapidly changing environments.
Prior art literature has illustrated various experience-centric approaches for transition of IT operations; however, an analytics-centric approach is still an unexplored dimension of transition planning.
Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one embodiment, a processor-implemented method for transition of Information Technology (IT) operations is provided.
Before the present methods, systems, and hardware enablement are described, it is to be understood that this invention is not limited to the particular systems, and methodologies described, as there can be multiple possible embodiments of the present invention which are not expressly illustrated in the present disclosure. It is also to be understood that the terminology used in the description is for the purpose of describing the particular versions or embodiments only, and is not intended to limit the scope of the present invention which will be limited only by the appended claims.
The present application provides a computer implemented method for transition of Information Technology (IT) operations from one service provider to another service provider. The method comprises identifying, by a processor, one or more issues which occur during the transition of IT operations from one service provider to another service provider, and identifying, by the processor, one or more heavy hitter issues from the one or more issues. In one embodiment the one or more heavy hitter issues are issues which cover a maximum workload volume of IT operations. The processor is further configured to identify risk associated with the one or more issues by determining severity of the one or more issues. In an embodiment the severity of the one or more issues is based on the instability caused by the issue or the penalties associated with the issues. Further, the processor is configured to identify one or more similar issue communities. In an embodiment the one or more similar issue communities are determined by computing a similarity coefficient and constructing an issue similarity graph. The processor is further configured to identify an optimal team size of resolvers. In an embodiment the optimal team size is determined by profiling the type of activities performed by the resolvers. Further, the processor is configured to implement a transition plan, wherein in an embodiment the transition plan is derived by analyzing the identified heavy hitter issues, the risk associated with the one or more issues, the one or more similar issue communities and the optimal team size of resolvers for transition of IT services.
In yet another embodiment, the present application discloses a system for efficient transition of Information Technology (IT) operations. The system comprises a processor and a memory comprising an issue identification module configured to identify, one or more issues which occur during the transition of IT operations from service provider to another service provider. The system further comprises a coverage maximization module configured to identify, one or more heavy hitter issues from the one or more issues. In an embodiment the one or more heavy hitter issues are issues which cover a maximum workload volume of IT operations. The system further comprises a risk minimization module configured to identify, risk associated with the one or more issues, by determining severity of the one or more issues. In an embodiment the severity of the one or more issues is based on the instability caused by the issue or the penalties associated with the issues. The system further comprises a time minimization module configured to identify one or more similar issue communities. In an embodiment the one or more similar issue communities are determined by computing a similarity coefficient and constructing an issue similarity graph. The system further comprises a cost identification module configured to identify, an optimal team size of resolvers. In an embodiment the optimal team size of resolvers is determined by profiling type of activities performed by the resolvers. Further the system comprises a transition planning module configured to implement a transition plan wherein the transition plan is derived by analyzing identified heavy hitter issues, risk associated with the one or more issues, one or more similar issue communities and optimal team size of resolvers for transition of IT services.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles.
Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the spirit and scope of the disclosed embodiments. It is intended that the following detailed description be considered as exemplary only, with the true scope and spirit being indicated by the following claims.
The present application provides a computer implemented method and system for efficient transition of Information Technology (IT) operations from one service provider to another. More particularly, the system and method facilitate efficient, analytics-based transition of IT operations from one service provider to another.
Referring now to
In one implementation, the network 106 may be a wireless network, a wired network or a combination thereof. The network 106 can be implemented as one of the different types of networks, such as intranet, local area network (LAN), wide area network (WAN), the internet, and the like. The network 106 may either be a dedicated network or a shared network. The shared network represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), and the like, to communicate with one another. Further the network 106 may include a variety of network devices, including routers, bridges, servers, computing devices, storage devices, and the like.
In one embodiment, referring to
In one embodiment, referring to
The present application provides a computer implemented method and system for transition of Information Technology (IT) operations from one service provider to another service provider.
The following detailed description uses certain words which are defined hereunder.
Issue: An issue refers to one specific type of problem addressed by a resolver. Some examples of issues are: file system is full, CPU utilization is high, etc.
Resolver: A resolver refers to personnel in the service provider team that resolves the issue. Different resolvers are of different expertise and experience. Based on the experience, the resolvers are assigned to levels L1, L2, and L3, where L3 resolvers are the most experienced.
Ticket: A ticket refers to one specific instance of an issue. A ticket contains details of the issue, the inventory item, time-stamp, resolver, resolution steps, etc.
Inventory item: An inventory item refers to a hardware or software component where the issue is observed. Some examples of inventory items are specific application, database, server, network switch, etc.
The process of transition needs to address both IT systems as well as human systems. The IT systems involve the inventory of software and hardware components, the issues related to these components, knowledge of resolution of these issues, criticality of components with respect to business operations, service level agreements on performance of applications, etc. The human systems, on the other hand, involve the resolvers, their competencies, their experience, details of shift formations, etc.
In an example, an IT system of a banking application performs end of the day batch operations. Such systems typically consist of a set of batch jobs (applications) hosted in a mainframe environment. Various issues occur in this system such as job failure, job not starting in time, job taking too long to execute, file-system full, database connection failure, etc. These issues are resolved by a team of resolvers. Each issue requires a specific process of resolution e.g. in case where a job is running for a long time then the resolver checks for its root-cause such as unavailability of CPU or memory, job waiting for completion of other jobs, job waiting for third party feeds, etc. Based on the root-cause, the resolver takes appropriate actions to fix the problem such as preempting a resource, ensuring the third-party dependencies are met, etc.
As part of transition, one needs to understand various aspects of this environment such as the business-critical jobs, the issues observed in these jobs, frequent root-causes of these issues, their resolution procedure, etc. During a transition, a new team with domain expertise is employed, however the new team has no or minimal knowledge of activities performed by the existing team before the transition. This approach is ad-hoc, depends on human expertise and may prove risky and ineffective to cater to today's scale and rapidly changing environments.
In accordance with one embodiment of the analytics driven method for transition disclosed herein, the knowledge of the IT and human systems is profiled before transition so that a well-planned and informed transition can take place. A large amount of information about IT operations is logged in various data sources. Mining this information can provide many useful insights about the operations. In an example, all issues generated in an IT system are logged in the form of tickets. Each ticket contains attributes such as a problem description, ticket creation time, severity of the ticket, the IT component where the problem is observed, the resolver that resolved the issue, the time to resolve the issue, details of the problem resolution, etc.
This data may be mined and analytics may be applied to identify one or more heavy hitter issues that cover a majority of the workload volume, to identify one or more high risk issues that should be transitioned first, to identify which issues have similar resolution steps and may be transitioned together, to match resolvers with the issues of their expertise so that the right issues reach the right resolvers for effective resolution, and to estimate the optimal number of people to best meet the workloads and the service level agreements (SLAs).
In an implementation, various entities and relationships involved in the IT operations can be effectively modeled as a graph. For instance, the inventory items, issues generated from these items, and the resolvers resolving these issues can be modeled as nodes of a graph. Edges between nodes indicate various relationships. For instance, an edge between issue and inventory item indicates the source of origin of the issue. An edge between issue and resolver indicates the resolution expertise. Each node and edge can be associated with various attributes such as risk, persistence, recency, volume, among others.
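The graph model described above can be illustrated with a minimal sketch. The class name `OperationsGraph`, the `(kind, name)` node convention, and the sample nodes are assumptions introduced here for illustration only and do not come from the disclosure.

```python
from collections import defaultdict


class OperationsGraph:
    """Small attributed, undirected graph; nodes are (kind, name) pairs."""

    def __init__(self):
        self.node_attrs = {}                # node -> attribute dictionary
        self.adjacency = defaultdict(set)   # node -> set of neighbouring nodes

    def add_node(self, kind, name, **attrs):
        # Attributes such as risk, persistence, recency or volume go here.
        self.node_attrs.setdefault((kind, name), {}).update(attrs)

    def add_edge(self, a, b):
        for node in (a, b):
            self.node_attrs.setdefault(node, {})
        self.adjacency[a].add(b)
        self.adjacency[b].add(a)

    def neighbors(self, node, kind=None):
        # Optionally restrict neighbours to one kind, e.g. only resolvers.
        return sorted(n for n in self.adjacency[node] if kind is None or n[0] == kind)


# Hypothetical sample data.
graph = OperationsGraph()
graph.add_node("issue", "file system full", volume=120)
graph.add_node("inventory", "db-server-01")
graph.add_node("resolver", "R1", level="L2")
# Issue-to-inventory edge: source of origin of the issue.
graph.add_edge(("issue", "file system full"), ("inventory", "db-server-01"))
# Issue-to-resolver edge: resolution expertise.
graph.add_edge(("issue", "file system full"), ("resolver", "R1"))
```

Keeping the node kind inside the node key lets one structure hold inventory items, issues and resolvers while still allowing queries restricted to a single kind.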
In an embodiment various problems of transition planning may be mapped to well-defined problems in graph theory and set theory. In an embodiment these well-defined problems of graph theory and set theory include 1) Maximize coverage, 2) Minimize risk, 3) Minimize time of transition, 4) Minimize cost. A detailed working of solutions for the abovementioned problems is explained in the following paragraphs.
In one embodiment, in order to maximize coverage, typical IT operations involving a variety of issues generated from various IT components such as applications, databases, file systems etc. are considered. In most cases a low percentage of issues (on the order of 20%) covers a high percentage of the workload volume (on the order of 80%), and hence identification of these 20% of issues (referred to as heavy hitter issues) may ensure that maximum workload volume is covered at any given time. Thus, each issue may be evaluated on various criteria such as volume, persistence, and recency, and the evaluations may be combined by using well known Borda count techniques to identify the heavy hitter issues.
In an embodiment the coverage maximization module (210) may be configured to identify heavy hitter issues based on the following three criteria,
1) Volume: Each issue may be assigned a score Cv based on the number of tickets generated for that issue. The issue(s) with the smallest volume is assigned score=1 and subsequent issues are assigned 2, 3 onwards.
2) Persistence: A persistence score Cp is assigned based on the number of days for which the issue occurred.
3) Recency: Issues with recent occurrences are more important than old issues. A recency score Cr is assigned to an issue based on the last timestamp of the occurrence of the issue.
In an embodiment, the coverage maximization module (210) consolidates the scores generated based on all three criteria using the Borda count as per the following equation (1)
$$C_{coverage} = \alpha C_v + \beta C_p + \gamma C_r \qquad (1)$$
where α, β, and γ are the normalization constants. The normalization constants are used to ensure that ranks from all three criteria are in the same range. Further the coverage maximization module (210) may be configured to calculate a consolidated score and identify top issues as heavy hitter issues.
All scores are normalized to a range of 0 to 1, and the consolidated score Ccoverage is computed as an average of the normalized scores. Based on the three criteria of volume, persistence and recency, issue I1 is assigned the highest score of 0.92 because issue I1 is highest in volume and recency and second highest in persistence.
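The scoring on volume, persistence and recency can be sketched as follows. This is a minimal illustration that assumes each normalization constant is the reciprocal of the largest rank for its criterion and that ties share a rank; the function names and the sample figures are hypothetical and are not the values behind the 0.92 score mentioned above.

```python
def borda_ranks(values):
    """Rank 1 for the smallest value, increasing from there; ties share a rank."""
    ordered = sorted(set(values.values()))
    rank_of = {value: position + 1 for position, value in enumerate(ordered)}
    return {issue: rank_of[value] for issue, value in values.items()}


def coverage_scores(volume, persistence, recency):
    """Average of the three per-criterion ranks, each normalized into (0, 1]."""
    criteria = [borda_ranks(volume), borda_ranks(persistence), borda_ranks(recency)]
    scores = {}
    for issue in volume:
        normalized = [ranks[issue] / max(ranks.values()) for ranks in criteria]
        scores[issue] = sum(normalized) / len(normalized)
    return scores


# Hypothetical data: ticket counts, days on which the issue occurred,
# and day number of the last occurrence (larger means more recent).
volume = {"I1": 500, "I2": 300, "I3": 50}
persistence = {"I1": 20, "I2": 25, "I3": 5}
recency = {"I1": 30, "I2": 28, "I3": 10}

scores = coverage_scores(volume, persistence, recency)
heavy_hitters = sorted(scores, key=scores.get, reverse=True)
```

Sorting by the consolidated score and keeping the top entries yields the candidate heavy hitter issues; how many to keep is a planning decision outside this sketch.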
In an embodiment, different issues have different levels of risk; for example, a critical application issue can lead to higher penalties. In order to identify issues with higher risk, the risk minimization module (212) is configured to: 1) define the risk of an entity based on the number of entities referring to it. For instance, an issue generated from many applications should be considered riskier than one generated from fewer applications. The risk minimization module (212) is further configured to: 2) define the risk of an entity based on the risk of the entities referring to it. For instance, an issue generated from high-risk or critical applications should be considered riskier than one generated from low-risk applications.
In an embodiment, the above mentioned approach may be implemented by configuring the risk minimization module (212) to first create a graph of all issues and their related entities. An entity may be any application, human resource, location, host etc. connected to an issue, wherein one entity may be connected to one or more issues.
The risk minimization module (212) may further be configured to construct an issue graph such that: a node is created for each issue; for each issue, nodes are created for all relevant entities; edges are created between an issue and its related entities; and edges are created between related entities. In one example, an edge between a location and a host refers to the location where the host machine is deployed.
In an embodiment, in order to minimize risk during transition of IT operations, risk is computed for each entity. The risk of some entities may be computed based on domain knowledge. For example, business critical jobs may be assigned a higher risk value, and locations may be assigned risk scores based on the business needs. Similarly, severity levels of high, medium, and low are assigned predefined risk values.
Further risk of some entities may be derived from other entities. Also risk of an issue may be derived based on the entities associated with the issue.
In an embodiment, the risk minimization module (212) is configured to distinguish two types of entities. The first type comprises entities for which a large number of instances indicates high risk, i.e., an issue coming from a large number of applications is riskier than one coming from fewer applications. The risk minimization module (212) is configured to compute the risk of such an entity E for issue i based on equation (2)
Further, the risk minimization module (212) is configured to identify entities for which fewer instances indicate higher risk (e.g., resolvers: an issue known to very few resolvers is riskier than an issue known to all the resolvers). The risk minimization module (212) is configured to compute the risk of such an entity E for issue i based on equation (3):
Further, the risk minimization module (212) is configured to compute the final risk value for an issue as the sum of the normalized risks calculated for all of the entity types mentioned above, by using equation (4).
$$K_i = \sum_{E \in Entities} K_i^E \qquad (4)$$
In the above mentioned equations (2), (3) and (4), E(i) refers to the instances of entity E connected to issue i, and E(all) refers to all instances of that entity. K(e) refers to the risk associated with instance e. Equation (2) is thus the ratio of the sum of risk of all instances associated with issue i to the sum of risk of all instances.
As a further example, consider calculating the risk where the issue is resolved by only 1 resolver out of 5 resolvers. Assuming a risk of 1 for all resolvers, the resolver risk is calculated as per equation (6)
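The risk computation can be sketched as below under stated assumptions. The ratio form follows the description of E(i), E(all) and K(e) given above; taking one minus that ratio for entities where fewer instances mean higher risk is an assumption made here, chosen because it matches the qualitative behaviour described for resolvers. All function names and sample values are hypothetical.

```python
def share_of_risk(linked, all_instances, risk):
    """Sum of risk over instances linked to the issue, divided by the sum over all instances."""
    total = sum(risk[e] for e in all_instances)
    return sum(risk[e] for e in linked) / total


def entity_risk(linked, all_instances, risk, fewer_is_riskier=False):
    """Ratio form by default; complement of the ratio when fewer instances mean higher risk (assumed)."""
    share = share_of_risk(linked, all_instances, risk)
    return 1.0 - share if fewer_is_riskier else share


def issue_risk(entity_risks):
    """Final risk of an issue: sum of its per-entity normalized risks."""
    return sum(entity_risks.values())


# Hypothetical data: five resolvers with a risk of 1 each, three applications
# with domain-assigned risk values.
resolver_risk_values = {r: 1.0 for r in ["R1", "R2", "R3", "R4", "R5"]}
application_risk_values = {"payroll": 3.0, "billing": 2.0, "reports": 1.0}

risks = {
    # Issue known to 1 of 5 resolvers.
    "resolver": entity_risk(["R1"], resolver_risk_values, resolver_risk_values,
                            fewer_is_riskier=True),
    # Issue generated from two of the three applications.
    "application": entity_risk(["payroll", "billing"], application_risk_values,
                               application_risk_values),
}
total_risk = issue_risk(risks)
```

Under these assumptions, an issue known to a single resolver out of five scores higher on resolver risk than an issue known to all five, which would score zero.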
In an embodiment, in order to minimize IT operations transition time, the time minimization module (214) is configured to identify (a) similar issues for simultaneous transition, and (b) communities of dissimilar issues for parallel transition. The time minimization module (214) is configured to identify two issues as similar if the issues have similar resolution steps, are generated from the same set of inventory items, are resolved by the same set of resolvers, or always co-occur.
In order to identify similarity between issues on the basis of an attribute, in an embodiment an attribute AV and two issues Ii and Ij may be considered in an IT environment such that the values of AV for Ii and Ij are AVi and AVj. The time minimization module (214) is configured to compute a similarity coefficient between the two sets of attribute values by using approaches such as the Jaccard coefficient, the Dice coefficient and the like. In an embodiment, the Dice coefficient is used, in which the elements common to both sets are given twice the weight of the other elements. The Dice coefficient between two sets AVi and AVj is computed as per equation (7)
$$Dice(AV_i, AV_j) = \frac{2\,|AV_i \cap AV_j|}{|AV_i| + |AV_j|} \qquad (7)$$
In an embodiment, the time minimization module (214) is configured to construct an issue similarity graph in order to identify communities. For constructing the issue similarity graph, a node is constructed for each issue and an edge is inserted between two issues when their similarity coefficient is greater than a predefined threshold. The issue similarity graph is analyzed to identify communities of similar issues.
In an embodiment, the time minimization module (214) may build communities using connected components, strongly connected components or cliques. Identification of maximum cliques in the issue-similarity graph is proposed in accordance with the present subject matter. That is, maximum cliques ensure the maximum number of similar issues.
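The similarity computation and graph construction can be sketched as follows. The Dice coefficient is used in its standard set form; the function names, the threshold value and the sample resolution steps are assumptions for illustration.

```python
from itertools import combinations


def dice(a, b):
    """Standard Dice coefficient between two sets: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def similarity_graph(attribute_values, threshold):
    """One node per issue; an edge when the coefficient exceeds the threshold."""
    graph = {issue: set() for issue in attribute_values}
    for i, j in combinations(attribute_values, 2):
        if dice(attribute_values[i], attribute_values[j]) > threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph


# Hypothetical attribute: the set of resolution steps recorded for each issue.
resolution_steps = {
    "I1": {"check disk", "purge logs", "restart job"},
    "I2": {"check disk", "purge logs"},
    "I3": {"reset password"},
}
issue_graph = similarity_graph(resolution_steps, threshold=0.5)
```

The same two functions apply to any set-valued attribute, such as the inventory items an issue originates from or the resolvers who handled it.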
Due to the compute intensive nature of exploration of cliques, in an embodiment, the time minimization module (214) may be configured to identify the potential size of cliques by using the following set of rules derived from graph theory and set theory.
The size of the maximum clique, sizemaxClique, cannot be larger than (degreemax+1), where degreemax is the maximum degree of the graph.
The graph must have at least x nodes with degree of at least (x−1) to contain a clique of size x, i.e., in an example, a clique of size 5 can exist only if there exist at least 5 nodes with degree of at least 4.
In an embodiment, the time minimization module (214) may be configured such that once the potential size of the cliques present in the graph is identified, the issues forming the maximum clique are identified and removed from the graph, and the process is repeated iteratively on the reduced graph. Each clique identified by the iterative process is a community of similar issues.
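The degree-based bound and the iterative clique removal can be sketched as follows. This is an exhaustive search suitable only for small graphs and is not presented as the disclosed implementation; the function names and the sample graph are hypothetical.

```python
from itertools import combinations


def clique_size_bound(graph):
    """Largest x such that at least x nodes have degree >= x - 1."""
    degrees = sorted((len(nbrs) for nbrs in graph.values()), reverse=True)
    bound = 0
    for x, degree in enumerate(degrees, start=1):
        if degree >= x - 1:
            bound = x
        else:
            break
    return bound


def find_max_clique(graph):
    """Try clique sizes from the bound downwards, using only nodes of sufficient degree."""
    nodes = sorted(graph)
    for size in range(clique_size_bound(graph), 0, -1):
        candidates = [n for n in nodes if len(graph[n]) >= size - 1]
        for group in combinations(candidates, size):
            if all(b in graph[a] for a, b in combinations(group, 2)):
                return set(group)
    return set()


def clique_communities(graph):
    """Repeatedly extract a maximum clique and remove its nodes from the graph."""
    remaining = {n: set(nbrs) for n, nbrs in graph.items()}
    communities = []
    while remaining:
        clique = find_max_clique(remaining)
        communities.append(clique)
        remaining = {n: nbrs - clique for n, nbrs in remaining.items() if n not in clique}
    return communities


# Hypothetical issue-similarity graph: a triangle A-B-C, a pendant D, an isolated E.
similarity = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
    "E": set(),
}
communities = clique_communities(similarity)
```

The bound only prunes the search; it is a necessary condition, so a graph can satisfy it for size x without containing a clique of that size, which is why each candidate group is still verified edge by edge.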
Referring to
In order to minimize cost of the IT operations transition, in an embodiment of the subject matter disclosed herein, a cost minimization module (216) may be configured to identify the right size of the resolver team for minimizing cost. In an embodiment, the cost minimization module (216) may be configured such that the effort required for resolution of an issue is estimated by using historically logged attributes of issues. In an embodiment this logged information may include information from tickets logged for resolution of each issue in the IT operations system. The effort required to resolve an issue may be used for computing the amount of time a resolver needs to resolve an issue and the minimum number of resolvers required to resolve an issue.
In an embodiment, the cost reduction module (216) is configured to employ a Minimum Bin packing algorithm to compute the minimum number of resolvers for the issues, wherein the Minimum Bin packing problem may be defined as follows: given a bin “S” of size “V” and a list of n issues with sizes a1, . . . , an, determine the smallest number of bins “B” and a B-partition S1, . . . , SB of the set {1, . . . , n} such that the sum of ai over all i in Sk is at most V for all k=1, . . . , B.
In an embodiment, resolvers may spend ei effort on an issue Ii and work for H hours, such that ei is mapped to the size ai of an issue and the H hours are mapped to the bin size V. Further, minimal bin packing is applied to identify the smallest number of bins in which all the issues can be packed. The number of bins so computed provides the smallest number of resolvers that can cater to the workload of the issues.
Further, in another embodiment, computing the minimum number of resolvers may cater to additional constraints such as resolver redundancy, i.e., backup resolvers may be made available for important issues by making sure that resolution of an issue is known to “k” resolvers. For this, the cost reduction module may be configured to split an issue Ii with effort ei into “k” sub-issues, each with the effort eki, and these sub-issues are packed into bins while ensuring that a same issue is not assigned to a same resolver twice.
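The mapping of efforts to items and working hours to bin size can be sketched with the first-fit decreasing heuristic. This heuristic is swapped in here for illustration: it yields a valid packing but does not always reach the true minimum number of bins. The function name and the effort figures are assumptions.

```python
def first_fit_decreasing(efforts, capacity):
    """Pack issue efforts into bins of the given capacity; each bin is one resolver."""
    bins = []  # each bin: {"load": total effort, "issues": [issue names]}
    ordered = sorted(efforts.items(), key=lambda item: item[1], reverse=True)
    for issue, effort in ordered:
        if effort > capacity:
            raise ValueError(f"issue {issue!r} needs more effort than one resolver has")
        for current in bins:
            if current["load"] + effort <= capacity:
                current["load"] += effort
                current["issues"].append(issue)
                break
        else:
            # No existing resolver has room, so another resolver is needed.
            bins.append({"load": effort, "issues": [issue]})
    return bins


# Hypothetical efforts e_i in hours per issue, and H = 8 working hours per resolver.
efforts = {"I1": 5, "I2": 4, "I3": 3, "I4": 2, "I5": 2}
team = first_fit_decreasing(efforts, capacity=8)
team_size = len(team)
```

For this sample the total effort is 16 hours, so two resolvers of 8 hours each is also the lower bound, and the heuristic happens to reach it.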
In yet another embodiment, the additional constraint may include that a resolver can only be assigned a predetermined maximum number of issues. The assignment of a predefined maximum number of issues constraint may be implemented in an embodiment by limiting the maximum number of issues that may be packed in a bin.
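The two additional constraints can be sketched by extending a first-fit decreasing heuristic. The sketch assumes that each of the k sub-issues carries an equal share of the issue's effort (e_i divided by k), which is an interpretation made here and not stated in the text; the function name, parameters and data are hypothetical.

```python
def pack_with_constraints(efforts, capacity, k=1, max_issues=None):
    """First-fit decreasing with k-fold redundancy and an optional cap on issues per resolver."""
    # Assumption: an issue of effort e is split into k sub-issues of effort e / k each.
    items = [(issue, effort / k) for issue, effort in efforts.items() for _ in range(k)]
    items.sort(key=lambda item: item[1], reverse=True)
    bins = []
    for issue, effort in items:
        if effort > capacity:
            raise ValueError(f"sub-issue of {issue!r} exceeds resolver capacity")
        for current in bins:
            fits = current["load"] + effort <= capacity
            distinct = issue not in current["issues"]   # same issue never twice per resolver
            has_room = max_issues is None or len(current["issues"]) < max_issues
            if fits and distinct and has_room:
                current["load"] += effort
                current["issues"].append(issue)
                break
        else:
            bins.append({"load": effort, "issues": [issue]})
    return bins


# Hypothetical data: every issue must be known to k = 2 resolvers,
# and no resolver may hold more than 2 issues.
efforts = {"I1": 4, "I2": 4, "I3": 2}
team = pack_with_constraints(efforts, capacity=8, k=2, max_issues=2)
```

Both constraints are expressed as extra conditions on whether an item may enter a bin, so further rules can be added in the same place.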
In an exemplary embodiment of the disclosed subject matter shown in
In an embodiment of the disclosed subject matter, the transition planning module (218) is configured to consolidate the results computed and estimated by the other modules for the purposes of maximizing coverage, minimizing risk, minimizing time and minimizing cost, in order to generate a transition plan.
Although in an ideal setting each community of dissimilar issues identified for minimizing time may be transitioned in parallel, i.e., be handled by a set of resolvers who are expert in the issues, in an embodiment wherein the number of resolvers is limited it may not be possible to transition every community in parallel.
In an embodiment where the number of resolvers is limited, the transition planning module (218) may be configured to identify the smallest set of resolvers (minimum resolver set) that can give transition for all issues in a community. The minimum resolver set may be identified by reducing the problem to a Minimum hitting set problem wherein, for a collection C of subsets of a finite set S, a hitting set H⊆S of least cardinality for C is identified. A set H⊆S is a hitting set of C if it contains at least one element from each subset in C. A set of resolvers may be constructed for each issue, where the set contains the resolvers who are expert in resolving that issue.
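The reduction to a minimum hitting set can be sketched with an exhaustive search, which is practical only for small resolver pools since the problem is NP-hard in general. The function name and the expert sets are hypothetical.

```python
from itertools import combinations


def minimum_hitting_set(subsets):
    """Smallest set containing at least one element of every subset, or None if impossible."""
    subsets = [set(s) for s in subsets]
    universe = sorted(set().union(*subsets))
    for size in range(len(universe) + 1):
        for candidate in combinations(universe, size):
            chosen = set(candidate)
            if all(chosen & s for s in subsets):
                return chosen
    return None  # only reached when some subset is empty


# Hypothetical community of three issues, each with its set of expert resolvers.
experts_per_issue = [
    {"R1", "R2"},   # experts for issue I1
    {"R2", "R3"},   # experts for issue I2
    {"R2", "R4"},   # experts for issue I3
]
minimum_resolver_set = minimum_hitting_set(experts_per_issue)
```

Here a single resolver is expert in all three issues, so the minimum resolver set for the community contains only that resolver.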
In an embodiment,
Further, in another embodiment, the minimum hitting set may be identified such that maximum parallelization of transition is ensured, i.e., the resolver sets of the communities may be identified such that the resolver sets have the least overlap with each other. In an embodiment, in order to achieve maximum parallelization, a weighted minimum hitting set may be computed such that the weight of each resolver is initialized to 0. The weight of each resolver that is selected for a community is then increased and, while selecting the minimum hitting set, a set of resolvers with the least weight is selected. The weighted minimum hitting set may be computed such that the communities are ranked based on the rank of the coverage and risk of their constituent issues. The community with the highest rank is identified and a weighted minimum resolver set is selected for it. The weight of each resolver in the selected set is increased, and the steps are repeated until all communities are covered.
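The weighted selection can be sketched as follows. Among all hitting sets of least cardinality, the sketch picks the one whose resolvers carry the least accumulated weight, breaking ties alphabetically; the tie-breaking rule, the weight increment of 1, the function names and the data are assumptions.

```python
from itertools import combinations


def least_weight_hitting_set(subsets, weight):
    """Among minimum-cardinality hitting sets, return the one with the least total weight."""
    subsets = [set(s) for s in subsets]
    universe = sorted(set().union(*subsets))
    for size in range(len(universe) + 1):
        hitting = [set(c) for c in combinations(universe, size)
                   if all(set(c) & s for s in subsets)]
        if hitting:
            return min(hitting, key=lambda h: (sum(weight.get(r, 0) for r in h), sorted(h)))
    raise ValueError("some issue has no expert resolver")


def assign_resolvers(ranked_communities):
    """Process communities in rank order, raising the weight of every selected resolver."""
    weight = {}
    assignment = {}
    for name, expert_sets in ranked_communities:
        chosen = least_weight_hitting_set(expert_sets, weight)
        assignment[name] = chosen
        for resolver in chosen:
            weight[resolver] = weight.get(resolver, 0) + 1
    return assignment


# Hypothetical communities, already ordered by their coverage and risk rank.
ranked = [
    ("C1", [{"R1", "R2"}]),
    ("C2", [{"R1", "R2"}]),
    ("C3", [{"R2", "R3"}, {"R3", "R4"}]),
]
assignment = assign_resolvers(ranked)
```

Because the first community raises the weight of the resolver it uses, the second community is steered to a different resolver, which keeps the two resolver sets disjoint.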
Further in another embodiment, once the communities of issues and the required set of resolvers are identified the groups of communities which do not require any common resolvers are identified next so that each such group can take transition in parallel and hence can form a single stage in the transition plan.
Further in an embodiment, the various stages of transition of IT operations may be identified such that a graph with each node as a community and an edge representing one or more common resolvers between the communities is constructed. The disconnected communities are identified by identifying a minimum vertex cover within the constructed graph. Any communities not forming the vertex cover may be identified as independent and may form a single stage of transition. The communities identified as single stage of transition are removed from the community graph, and the process of identifying disconnected communities is repeated until all the communities are covered.
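The staging procedure can be sketched with an exhaustive minimum vertex cover, which is adequate only for a small number of communities. The function names and the resolver sets are hypothetical.

```python
from itertools import combinations


def minimum_vertex_cover(nodes, edges):
    """Smallest set of nodes touching every edge (exhaustive search)."""
    nodes = sorted(nodes)
    for size in range(len(nodes) + 1):
        for candidate in combinations(nodes, size):
            cover = set(candidate)
            if all(a in cover or b in cover for a, b in edges):
                return cover
    return set(nodes)


def transition_stages(resolver_sets):
    """Communities outside the vertex cover share no resolvers and form one stage."""
    remaining = dict(resolver_sets)
    stages = []
    while remaining:
        # Edge between two communities when they need a common resolver.
        edges = [(a, b) for a, b in combinations(sorted(remaining), 2)
                 if remaining[a] & remaining[b]]
        cover = minimum_vertex_cover(remaining, edges)
        stage = sorted(set(remaining) - cover)
        stages.append(stage)
        for community in stage:
            del remaining[community]
    return stages


# Hypothetical resolver sets selected for three communities.
resolver_sets = {
    "C1": {"R1", "R2"},
    "C2": {"R2", "R3"},
    "C3": {"R4"},
}
stages = transition_stages(resolver_sets)
```

The complement of a vertex cover is an independent set, so no two communities placed in the same stage require a common resolver, and each stage can proceed in parallel.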
Further in another embodiment, a rank of each transition stage is computed based on the rank of coverage and risk of each community within a stage. This rank is used to prioritize stages for transition such that preference may be given to a community with higher coverage and risk.
In an embodiment illustrated in
Although embodiments for methods and systems for the present subject matter have been described in a language specific to structural features and/or methods, it is to be understood that the present subject matter is not necessarily limited to the specific features or methods described. Rather, the specific features and methods are disclosed as exemplary embodiments for the present subject matter.
The process starts at 1002, wherein coverage maximization is performed. Coverage maximization ensures that most, if not all, issues are covered by identifying heavy hitter issues, i.e., issues that cover a high volume of the workload, based on various criteria and by implementing the Borda count.
At 1004, issues which are not heavy hitter issues but have a high risk associated with them are identified by computing the risk associated with all the issues. Risk values may be propagated by making use of the PageRank algorithm, and high risk issues are then identified and prioritized.
At 1006, in order to minimize the time for transition of IT operations, a similarity between issues is computed and issue similarity graphs are constructed. The communities of issues are then identified by identifying the maximum cliques in the graph. Thereafter, similar issues are transitioned simultaneously while dissimilar issue communities undergo parallel transition.
At 1008, resolvers are profiled based on the type of issues resolved by them and an optimal team size is calculated in order to minimize cost, by solving the Minimum Bin packing problem. Further, the Minimum Bin packing solution may be customized to introduce constraints such as ensuring redundancy among resolvers, avoiding overloading a resolver with too many tasks, etc.
The process ends at 1010, wherein the recommendations from the four objectives are combined by using a transition planner. The transition planner carefully plans transition tracks and prioritizes the issues based on the recommendations generated at the previous steps. Further, minimum hitting set and minimum vertex cover problems may be solved for constructing multiple sub-stages of the transition plan.
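Step 1004 mentions propagating risk values with the PageRank algorithm without giving details. The following is a minimal sketch of one possible reading, a personalized PageRank-style power iteration in which domain-assigned risk serves as the prior. The direction of the edges, the damping factor, the handling of nodes without outgoing references, and all names and data are assumptions.

```python
def propagate_risk(references, base_risk, damping=0.85, iterations=100):
    """references[a] lists the nodes that a refers to; risk flows from a to those nodes."""
    nodes = sorted(set(references)
                   | {t for targets in references.values() for t in targets}
                   | set(base_risk))
    total = sum(base_risk.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("at least one node needs a positive base risk")
    prior = {n: base_risk.get(n, 0.0) / total for n in nodes}
    risk = dict(prior)
    for _ in range(iterations):
        incoming = {n: 0.0 for n in nodes}
        for source in nodes:
            targets = references.get(source, [])
            if targets:
                share = risk[source] / len(targets)
                for target in targets:
                    incoming[target] += share
            else:
                # Nodes without outgoing references hand their risk back by the prior.
                for n in nodes:
                    incoming[n] += risk[source] * prior[n]
        risk = {n: (1 - damping) * prior[n] + damping * incoming[n] for n in nodes}
    return risk


# Hypothetical data: applications refer to the issues they generate;
# base risk comes from domain knowledge about application criticality.
references = {"app_critical": ["I1"], "app_other": ["I1"], "app_minor": ["I2"]}
base_risk = {"app_critical": 3.0, "app_other": 1.0, "app_minor": 1.0}
propagated = propagate_risk(references, base_risk)
```

In this sample, issue I1 is referred to by more applications and by a more critical one than issue I2, so it ends with the higher propagated risk, reflecting both risk rules stated for the risk minimization module.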
It will be appreciated by a person skilled in the art that although the method of
The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
It is intended that the disclosure and examples be considered as exemplary only, with a true scope and spirit of disclosed embodiments being indicated by the following claims.
Number | Date | Country | Kind |
---|---|---|---|
3352/MUM/2014 | Nov 2014 | IN | national |