Embodiments of the invention generally relate to information technology, and, more particularly, to incident management.
Service requests can be reported by clients at a service desk through a web-based interface, electronic mail (e-mail), phone, etc. If the person staffing the service desk cannot satisfy the request or cannot find an existing incident that duplicates it, an incident ticket is created. Incidents affect the normal running of an organization's information technology (IT) services (for example, service disruption, performance problems, etc.). Incident tickets usually involve structured information about the customer and an unstructured description of the problem, among other things.
Incident management processes can have various steps such as, for example, incident classification, incident routing, root-cause analysis, resolution and recovery. In existing approaches, these processes are largely manual, leading to delays and errors in incident resolution.
Principles and embodiments of the invention provide techniques for resolving incident reports by identifying system generated enterprise events related to the incident. An exemplary method (which may be computer-implemented) for correlating a client incident with one or more enterprise events to facilitate resolution of the incident, according to one aspect of the invention, can include steps of identifying one or more configuration items relevant to the one or more enterprise events, identifying one or more configuration items relevant to the client incident, and correlating the one or more enterprise events with the client incident using the one or more configuration items to facilitate resolution of the incident.
One or more embodiments of the invention or elements thereof can be implemented in the form of a computer product including a computer usable medium with computer usable program code for performing the method steps indicated. Furthermore, one or more embodiments of the invention or elements thereof can be implemented in the form of an apparatus or system including a memory and at least one processor that is coupled to the memory and operative to perform exemplary method steps. Yet further, in another aspect, one or more embodiments of the invention or elements thereof can be implemented in the form of means for carrying out one or more of the method steps described herein; the means can include hardware module(s), software module(s), or a combination of hardware and software modules.
These and other objects, features and advantages of the present invention will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.
Principles of the present invention include associating client incidents with system generated events. One or more embodiments of the invention include associating such incidents with events occurring in enterprise systems to help in more efficient incident resolution. Additionally, the techniques described herein also include identifying system events responsible for client incidents even if configuration items (CIs) affected by an event are not explicitly mentioned.
Clients can use natural language text to describe incidents. Incidents that are reported can be caused, for example, by changes or events generated in the enterprise. As described herein, events are captured and stored in an event monitoring and/or management system. One or more embodiments of the present invention correlate the client incidents with the enterprise events to facilitate resolution of the incidents. Such a correlation can be performed, for example, using a configuration management database (CMDB), which stores various configuration items (CIs) along with their inter-relationships. Client incidents and events are correlated with the relevant CIs in the CMDB, and these CIs, along with their relationships, are used to compute the correlation between events and client incidents.
The incident management process 202 depicted in
The aim of incident management is to quickly resolve incidents and restore the normal functioning of IT services. Incident investigation and diagnosis (for example, as depicted in step 210 of
As detailed herein, a system event can be defined as any significant change in the state of an enterprise system resource, network resource, or network application. For example, one can generate an event for a problem, for the resolution of a problem, or for the successful completion of a task.
Examples of events can include normal starting and stopping of a process, abnormal process termination, server malfunction, etc. Events are generated and managed by various event management systems.
Generally, events have attributes such as date, host, class-name (indicating event-type), and some class specific attributes. By way of example, event attributes may include attributes similar to the following.
Log-entry: Nov 7 08:51:42 oak su: ‘su root’ failed for don on /dev/ttyp0
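By way of illustration only, the log entry above can be split into the event attributes just described (date, host, class-name, and a class-specific message). The following is a minimal sketch, not the implementation of any particular event management system: the `Event` type, the `parse_log_entry` function, the regular expression, and the choice of treating the emitting process name as the class-name are all assumptions made for this example.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern for a syslog-style line of the form
# "<Mon> <day> <HH:MM:SS> <host> <process>: <message>".
LOG_PATTERN = re.compile(
    r"^(?P<date>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+"
    r"(?P<source>[^:\s]+):\s+"
    r"(?P<message>.*)$"
)


@dataclass
class Event:
    date: str        # timestamp as it appears in the log entry
    host: str        # machine that generated the event
    class_name: str  # event type (assumed here to be the process name)
    message: str     # class-specific free text


def parse_log_entry(line: str) -> Event:
    """Split one log entry into event attributes, or raise ValueError."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized log entry: {line!r}")
    return Event(match["date"], match["host"], match["source"], match["message"])


event = parse_log_entry(
    "Nov 7 08:51:42 oak su: 'su root' failed for don on /dev/ttyp0"
)
print(event)
```

In this sketch the host attribute (here, oak) is the value that would later be looked up in the CMDB to find the CI associated with the event.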
As detailed herein, for a service request, one should determine whether the service request is a duplicate of another existing incident. If it is not a duplicate, one should create an incident for the input service request. Also, one should determine the likely cause of (client) incidents (for example, the particular system event(s)). If one can pinpoint the event that is causing the incident, resolving the event can resolve the incident as well. If an event is already assigned for diagnosis, then a customer may be told so. A notification can also be sent to customers asking them not to report incidents of a certain kind, to avoid flooding the service desk.
Further, one or more embodiments of the invention identify the event(s) responsible for an incident via the use of a configuration management database (CMDB). A CMDB can include a repository of information about machines, hardware, software, people, etc. in an enterprise and the relationships between them. All of these objects are referred to as configuration items (CIs) having an object class (for example, Computer Machine), attribute name-value pairs (for example, Memory-size=2 gigabytes (GB)), sub-objects (for example, Operating System), etc. Various configuration items can be related using explicit and/or implicit relationships. Examples of relationships may include “a database installedOn a computer,” “an enterprise application runsOn a webserver,” etc. Relationships between CIs can be forward as well as backward (for example, installedOn, runsOn, etc.). Relationships between CIs can be used to find dependent hardware/software objects for a given hardware/software component (CI).
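The use of CI relationships to find dependent objects can be sketched as a small in-memory graph. This is an illustrative assumption only and does not reflect the interface of any actual CMDB product: the `ConfigurationItem` and `Cmdb` classes, the method names, and the sample CIs (oak, db1, web1, app1) are hypothetical. Relationships are stored in the backward direction so that, given a CI, one can walk to every CI that depends on it directly or indirectly.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class ConfigurationItem:
    name: str
    object_class: str
    attributes: dict = field(default_factory=dict)


class Cmdb:
    """Hypothetical in-memory stand-in for a CMDB."""

    def __init__(self):
        self.items = {}
        # Backward index: target CI name -> [(relationship, source CI name)].
        self._backward = defaultdict(list)

    def add_item(self, item):
        self.items[item.name] = item

    def relate(self, source, relationship, target):
        """Record that `source` <relationship> `target`, e.g. db1 installedOn oak."""
        self._backward[target].append((relationship, source))

    def dependents(self, name):
        """Return the names of all CIs depending on `name`, directly or indirectly."""
        seen, queue = set(), deque([name])
        while queue:
            current = queue.popleft()
            for _, source in self._backward[current]:
                if source not in seen and source != name:
                    seen.add(source)
                    queue.append(source)
        return seen


cmdb = Cmdb()
cmdb.add_item(ConfigurationItem("oak", "Computer Machine", {"Memory-size": "2 GB"}))
cmdb.add_item(ConfigurationItem("db1", "Database"))
cmdb.add_item(ConfigurationItem("web1", "WebServer"))
cmdb.add_item(ConfigurationItem("app1", "EnterpriseApplication"))
cmdb.relate("db1", "installedOn", "oak")
cmdb.relate("web1", "installedOn", "oak")
cmdb.relate("app1", "runsOn", "web1")

print(sorted(cmdb.dependents("oak")))
```

A breadth-first walk is used here so that indirect dependents (app1, which runs on a webserver installed on oak) are found as well as direct ones.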
A CMDB can be accessed, for example, using Java application programming interfaces (APIs) to perform various tasks such as adding and deleting CIs, searching for CIs belonging to a particular type (for example, WebSphere Servers), browsing through relationships, etc. A CMDB can also provide a structured query language (SQL)-like language for searching (for example, to get the names of all computer systems having more than two central processing units (CPUs)). Additionally, discovery mechanisms can be used to populate a CMDB automatically.
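The example query mentioned above (names of all computer systems having more than two CPUs) might look as follows. This sketch uses an in-memory SQLite database purely as a stand-in for a CMDB's SQL-like query facility; the table name `computer_system`, its columns, and the sample rows are assumptions made for illustration and are not taken from any actual CMDB schema.

```python
import sqlite3

# In-memory database standing in for the CMDB store (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE computer_system (name TEXT PRIMARY KEY, cpu_count INTEGER)"
)
conn.executemany(
    "INSERT INTO computer_system VALUES (?, ?)",
    [("oak", 4), ("elm", 2), ("pine", 8)],
)

# Names of all computer systems having more than two CPUs.
rows = conn.execute(
    "SELECT name FROM computer_system WHERE cpu_count > ? ORDER BY name",
    (2,),
).fetchall()
names = [name for (name,) in rows]
print(names)  # ['oak', 'pine']
```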
As part of incident-CI association, one or more embodiments of the invention use the incident description to extract keywords. These keywords are searched over a configuration management database (CMDB) to obtain associated CIs. A Lucene-based search engine and other search techniques can be employed to make this search more efficient and more accurate. The CIs associated with the incident, along with their relevancy scores, can be denoted as ICI1, ICI2, . . . , ICIn and S1, S2, . . . , Sn, respectively. All (unresolved) events can be obtained from the event server, and one or more CIs are associated with every event using an event-search. A CI associated with an event is usually explicitly mentioned in the event, or can be found using the same technique as used for client-reported incidents. For obtaining an event CI, the event attributes and/or description can be searched over a CMDB and a resultant CI can be obtained.
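The incident-search step can be sketched as keyword extraction followed by a scored lookup. The sketch below deliberately substitutes a simple term-overlap score for a Lucene-based search engine, so it illustrates only the shape of the inputs and outputs (a list of ICIs with relevancy scores S), not the ranking a real search engine would produce. The stop-word list, the function names, and the sample CI index are all hypothetical.

```python
import re

# Small illustrative stop-word list (an assumption, not exhaustive).
STOPWORDS = {"the", "a", "an", "is", "on", "to", "of", "and",
             "my", "i", "cannot", "not", "in", "for"}


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_keywords(description):
    """Keep the tokens of an incident description that are not stop words."""
    return [token for token in tokenize(description) if token not in STOPWORDS]


def search_cis(keywords, ci_index):
    """Return [(ci_name, score)] sorted by descending relevancy score.

    ci_index maps a CI name to searchable text about that CI. The score is
    the fraction of query keywords found in that text (a stand-in metric).
    """
    query = set(keywords)
    if not query:
        return []
    results = []
    for name, text in ci_index.items():
        overlap = query & set(tokenize(text))
        if overlap:
            results.append((name, len(overlap) / len(query)))
    return sorted(results, key=lambda pair: (-pair[1], pair[0]))


ci_index = {
    "db1": "db1 Database payroll database installedOn oak",
    "web1": "web1 WebServer payroll portal",
    "mail1": "mail1 MailServer smtp relay",
}
keywords = extract_keywords("I cannot log in to the payroll portal")
hits = search_cis(keywords, ci_index)
print(hits)
```

Each returned pair corresponds to one (ICI, S) entry; CIs with no matching keyword are omitted from the result.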
A CI comparator can take input such as, for example, results of an incident-search (for example, a list of CIs with their relevancy scores (denoted as ICI1, ICI2, . . . , ICIn and S1, S2, . . . , Sn, respectively)) as well as results of an event-search (for example, a CI (such as, for example, ECI)). Also, a CI comparator can include output such as, for example, a list of events with their relevance score.
A list of events correlated with the client incident can be obtained by comparing an incident search result (ICIs) with the event search result (ECI or explicitly mentioned CI) using various intuitions including:
Additionally, in one or more embodiments of the invention, for each event Ek, one can get the relationship-graph (template) affected by the event. This graph may include all CIs which have a backward relationship with ECIk, directly or indirectly. To limit the CIs affected by the event, one or more embodiments of the invention associate a template with each event class, which defines the CIs affected by any event of that class. A template is in the form of a tree rooted at the ECI class, with nodes being classes of other CIs dependent on the ECI and edges being CMDB relationships. One can assign non-zero scores to all the events whose relationship-graph includes any ICI. As such, the score can be inversely proportional to the time difference between event arrival and incident arrival if the time difference is positive, and proportional to the maximum score of any ICIk (max Sk) appearing in the event's relationship graph.
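The scoring rule just described can be sketched as follows. This is one possible reading under stated assumptions, not a definitive formula: the time difference is taken as incident arrival minus event arrival (in seconds), the proportionality constant is taken as 1, and an event whose time difference is not positive is given a score of zero. The function names and the sample events are hypothetical.

```python
def score_event(event_time, incident_time, affected_cis, incident_scores):
    """Score one event against one incident.

    affected_cis:    set of CI names in the event's relationship graph.
    incident_scores: mapping of ICI name -> relevancy score S.
    """
    matched = [s for ci, s in incident_scores.items() if ci in affected_cis]
    if not matched:
        return 0.0  # the relationship graph contains no ICI
    delta = incident_time - event_time
    if delta <= 0:
        return 0.0  # assumption: the event did not precede the incident
    # Proportional to max S, inversely proportional to the time difference.
    return max(matched) / delta


def rank_events(events, incident_time, incident_scores):
    """Return [(event_name, score)] for non-zero scores, highest first."""
    scored = [
        (name, score_event(t, incident_time, cis, incident_scores))
        for name, t, cis in events
    ]
    return sorted((p for p in scored if p[1] > 0), key=lambda p: -p[1])


incident_scores = {"web1": 0.8, "db1": 0.4}
events = [
    ("E1", 900, {"oak", "db1", "web1"}),  # recent, graph contains both ICIs
    ("E2", 400, {"web1"}),                # older, graph contains one ICI
    ("E3", 950, {"mail1"}),               # graph contains no ICI
    ("E4", 1100, {"web1"}),               # arrived after the incident
]
ranked = rank_events(events, incident_time=1000, incident_scores=incident_scores)
print(ranked)
```

In this sketch the comparator's output is the list of events with their relevance scores, with events E3 and E4 dropped because their scores are zero.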
Correlating the enterprise event(s) with the client incident can include, for example, using a database to identify configuration items responsible for the enterprise event(s) and the incident. Additionally, identifying the enterprise events and configuration items (CIs) can include, for example, using a database (such as, for example, a CMDB) to identify the configuration items (CIs) responsible for enterprise events and user incidents. A CMDB can, by way of example, store CIs and their inter-relationships.
The techniques depicted in
A variety of techniques, utilizing dedicated hardware, general purpose processors, software, or a combination of the foregoing may be employed to implement the present invention. At least one embodiment of the invention can be implemented in the form of a computer product including a computer usable medium with computer usable program code for performing the method steps indicated. Furthermore, at least one embodiment of the invention can be implemented in the form of an apparatus including a memory and at least one processor that is coupled to the memory and operative to perform exemplary method steps.
At present, it is believed that the preferred implementation will make substantial use of software running on a general-purpose computer or workstation. With reference to
Accordingly, computer software including instructions or code for performing the methodologies of the invention, as described herein, may be stored in one or more of the associated memory devices (for example, ROM, fixed or removable memory) and, when ready to be utilized, loaded in part or in whole (for example, into RAM) and executed by a CPU. Such software could include, but is not limited to, firmware, resident software, microcode, and the like.
Furthermore, the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium (for example, media 818) providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer usable or computer readable medium can be any apparatus for use by or in connection with the instruction execution system, apparatus, or device.
The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid-state memory (for example, memory 804), magnetic tape, a removable computer diskette (for example, media 818), a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk-read only memory (CD-ROM), compact disk-read and/or write (CD-R/W) and DVD.
A data processing system suitable for storing and/or executing program code will include at least one processor 802 coupled directly or indirectly to memory elements 804 through a system bus 810. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
Input and/or output or I/O devices (including but not limited to keyboards 808, displays 806, pointing devices, and the like) can be coupled to the system either directly (such as via bus 810) or through intervening I/O controllers (omitted for clarity).
Network adapters such as network interface 814 may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems and Ethernet cards are just a few of the currently available types of network adapters.
In any case, it should be understood that the components illustrated herein may be implemented in various forms of hardware, software, or combinations thereof, for example, application specific integrated circuit(s) (ASICs), functional circuitry, one or more appropriately programmed general purpose digital computers with associated memory, and the like. Given the teachings of the invention provided herein, one of ordinary skill in the related art will be able to contemplate other implementations of the components of the invention.
At least one embodiment of the invention may provide one or more beneficial effects, such as, for example, identifying system events responsible for client incidents even if the configuration items (CIs) affected by an event are not explicitly mentioned.
Although illustrative embodiments of the present invention have been described herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various other changes and modifications may be made by one skilled in the art without departing from the scope or spirit of the invention.