The present application generally relates to the field of monitoring and managing ongoing processes. More specifically, the present application relates to systems and methods for generating alert and diagnostic messages for the attention of human operators.
Systems that manage computer or network systems, or other systems with embedded computer technology, commonly monitor various system parameters for the purpose of detecting problems and alerting a human to the problem. Various techniques can be employed to monitor ongoing processes. The monitored values can be analyzed in various ways, including comparison with thresholds, correlation of several values, and correlation of values over time to discover problems, unprecedented situations, or other events.
Some systems use various techniques to predict events before they occur. One such system is described in commonly owned U.S. Patent No. 6,327,550, which is incorporated herein in its entirety by reference. In such systems, one response to the discovery or prediction is to bring the event to the attention of a human operator. For example, these management systems can issue a text message alert, and different techniques may be employed for presenting this text message to the operator, such as a Windows dialog box, monitoring consoles, event logs, email messages, or pager messages. The alert can also be provided as an audio message through loudspeakers, headsets, or a telephone. An example of a system that provides audio alert messaging is described in commonly owned, concurrently filed, co-pending U.S. Utility Application No. 10/091,067, entitled “Method and Apparatus for Generating and Recognizing Speech as a User Interface Element in Systems and Network Management”, the entirety of which is incorporated herein by reference. Commonly owned, concurrently filed, co-pending U.S. Utility Application No. 10/091,065, entitled “Method and Apparatus for Generating Context-Descriptive Messages”, is also incorporated by reference in its entirety.
In large management systems with many managed components and/or networks and a high level of activity, the management systems may generate a large number of alert messages. Not all alert messages are equally important, but all are typically issued because the alert functionality of such management systems is not open to modification. Other messages may be redundant because several management systems may independently detect the consequences of an event. As a result, current management systems include various techniques for filtering such alert messages based on various rules unrelated to the content of the message.
For example, some conventional management systems designate the severity of a detected or predicted event as the filtering rule. This permits the management system to present only critical messages, or messages about events above a certain level of severity. Other systems correlate alert messages over time or over several objects as a filtering rule. This permits the recognition that a message may indicate a critical problem, even though it may not indicate such criticality by itself, e.g., a minor error may be more critical if it occurs several times in a short time period.
Even after messages have been filtered so only meaningful messages remain, individual users may be interested in different categories of messages. Some management systems include various techniques for filtering alert messages presented to particular individuals, such as messages related to one or more groups of managed components or networks that denote some sort of business process. An example of such a management system is described in commonly owned U.S. Pat. No. 5,958,012, which is incorporated herein in its entirety by reference.
The present disclosure provides management systems and methods with improved alert messaging. The present disclosure also provides alert systems and methods capable of filtering alert messages generated by management systems to report operator desired messages. According to one embodiment, a method for reporting an alert condition is disclosed which includes defining alert filter criteria, identifying an alert condition and analyzing one or more properties of the alert condition and the alert filter criteria to determine whether or not to report the alert condition. The method further includes reporting the alert condition if the determination is to report the alert condition.
According to another embodiment, a system for reporting an alert condition is disclosed. The system includes a filter criteria maintenance module capable of maintaining filter alert criteria, an alert condition detector capable of identifying one or more alert conditions, an alert condition filter capable of filtering identified alert conditions based on the alert filter criteria, and an alert notification module for reporting the filtered alert conditions.
According to another embodiment, a system for reporting an alert condition is disclosed. The system includes means for maintaining filter alert criteria, means for identifying one or more alert conditions, means for filtering the one or more identified alert conditions based on the alert filter criteria and means for reporting the filtered alert conditions.
According to another alternative embodiment, a computer-readable storage medium is disclosed. The medium is encoded with processing instructions for reporting an alert condition, including instructions for defining alert filter criteria and instructions for identifying an alert condition. The medium also includes computer readable instructions for analyzing one or more properties of the alert condition based on the alert filter criteria and for determining whether to report the alert condition. The medium further includes instructions for selectively reporting the alert condition.
For a more complete understanding of the present methods and systems, reference is now made to the following description taken in conjunction with the accompanying drawings in which like reference numbers indicate like features and wherein:
An exemplary IT enterprise is illustrated in
The various components of an exemplary management system 100 topology that can manage an IT enterprise in accordance with the present disclosure are shown in
The visualization workstation 105 provides a user access to various applications including a network management application 115. Workstation 105 interacts with an object repository 110 which stores and delivers requests, commands and event notifications. Workstation 105 requests information from object repository 110, sends commands to the object repository, and gets notification of events, such as status changes or object additions from it. The object repository 110 receives request information from the management application 115, which is fed by the management agents 120 responsible for monitoring and managing certain components or systems in an IT enterprise.
The management application 115 maintains object repository 110, in part, to keep track of the objects under consideration. The object repository 110 may be a persistent store to hold information about managed components or systems, such as a database. In an alternative embodiment, the management application 115 and object repository 110 may be integrated into a single unit that can hold information about managed components in volatile memory and perform the tasks of the management application.
As shown, one architectural aspect of the present system is that in normal operation, the visualization workstation 105 interacts primarily with the object repository 110. This reduces network traffic, improves the performance of graphical rendering at the workstation, and reduces the need for interconnectivity between the visualization workstation 105 and a multitude of management applications 115, their subsystems and agents 120 existing in the IT enterprises. Of course, embodiments having other configurations of the illustrated components are contemplated, including a stand-alone embodiment in which the components comprise an integrated workstation.
In addition to handling requests, commands and notifications, object repository 110 may also handle objects describing the structure and operation of the management system 100. Such objects may describe the momentary state, load, and performance of the components and/or systems. Such objects may be populated using a manual process or an automatic discovery utility.
The alert filtering criteria may be, for example, the severity of an event, the relative importance of the object that exhibits the event, the urgency of the notification and the likelihood that the event condition will occur (the risk). To illustrate the interplay of severity, importance, urgency and risk, consider two potential problems associated with a personal computer and attached peripherals. The first potential problem might be an 80% likelihood that there will be a shortage of paper for the printer. The second potential problem might be a 2% likelihood that there will be a hard disk drive failure. The first problem has a high risk and low severity, while the second potential problem has a relatively low risk but high severity. Determining the importance and urgency of each potential problem might require additional information regarding the use of the personal computer. For example, if a major use of the computer is printing, the urgency and importance of the first problem might be higher. If the computer is primarily used for data storage, acting as a server for other computers running mission critical applications, the importance of the second problem might be higher, while the urgency might be moderate.
Referring now to
According to one embodiment, the alert filter criteria objects in database 210 direct the alert system 200 in how to react to an alert message based on both the severity of an alert condition and the importance of the effect of the alert condition on system or network component(s). The alert filter criteria objects in database 210 may further direct the system to take the urgency of an alert condition into account, and in the case of a prediction, the alert system 200 may take into account the level of risk.
Because the alert system 200 enables tracking the importance of objects, the severity and urgency of alert conditions and the risk for predicted alert conditions, the alert system 200 can use any or all of these four metrics to filter and report alert messages or notifications intelligently.
The alert system according to the present disclosure can use the level of importance of each object to facilitate context-based filtering. Instances may occur where the importance of an alert condition is not readily apparent from the object. For example, a database server may not always be mission-critical; its importance may depend on whether it is being used by an application having an importance level of mission critical. Of course, human operators may know how the database server is being used and can manually enter the appropriate levels of importance for a particular server. Manual entry of importance levels, however, is cumbersome and inefficient, especially in situations where the relationships between different components may be indirect. For example, a database server may be shifted from a moderate level of importance to a mission critical level of importance. If the shift is not detected by the operator, the old lower importance level may inadvertently be retained. Another example of the inefficiency of manual entry of importance levels occurs where the traffic between an application running on a workstation and a database server depends on other network components, e.g., routers. In such situations, the database server, other network components, and the human operator may not be aware of such indirect relationships and consequently may neglect to manually adjust the importance levels.
Thus, according to one embodiment of the present system, if the management system can detect such dependency relationships, the importance rating can be influenced by dependencies.
The objects illustrated in
In the exemplary embodiment of
Alert filter criteria object 325 represents a filter rule associated with alerts affecting an accounts payable business division having a group ID “AP”. Alert filter criteria object 325 directs the system to report alerts affecting an AP function only if the alert has an importance level of “mission critical” or if the alert has an urgency level of 24 hours or less.
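A filter rule of this kind can be sketched in a few lines of Python. The following is only an illustrative sketch, not the disclosed implementation: the class names, field names, and the ordinal importance scale (apart from the label “mission critical”) are assumptions introduced for the example.

```python
from dataclasses import dataclass

# Hypothetical ordinal scale; only "mission critical" appears in the text,
# the other labels are assumptions made for this sketch.
IMPORTANCE_RANK = {"low": 0, "moderate": 1, "high": 2, "mission critical": 3}


@dataclass
class AlertCondition:
    group_id: str         # interest group affected, e.g. "AP"
    importance: str       # a key of IMPORTANCE_RANK
    urgency_hours: float  # time remaining before action must be taken


@dataclass
class AlertFilterCriteria:
    group_id: str
    min_importance: str
    max_urgency_hours: float

    def should_report(self, alert: AlertCondition) -> bool:
        """Report only alerts for this group that are important OR urgent enough."""
        if alert.group_id != self.group_id:
            return False
        important = (IMPORTANCE_RANK[alert.importance]
                     >= IMPORTANCE_RANK[self.min_importance])
        urgent = alert.urgency_hours <= self.max_urgency_hours
        return important or urgent


# A rule modeled on criteria object 325: group "AP", importance of
# "mission critical", or urgency of 24 hours or less.
criteria_325 = AlertFilterCriteria("AP", "mission critical", 24)
```

Under these assumptions, an AP alert with moderate importance and 48 hours remaining is suppressed, while an AP alert with low importance and 12 hours remaining is reported because it satisfies the urgency branch of the rule.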
Referring again to
In one embodiment, alert condition detection module 220 assigns a severity property to each detected alert condition. This may be accomplished using the principles of the predictive management system described in commonly owned U.S. Pat. No. 6,327,550, which is incorporated herein by reference. Risk and urgency properties are also assigned. The urgency property may represent the amount of time remaining before action must be taken or it may represent a rating inversely related to the amount of time remaining. As previously described, module 220 also assigns an importance property to the alert condition object. The importance property represents a measure of the importance of the object, indicated, for example, along some suitable scale, such as 0-5.
The importance property may be determined in any of a number of ways. For example, the importance property may be manually assigned to each class of objects, so that each object of that class inherits the importance property of the class. Alternatively, classes of objects may be arranged in an inheritance hierarchy, so that a subclass (such as “NT server”) may inherit the importance rating of its parent class (such as “NT system”). In accordance with another example of importance level assignment, individual objects may be manually assigned an importance rating that overrides the rating of the class.
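The class-based assignment described above might be modeled as follows. This is a minimal sketch under stated assumptions: the names `ObjectClass`, `ManagedObject`, and `importance_override`, the numeric ratings, and the default of 0 when no class in the chain carries a rating are all hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectClass:
    name: str
    parent: Optional["ObjectClass"] = None
    importance: Optional[int] = None  # None means "inherit from the parent class"

    def effective_importance(self) -> int:
        """Walk up the class hierarchy until a class with a rating is found."""
        cls: Optional[ObjectClass] = self
        while cls is not None:
            if cls.importance is not None:
                return cls.importance
            cls = cls.parent
        return 0  # assumed default when nothing in the chain is rated


@dataclass
class ManagedObject:
    name: str
    object_class: ObjectClass
    importance_override: Optional[int] = None  # manual per-object rating

    def importance(self) -> int:
        """A manual rating on the object overrides the rating of its class."""
        if self.importance_override is not None:
            return self.importance_override
        return self.object_class.effective_importance()


nt_system = ObjectClass("NT system", importance=2)
nt_server = ObjectClass("NT server", parent=nt_system)  # inherits 2

generic = ManagedObject("server-01", nt_server)
payroll = ManagedObject("server-02", nt_server, importance_override=5)
```

Here `generic.importance()` resolves to the rating inherited from “NT system”, while `payroll.importance()` returns the manually assigned value.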
According to another way that the importance property may be assigned, various subsystems, such as for example a job scheduling system, may automatically set the importance properties of individual objects based on some suitable determination, overriding the rating of the class. In yet another example, the importance property may be propagated up a containment hierarchy based on some suitable algorithm. For example, importance may be propagated up based on the highest value among the contained components, so a component that contains several sub-components assumes the highest importance rating of the sub-components it contains.
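One possible reading of the “highest value” propagation up a containment hierarchy is sketched below. The `Component` type and the numeric ratings are assumptions for illustration; the sketch also assumes a container keeps its own rating if that rating is higher than anything it contains, which the text does not specify.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    name: str
    own_importance: int = 0
    children: List["Component"] = field(default_factory=list)

    def importance(self) -> int:
        """Highest rating among this component and everything it contains."""
        return max([self.own_importance]
                   + [child.importance() for child in self.children])


disk = Component("disk", 1)
database = Component("database", 4)
host = Component("host", 0, [disk, database])
rack = Component("rack", 2, [host])
```

With these values, `host` and `rack` both take on the rating of `database`, the most important component they contain.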
The importance property may also be propagated along dependency relationships based on some suitable algorithm. For example, the importance property may be propagated along a “depends on” direction of a relationship with a “largest-value” aggregation function, so if an important application server depends on a database server, then the database server gets the same importance property as the application server unless some other propagation gives it a higher rating.
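The dependency-based propagation with a “largest-value” aggregation function could be approximated with a simple worklist over a dependency graph. The function name, the dictionary-based graph representation, and the numeric ratings below are hypothetical choices made for this sketch.

```python
from collections import deque
from typing import Dict, List


def propagate_importance(base: Dict[str, int],
                         depends_on: Dict[str, List[str]]) -> Dict[str, int]:
    """Push each object's rating to the objects it depends on, keeping the max.

    `depends_on[a] == [b]` means "a depends on b", so b receives at least
    the importance of a. Ratings only ever increase, so the loop terminates
    even if the dependency graph contains cycles.
    """
    result = dict(base)
    for deps in depends_on.values():
        for dep in deps:
            result.setdefault(dep, 0)  # assumed default for unrated objects

    queue = deque(result)
    while queue:
        node = queue.popleft()
        for dep in depends_on.get(node, []):
            if result[node] > result[dep]:
                result[dep] = result[node]
                queue.append(dep)  # the raised rating may need to flow further
    return result


base = {"app_server": 5, "db_server": 2, "router": 1, "test_box": 1}
depends_on = {"app_server": ["db_server"], "db_server": ["router"]}
ratings = propagate_importance(base, depends_on)
```

In this example the database server and the router both end up with the application server's rating of 5, while the unrelated test machine keeps its rating of 1. This also covers the indirect case in which a network component such as a router sits between an application and its database server.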
The detected alert condition objects of database 225 are referenced by alert condition filter module 230 and analyzed in accordance with the applicable alert filter criteria from database 210 to determine whether the detected alert condition qualifies to be reported to an operator. If alert condition filter module 230 determines that a detected alert condition merits reporting, the alert notification module 235 is directed to report the alert notification to an appropriate operator.
Referring to
Alert condition object 333 is an example of a potential or projected alert condition. Object 333 represents an alert condition in which there is a potential paper shortage for WSB printer. The importance level of the alert is “mission critical” because WSB printer 319 is used by the mission critical application 317 running on workstation B 313. Alert condition detection module 220 determined the severity to be high because, while there is still available paper, a lack of paper would prevent the proper completion of Application B. Alert condition detection module 220 also determined that the likelihood that the condition will occur is 80% and that the urgency for this alert is “24 hours”. Due to the relationship of the WSB printer to the other components in the system, group AR is the only affected group.
As discussed above, the importance level of an object may be propagated along dependency relationships. The alert condition objects of
Of course, the use of severity and importance properties for filtering messages requires some care when properties are propagated. Although the importance and severity properties illustrated herein have been qualitative, in alternative embodiments such properties could be quantitative, e.g., numerical.
In such an embodiment, filter criteria maintenance module 205 may support filter criteria, and alert condition filter module 230 could filter alert condition objects, based on the sum or product of the numerical values representing severity and importance levels. As an example, consider a situation in which sub-network S1 contains computers C1 and C2. In this example, computer C1 is performing an important function and has a “very high” importance level, but a “normal” severity level. Computer C2 is a test system and has a “critical” severity level, but a “low” importance level. If severity and importance are independently propagated, then sub-network S1 would have both the “critical” severity and the “very high” importance levels, which might lead the message filtering system to report an alert for sub-network S1. However, if the important computer C1 is functioning properly, and the unimportant test computer C2 exhibits problems, there is no cause for concern regarding sub-network S1 and notification of the alert may not be needed. Therefore, according to an alternative embodiment of the present alert system, a filtering expression, such as, for example, “severity+importance”, can be evaluated for each component and propagated as a single value, so that alerts are reported only when they meet this combined filtering condition.
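The difference between the two propagation strategies in the S1 example can be checked with a short calculation. The numeric values assigned to the qualitative levels and the reporting threshold are assumptions chosen only to make the example concrete.

```python
from typing import Dict, Tuple

# name -> (severity, importance). Assumed numeric mapping:
# "normal" severity = 1, "critical" severity = 5,
# "low" importance = 1, "very high" importance = 5.
computers: Dict[str, Tuple[int, int]] = {
    "C1": (1, 5),  # important computer, functioning normally
    "C2": (5, 1),  # unimportant test system with a critical problem
}


def independent_score(members: Dict[str, Tuple[int, int]]) -> int:
    """Propagate severity and importance separately, then combine them."""
    severity = max(sev for sev, _ in members.values())
    importance = max(imp for _, imp in members.values())
    return severity + importance


def combined_score(members: Dict[str, Tuple[int, int]]) -> int:
    """Evaluate severity+importance per member, then propagate the maximum."""
    return max(sev + imp for sev, imp in members.values())


THRESHOLD = 8  # hypothetical reporting threshold for sub-network S1

report_if_independent = independent_score(computers) >= THRESHOLD
report_if_combined = combined_score(computers) >= THRESHOLD
```

With these assumed values, independent propagation yields 5 + 5 = 10 for S1 and would trigger a report, whereas propagating the combined expression yields 6 and would not, which matches the behavior the example argues for.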
Referring now to
At block 410, an alert condition is detected. The alert condition may be an existing condition that requires operator attention, a warning regarding an existing condition or a predicted/potential condition that may require operator attention. Any technique known to those of skill in the art may be used in the detection of actual or potential alert conditions.
In addition to the detection of an alert condition, block 410 may also include the use of propagation algorithms to determine certain properties of the alert condition as represented by an alert condition object, such as, for example, importance, severity, urgency and/or risk. In addition, associated identifiers such as, for example, an interest group identifier, may also be propagated to the alert condition object. The propagation may occur, for example, along dependency relationships or along containment relationships.
At block 415, the filter criteria associated with the detected alert condition are determined. The association between the filter criteria and the detected alert condition may be based on one or more elements such as, for example, an interest group identifier, a user identifier, or a system component identifier. The association may be based on any factor that would be relevant in reporting the detected alert condition.
At block 420, the filter criteria are applied to the relevant properties of the detected alert condition. Based on the application of the filter criteria to the detected alert condition, a determination is made at block 425 whether or not to report the detected alert condition. If the properties of the detected alert condition fall within the alert filtering criteria, an alert notification is generated and output to an appropriate operator at block 430. Otherwise, the detection and filtering steps are repeated to continually report alert conditions as they arise.
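The sequence of blocks 410 through 430 can be summarized as a single filtering pass. The sketch below is an assumption-laden illustration: alert conditions are represented as plain dictionaries, criteria are looked up by an interest group identifier, and an alert with no associated criteria is simply not reported. None of these choices is dictated by the disclosure.

```python
from typing import Callable, Dict, Iterable, List

Alert = Dict[str, object]
Criteria = Callable[[Alert], bool]


def run_filter_pass(detected: Iterable[Alert],
                    criteria_by_group: Dict[str, Criteria],
                    notify: Callable[[Alert], None]) -> None:
    """Apply the associated filter criteria to each detected alert condition."""
    for alert in detected:                                # block 410: detected condition
        criteria = criteria_by_group.get(alert["group"])  # block 415: find its criteria
        if criteria is None:
            continue  # assumption: no criteria means the alert is not reported
        if criteria(alert):                               # blocks 420 and 425: apply, decide
            notify(alert)                                 # block 430: report to the operator


reported: List[Alert] = []
run_filter_pass(
    detected=[
        {"group": "AP", "severity": 4},
        {"group": "AP", "severity": 1},
        {"group": "HR", "severity": 5},
    ],
    criteria_by_group={"AP": lambda alert: alert["severity"] >= 3},
    notify=reported.append,
)
```

After this pass, only the first AP alert has been reported; the second fails the criteria and the third has no criteria associated with its group. In a running system the pass would be repeated as new alert conditions are detected.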
Accordingly, it is to be understood that the drawings and description in this disclosure are proffered to facilitate comprehension of the methods and systems, and should not be construed to limit the scope thereof. It should be understood that various changes, substitutions and alterations can be made without departing from the spirit and scope of the disclosed methods and systems.
This application is a Continuation-In-Part of U.S. Ser. No. 09/949,101 filed Sep. 7, 2001, which is a Continuation of U.S. Ser. No. 09/408,213 filed Sep. 27, 1999 now U.S. Pat. No. 6,289,380 issued Sep. 11, 2001, which is a Continuation of U.S. Ser. No. 08/892,919 filed Jul. 15, 1997 now U.S. Pat. No. 5,958,012 issued Sep. 28, 1999. This application claims priority to U.S. Provisional Application Ser. No. 60/273,044 filed Mar. 2, 2001. The present application incorporates each related application by reference in its entirety.
Number | Name | Date | Kind |
---|---|---|---|
2485343 | Zuschlag | Oct 1949 | A |
3599033 | Stettiner et al. | Aug 1971 | A |
4464543 | Kline et al. | Aug 1984 | A |
4626892 | Nortrup et al. | Dec 1986 | A |
4665494 | Tanaka et al. | May 1987 | A |
4881197 | Fischer | Nov 1989 | A |
4937037 | Griffiths et al. | Jun 1990 | A |
4965752 | Keith | Oct 1990 | A |
4977390 | Saylor et al. | Dec 1990 | A |
5233687 | Henderson, Jr. et al. | Aug 1993 | A |
5261044 | Dev et al. | Nov 1993 | A |
5271058 | Andrews et al. | Dec 1993 | A |
5271063 | d'Alayer de Costemore d'Arc | Dec 1993 | A |
5295244 | Dev et al. | Mar 1994 | A |
5303388 | Kreitman et al. | Apr 1994 | A |
5353399 | Kuwamoto et al. | Oct 1994 | A |
5367670 | Ward et al. | Nov 1994 | A |
5394522 | Sanchez-Frank et al. | Feb 1995 | A |
5408218 | Svedberg et al. | Apr 1995 | A |
5440688 | Nishida | Aug 1995 | A |
5444849 | Farrand et al. | Aug 1995 | A |
5483631 | Nagai et al. | Jan 1996 | A |
5486457 | Butler et al. | Jan 1996 | A |
5495607 | Pisello et al. | Feb 1996 | A |
5500934 | Austin et al. | Mar 1996 | A |
5504921 | Dev et al. | Apr 1996 | A |
5509123 | Dobbins et al. | Apr 1996 | A |
5535403 | Li et al. | Jul 1996 | A |
5586254 | Kondo et al. | Dec 1996 | A |
5586255 | Tanaka et al. | Dec 1996 | A |
5631825 | van Weele et al. | May 1997 | A |
5634122 | Loucks et al. | May 1997 | A |
5650814 | Florent et al. | Jul 1997 | A |
5655081 | Bonnell et al. | Aug 1997 | A |
5666477 | Maeda | Sep 1997 | A |
5671381 | Strasnick et al. | Sep 1997 | A |
5682487 | Thomson | Oct 1997 | A |
5684967 | McKenna et al. | Nov 1997 | A |
5696486 | Poliquin et al. | Dec 1997 | A |
5696892 | Redmann et al. | Dec 1997 | A |
5699403 | Ronnen | Dec 1997 | A |
5745692 | Lohmann, II et al. | Apr 1998 | A |
5748098 | Grace | May 1998 | A |
5748884 | Royce et al. | May 1998 | A |
5751965 | Mayo et al. | May 1998 | A |
5761502 | Jacobs | Jun 1998 | A |
5768501 | Lewis | Jun 1998 | A |
5774669 | George et al. | Jun 1998 | A |
5787252 | Schettler et al. | Jul 1998 | A |
5793974 | Messinger | Aug 1998 | A |
5796951 | Hamner et al. | Aug 1998 | A |
5801707 | Rolnik et al. | Sep 1998 | A |
5802383 | Li et al. | Sep 1998 | A |
5805819 | Chin et al. | Sep 1998 | A |
5809265 | Blair et al. | Sep 1998 | A |
5812750 | Dev et al. | Sep 1998 | A |
5832503 | Malik et al. | Nov 1998 | A |
5857190 | Brown | Jan 1999 | A |
5867650 | Osterman | Feb 1999 | A |
5872911 | Berg | Feb 1999 | A |
5933601 | Fanshier et al. | Aug 1999 | A |
5941996 | Smith et al. | Aug 1999 | A |
5948060 | Gregg et al. | Sep 1999 | A |
5956028 | Matsui et al. | Sep 1999 | A |
5958012 | Battat et al. | Sep 1999 | A |
5963886 | Candy et al. | Oct 1999 | A |
5987376 | Olson et al. | Nov 1999 | A |
5991771 | Falls et al. | Nov 1999 | A |
6000045 | Lewis | Dec 1999 | A |
6008820 | Chauvin et al. | Dec 1999 | A |
6011838 | Cox | Jan 2000 | A |
6012984 | Roseman | Jan 2000 | A |
6021262 | Cote et al. | Feb 2000 | A |
6029177 | Sadiq et al. | Feb 2000 | A |
6035324 | Chang et al. | Mar 2000 | A |
6049828 | Dev et al. | Apr 2000 | A |
6052722 | Taghadoss | Apr 2000 | A |
6057757 | Arrowsmith et al. | May 2000 | A |
6058494 | Gold et al. | May 2000 | A |
6061714 | Housel, III et al. | May 2000 | A |
6070184 | Blount et al. | May 2000 | A |
6073099 | Sabourin et al. | Jun 2000 | A |
6085256 | Kitano et al. | Jul 2000 | A |
6094195 | Clark et al. | Jul 2000 | A |
6108782 | Fletcher et al. | Aug 2000 | A |
6112015 | Planas et al. | Aug 2000 | A |
6125390 | Touboul | Sep 2000 | A |
6131118 | Stupek et al. | Oct 2000 | A |
6141777 | Cutrell et al. | Oct 2000 | A |
6154212 | Eick et al. | Nov 2000 | A |
6154849 | Xia | Nov 2000 | A |
6161082 | Goldberg et al. | Dec 2000 | A |
6167448 | Hemphill et al. | Dec 2000 | A |
6185613 | Lawson et al. | Feb 2001 | B1 |
6192365 | Draper et al. | Feb 2001 | B1 |
6202085 | Benson et al. | Mar 2001 | B1 |
6209033 | Datta et al. | Mar 2001 | B1 |
6222547 | Schwuttke et al. | Apr 2001 | B1 |
6237006 | Weinberg et al. | May 2001 | B1 |
6260158 | Purcell et al. | Jul 2001 | B1 |
6271845 | Richardson | Aug 2001 | B1 |
6288650 | Chavand | Sep 2001 | B2 |
6298378 | Angal et al. | Oct 2001 | B1 |
6366284 | McDonald | Apr 2002 | B1 |
6373505 | Bellamy et al. | Apr 2002 | B1 |
6374293 | Dev et al. | Apr 2002 | B1 |
6404444 | Johnston et al. | Jun 2002 | B1 |
6421707 | Miller et al. | Jul 2002 | B1 |
6456306 | Chin et al. | Sep 2002 | B1 |
6546425 | Hanson et al. | Apr 2003 | B1 |
6577323 | Jamieson et al. | Jun 2003 | B1 |
6587108 | Guerlain et al. | Jul 2003 | B1 |
6603396 | Lewis et al. | Aug 2003 | B2 |
6614433 | Watts | Sep 2003 | B1 |
6639614 | Kosslyn et al. | Oct 2003 | B1 |
6661434 | MacPhail | Dec 2003 | B1 |
6704874 | Porras et al. | Mar 2004 | B1 |
6707795 | Noorhosseini et al. | Mar 2004 | B1 |
6711154 | O'Neal | Mar 2004 | B1 |
6732170 | Miyake et al. | May 2004 | B2 |
6738809 | Brisebois et al. | May 2004 | B1 |
6744446 | Bass et al. | Jun 2004 | B1 |
20010042118 | Miyake et al. | Nov 2001 | A1 |
20010044840 | Carleton | Nov 2001 | A1 |
20030046390 | Ball et al. | Mar 2003 | A1 |
20030069952 | Tams et al. | Apr 2003 | A1 |
20040210469 | Jones et al. | Oct 2004 | A1 |
20050078692 | Gregson | Apr 2005 | A1 |
Number | Date | Country |
---|---|---|
0 547 993 | Jun 1993 | EP |
0 936 597 | Aug 1999 | EP |
WO9527249 | Oct 1995 | WO |
WO 9704389 | Feb 1997 | WO |
WO9915950 | Apr 1999 | WO |
Number | Date | Country |
---|---|---|
20030023722 A1 | Jan 2003 | US |
Number | Date | Country |
---|---|---|
60273044 | Mar 2001 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 09408213 | Sep 1999 | US |
Child | 09949101 | | US |
Parent | 08892919 | Jul 1997 | US |
Child | 09408213 | | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 09949101 | Sep 2001 | US |
Child | 10091070 | | US |