The present invention generally relates to a cause analysis in a computer, and more specifically, to a cause analysis accompanying a change in a configuration for finding a solution to an application failure without using a knowledge database by analyzing the configuration change.
One of the most stressful jobs for an administrator of a desktop computing environment is the cause analysis (troubleshooting) of a case in which trouble has occurred. Cause analysis is also a deciding factor for help desk personnel, who must provide a caller with a solution. An end user tends to install numerous types of software and change OS settings, and this can create problems. In addition, the configuration of a computer can be changed without the end user realizing it due to the numerous upgrade programs that routinely run on the end user's computer. Therefore, there are cases where the end user does not know when a configuration fault occurred, and cannot recall when a problem arose. The administrator or help desk personnel of this type of desktop computing environment must use their specialized knowledge and deep understanding of what is behind the trouble to provide the end user with a solution.
Current solutions, for example, include technology for remotely examining event logs inside a computer, technology for collecting and storing configuration items and their change history, technology for detecting the invocation of an application and storing the invocation history, technology for storing knowledge about past solutions, and technology for deducing a root cause by combining the above-mentioned information.
Patent Literature 1 is an example of deducing a root cause using an event log, a configuration change, and a knowledge database. Paragraph 0134 discloses the collection of an error log, event information, and chronological data on configuration changes from a target monitoring computer. Paragraph 0137 and
Examples of the remote collection of event logs include Patent Literature 2 and Patent Literature 3. Patent Literature 4 is an example of fault analysis using a knowledgebase. Patent Literature 5 is an example of fault analysis using an error log. Patent Literature 6 is an example of the remote acquisition of a configuration change.
Either the administrator or the help desk personnel must possess knowledge of event logs, configuration change histories, and application invocation histories, and the know-how to provide a solution by carefully examining these types of information. Knowledge can be obtained from a knowledge database that provides a “cause” and a “solution” described by another person. However, someone must keep the knowledge database up-to-date, and this incurs maintenance costs.
Another task of the present invention is to provide information for analyzing a software problem related to a configuration change without using the knowledgebase when the end user changes the configuration.
An illustrative embodiment of the present invention provides a technique for determining which configuration change has caused a problem without the need for a knowledge database. Therefore, the present invention does not provide a root cause, but rather provides a “tolerance limit” that must be removed (that is, a final answer). Rather than being “root cause” oriented, the present invention provides a direct solution-oriented method for handling a problem.
The end user asks a question. The help desk starts an analysis. The initial step is to detect a target time period. In order to detect that target time period, a cause analysis program detects the last successful application invocation and the first application invocation failure based on both the event log and the application invocation history. With respect to the detection of a configuration change, the configuration change is determined by combining the configuration change history and the result of the target time period detection. These configuration changes may affect the invocation of an application. One of these configuration changes is the tolerance limit. The next step is to check another computer. To determine the configuration change that has the highest likelihood of being the cause, the cause analysis program checks other computers that have experienced the same configuration change. The cause analysis program checks and counts the results of application invocations before and after each configuration change. Upon discovering the same configuration change in another computer, the program checks whether the respective configuration changes caused the same problem in this computer, and whether the problem was fixed. The program counts similar cases for all the computers. Thereafter, the program computes the ratio of instances accompanying a change from success to failure and the ratio of instances accompanying a change from failure to success with respect to all the instances for the respective configuration changes. The cause analysis program displays these results thereafter. The results of the analysis are shown in the form of a ranking using a diagram. The help desk is able to respond to a question as to which configuration change is most readily affected.
The present invention does not use a knowledge database described by humans at all. Even when a user knows the root cause, the user may not know how to easily fix this root cause, and so the present invention does not seek to discover the root cause. Instead, the present invention identifies a tolerance limit. What the user has to do is remove this tolerance limit. For the user, fixing the problem is more effective than notifying the user as to the root cause.
An aspect of the present invention is oriented toward a cause analysis method for a target computer from among multiple computers, and in this method, the target computer is experiencing an application invocation failure of a computer application at a first failure time, and an application invocation success of the computer application at a first success time that precedes the first failure time without being accompanied by another application invocation success, and, in addition, without being accompanied by another application invocation failure of the computer application during a first time period between the first success time and the first failure time (for example, refer to
Another aspect of the present invention is oriented toward a cause analysis system, and this system comprises multiple computers comprising a target computer and an analysis computer that is coupled to these multiple computers. The analysis computer is programmed to execute the above-mentioned steps (1) and (2). In a number of the embodiments, the analysis computer becomes one of the above-mentioned multiple computers (For example, refer to
Another aspect of the present invention is oriented toward a computer-readable storage medium that stores multiple instructions for controlling a data processor that executes the above-mentioned steps (1) and (2).
In a number of the embodiments, the method also comprises a step of presenting the result of at least one of either a causal configuration change result (A1) or a fixing configuration change result (B1). In the step (A1), with respect to either one or multiple second configuration changes, the method comprises listing the number of instances of application invocation failures identified for all of the above-mentioned other computers, and all of the instances respectively accompanying either one or multiple second configuration changes for all of the above-mentioned other computers (for example, refer to
In a number of the embodiments, the steps (1) and (2) are executed with respect to multiple computer applications. In addition, the method comprises a step of presenting the result of at least one of either a causal configuration change result (A2) or a fixing configuration change result (B2). In the step (A2), with respect to either one or multiple second configuration changes, the method comprises listing the number of instances of application invocation failures identified for all of the above-mentioned other computers, and all of the instances respectively associated with either one or multiple second configuration changes for all of the above-mentioned other computers, and listing the date and time at which each of the multiple computer applications was analyzed (for example, refer to
The method in a specific embodiment further comprises a step of executing at least one of either a matching causal configuration changes analysis (A3) or a matching fixing configuration changes analysis (B3) with respect to a specified computer application. In the step (A3), the method comprises searching for the result of (A2) with respect to a computer application that is consistent with the specified computer application as a matching causal configuration change result, and fetching this matching causal configuration change result for analysis (for example, refer to
The method in a number of the embodiments further comprises a step of executing at least one of either a combined causal configuration changes analysis (C) or a combined fixing configuration changes analysis (D). In the step (C), with respect to all of the other computers besides the target computer of the multiple computers, the method comprises acquiring all of the above-mentioned other computers experiencing an application invocation success of the same computer application at a fourth success time, and an application invocation failure of the same computer application at a fourth failure time that is after the fourth success time without being accompanied by another application invocation success, and, in addition, without being accompanied by another application invocation failure of the same computer application during a fourth time period between the fourth success time and the fourth failure time by identifying an instance of another application invocation failure, identifying either one or multiple combinations of a fourth configuration change that occurred during the fourth time period, and totaling for all of the multiple computers with the exception of the target computer the number of causal configuration changes of a total of each of the combinations of the fourth configuration change (for example, refer to
In a number of the embodiments, the method comprises a step of presenting the result for at least one of the combined causal configuration change result (C1) or the combined fixing configuration change result (D1). In the step (C1), with respect to either one or multiple combinations of a fourth configuration change, the method comprises listing the number of instances of application invocation failures identified for all of the above-mentioned other computers, and all of the instances respectively accompanying either one or multiple combinations of the fourth configuration changes for all of the above-mentioned other computers. In the step (D1), with respect to either one or multiple combinations of a fifth configuration change, the method comprises listing the number of instances of application invocation successes identified for all of the above-mentioned other computers, and all of the instances respectively accompanying either one or multiple combinations of the fifth configuration changes for all of the above-mentioned other computers.
Another aspect of the present invention is oriented toward a method in a computer system for executing a cause analysis for a target computer from among multiple computers, and in this method, the target computer is experiencing an application invocation failure of a computer application at a first failure time, and an application invocation success of the computer application at a first success time that precedes the first failure time without being accompanied by another application invocation success, and, in addition, without being accompanied by another application invocation failure of the computer application during a first time period between the first success time and the first failure time. This method comprises a step of presenting a causal configuration changes table that lists either one or multiple first configuration changes, which occurred during the first time period of the computer application, and a graphical chart, which corresponds to each of the first configuration changes of this one or multiple first configuration changes, and which comprises a failure rate area and a success rate area. The failure rate area represents a failure case that identifies an instance of another application invocation failure in which, with respect to all of the other computers with the exception of the target computer of these multiple computers, all of the above-mentioned other computers experience an application invocation success of the same computer application at a second success time, and an application invocation failure of the same computer application at a second failure time that is after the second success time without being accompanied by another application invocation success, and, in addition, without being accompanied by another application invocation failure of the same computer application during a second time period between the second success time and the second failure time. 
A second configuration change that is equivalent to a corresponding first configuration change listed in the table occurs during the second time period. The success rate area represents a success case that identifies an instance other than another application invocation failure in which, with respect to all of the other computers with the exception of the target computer of the multiple computers, all of the above-mentioned other computers experience an application invocation success of the same computer application at a third success time, and an application invocation failure of the same computer application at a third failure time that is after the third success time without being accompanied by another application invocation success, and, in addition, without being accompanied by another application invocation failure of the same computer application during a third time period between the third success time and the third failure time. A third configuration change that is equivalent to a corresponding first configuration change listed in the table occurs during the third time period.
In a number of the embodiments, the above-mentioned graphical chart comprises a bar graph, the failure rate area shows at least one of either a number of failure cases or a percentage of the failure cases relative to the total number of both failure cases and success cases, and the success rate area shows at least one of either a number of success cases or a percentage of the success cases relative to the total number of both failure cases and success cases. The causal configuration changes table lists the configuration item(s) and change type(s) of either one or multiple first configuration changes, a change date time corresponding to either one or multiple first configuration changes, and a graphical chart showing the failure rate area and the success rate area. In addition, this method comprises a step of presenting the user with a sort key index for sorting the causal configuration changes table in accordance with any one of the configuration item, the change type, the change date time and the graphical chart, and a step of presenting, in response to the sort key index selection inputted by the user, the causal configuration changes table that has been sorted in accordance with the selection inputted by the user.
These and other characteristic features and advantages of the present invention should be clear to a person having ordinary skill in the art by studying the following detailed descriptions of the specific embodiments.
In the detailed explanation of the present invention that follows, reference will be made to the attached drawings which constitute a portion of the disclosure, and which show examples of embodiments as exemplary means for carrying out the present invention without limiting same. In the drawings, similar numbers describe substantially similar components in a number of diagrams. It should also be noted that these detailed explanations provide various examples of embodiments, which are described hereinbelow, and, in addition, are illustrated in the drawings, but the present invention is not limited to the embodiments that are described and illustrated herein, and as a person having ordinary skill in the art knows or will come to know, the present invention can be expanded to include other embodiments. When reference is made to “one embodiment”, “this embodiment”, or “these embodiments” in this specification, this signifies that the specific feature, structure or characteristic described in relation to an embodiment is included in at least one embodiment of the present invention, and the appearance of these phrases at various locations in this specification does not necessarily refer to the same embodiment. In addition, in the following detailed explanation, numerous specific details are given to provide a thorough understanding of the present invention. However, as should be clear to a person having ordinary skill in the art, not all of these specific details are required to carry out the present invention. Under other conditions, known structures, materials, circuits, processes, and interfaces are not explained in detail and/or are illustrated in block diagram format so as to avoid making the present invention unnecessarily obscure.
In addition, a number of parts of the following detailed explanation are provided in the form of an algorithm and reference signs operated on inside a computer. These algorithm definitions and reference signs are means used by a person having ordinary skill in the art in the field of data processing to more effectively communicate the essence of their innovation to other persons having ordinary skill in the art. The algorithm is a series of defined steps that lead to a desired end state or result. In the present invention, an executed step requires physical operation on a perceivable amount in order to achieve a perceivable result. Usually, but not always, these amounts take the form of either an electrical or magnetic signal or instruction with respect to which storing, transferring, combining, comparing or another such operation is possible. Referring to these signals as bits, values, elements, signs, letters, items, numbers, instructions or the like often proves advantageous. However, it must be kept in mind that all of these and other similar terms are related to appropriate physical amounts, and are merely convenient labels applied to these amounts. Unless otherwise noted, as will be clear from the following considerations, it is recognized that the use of “process”, “computing”, “compute”, “determine”, “display” or other such terminology throughout this explanation may include the operation and processing of a computer system or other such information processing device, which operates on data that is expressed as a physical (electronic) amount inside the registers and memory of a computer system and converts this data to other data that is similarly expressed as a physical amount inside either the memory or registers of the computer system, or another information storage, transmission, or display device.
The present invention also relates to an apparatus for executing the operations described herein. This apparatus is either specially designed for a desired purpose, or may include one or multiple general-purpose computers, which are either selectively booted or reconfigured by either one or multiple computer programs. This type of computer program may be stored inside a computer-readable storage medium, such as but not limited to an optical disk, a magnetic disk, a read-only memory, a random access memory, a solid state device and drive, or any other type of medium that is suitable for storing electronic information. An algorithm and display provided here are not inherently related to any specific computer or other apparatus. Various general-purpose systems may be used together with programs and modules in accordance with the teachings included herein, or it may prove advantageous to configure a more specialized apparatus to execute the steps of a desired method. In addition, the present invention will not be described by referring to any specific programming language. It should be recognized that the teachings of the present invention can be implemented as described herein using a variety of programming languages. The instruction(s) of the programming language(s) may be executed by one or multiple processing devices, for example, a central processing unit (CPU), a processor, or a controller.
An exemplary embodiment of the present invention, as will be explained in more detail hereinbelow, provides an apparatus, a method, and a computer program for finding a solution to an application failure by analyzing a configuration change without using a knowledge database.
The target computer 102 is a general-purpose computer comprising a CPU 151, a memory 152, a disk 153, a video interface 154, and a network interface 155. The respective elements are coupled by way of a system bus 156. The target computer 102 comprises an agent 161 for sending the log information 171 to the analysis computer 101 via the LAN 103. The target computer 102 comprises the log information 171 inside the disk 153. A display 157 is coupled to the video interface 154.
The cause analysis program 121 reads the log information 123 and executes a causal configuration changes analysis as described hereinbelow. The target period detector 131 reads the event log table 141, the application invocation history table 142, and the configuration change history table 143, and detects the time period between the point in time at which a specific application was able to be invoked without trouble, and the point in time at which this application was not able to be invoked without trouble (=failed). Thereafter, the target period detector 131 determines the configuration change in the target computer during this time period by referencing the configuration change history table 143. The causal configuration changes analyzer 132 checks the log information 123 of the other computer(s) and stores the result(s) in the causal configuration changes table 146. The causal configuration changes temporary table 144 is used as temporary data when the causal configuration changes analyzer 132 is analyzing the causal configuration change.
The fixing configuration changes analyzer 133 detects a fixing configuration change and stores the result in the fixing configuration changes table 147. The fixing configuration changes temporary table 145 is used as temporary data when the fixing configuration changes analyzer 133 is analyzing the fixing configuration change. The fixing configuration change is a configuration change for fixing a state in which there is an application invocation failure or other such trouble. The invocation result checker 134 is a subroutine that detects whether or not a specific application was able to be invoked without trouble by referencing both the event log table 141 and the application invocation history table 142.
Schematic 301 shows the state of the target computer 102 of this causal configuration changes analysis. According to the schematic 301, four configuration changes occurred between a successful invocation and a failed invocation of the application. There is no other invocation between these configuration changes. Therefore, the application invocation failure could have been caused by one of these four configuration changes. Schematics 302, 303, and 304 show the states of the other computers, and these states will be used for detailed analysis.
According to schematic 302, the same configuration changes occurred in the other computer A, but neither the removal of “VPN-CLIENT v1.8” nor the addition of “VPN-CLIENT v2.0” affected the invocation of this application. Therefore, the certainty with respect to these two configuration changes having affected the invocation of this application becomes lower. By contrast, the addition of the “PRINTER DRIVER A” and the “PATCH-2322” produced results between a success and a failure. Therefore, the certainty with respect to these two configuration changes having affected the invocation of this application becomes higher.
According to schematic 303, the addition of the “PRINTER DRIVER A” produced a result between a success and a failure. Therefore, the certainty with respect to this configuration change having affected the invocation of this application becomes higher. In addition, the removal of the “PRINTER DRIVER A” subsequent to the failure produced a result between a failure and a success. Therefore, the removal of the “PRINTER DRIVER A” is seen as having fixed the problem. The certainty with respect to this configuration change having fixed the application invocation becomes higher. In addition to this, since the addition of the “PATCH-2322” produced a result between a success and a success, the certainty with respect to this configuration change having affected the application invocation becomes lower.
According to schematic 304, the addition of a “PATCH-1234” comes between a failure and a success, and therefore the addition of the “PATCH-1234” is seen as having fixed the problem. This type of observation can lead to two types of results. The one is the certainty with respect to which configuration factor affected the invocation of the application. The other is the certainty with respect to which configuration change was able to fix the application invocation trouble (the failure). There are three other computers in this example, and the accuracy of the analysis could be increased further by adding the other computers.
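The observations drawn from schematics 302 through 304 can be tallied as in the following minimal sketch. It is an illustration only, not the claimed implementation. The computer labels "B" and "C", the name `transition_ratio`, and the tuple layout are hypothetical, and the "success to success" outcome recorded for the two VPN-CLIENT changes is an assumption based on the statement that they did not affect the invocation.

```python
from collections import Counter

# Observations read off schematics 302-304: (computer, change, before, after).
# Labels "B" and "C" are hypothetical names for the computers of 303 and 304.
# The success -> success outcome for the VPN-CLIENT changes is an assumption.
OBSERVATIONS = [
    ("A", "Removed VPN-CLIENT v1.8", "success", "success"),
    ("A", "Added VPN-CLIENT v2.0", "success", "success"),
    ("A", "Added PRINTER DRIVER A", "success", "failure"),
    ("A", "Added PATCH-2322", "success", "failure"),
    ("B", "Added PRINTER DRIVER A", "success", "failure"),
    ("B", "Removed PRINTER DRIVER A", "failure", "success"),
    ("B", "Added PATCH-2322", "success", "success"),
    ("C", "Added PATCH-1234", "failure", "success"),
]


def transition_ratio(observations, before, after):
    """Per change, the share of instances showing the given before/after pair."""
    totals = Counter()
    hits = Counter()
    for _computer, change, seen_before, seen_after in observations:
        totals[change] += 1
        if (seen_before, seen_after) == (before, after):
            hits[change] += 1
    return {change: hits[change] / totals[change] for change in totals}


# Certainty that a change caused the failure (success -> failure).
causal = transition_ratio(OBSERVATIONS, "success", "failure")
# Certainty that a change fixed the failure (failure -> success).
fixing = transition_ratio(OBSERVATIONS, "failure", "success")
```

Under these assumptions, the addition of the “PRINTER DRIVER A” scores highest as a causal change because both instances lie between a success and a failure, whereas the addition of the “PATCH-2322” scores lower because only one of its two instances does.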
Potential fixing configuration changes are displayed in the form of a ranking in the bottom pane. A column 531 shows the configuration item. A column 532 shows the change type. A column 533 shows bar graphs denoting the certainty corresponding to the configuration changes.
The event log table 141 comprises three columns, i.e., a computer ID (601), a date time (602) and an event type (603). The computer ID 601, the date time 602, and the event type 603 are collected from the agents 161 of the respective target computers 102 by the log collector program 122, and stored in this table. A table summary of the event log table in each target computer 102 is the same as that of the event log table 141 of the analysis computer 101 in this embodiment.
The configuration change history table 143 comprises four columns, i.e., a computer ID 701, a change date time 702, a configuration item 703 and a change type 704. The data of these four columns is collected from the agents 161 of the respective target computers 102 by the log collector program 122 and stored in this table. A table summary of the configuration change history table of each target computer 102 is the same as that of the configuration change history table 143 of the analysis computer 101 in this embodiment. The configuration change history table in each target computer 102 comprises its own configuration change history data. The configuration change history table 143 in the analysis computer 101 comprises all of the configuration change history data collected from the respective target computers 102.
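The two table layouts described above can be pictured with the following sketch. The class names, field names, the `collect` helper, and the "Error" event type value are hypothetical and chosen only for illustration. The sketch assumes that collection amounts to concatenating the records held by each target computer.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class EventLogRecord:
    """One row of the event log table 141 (columns 601 through 603)."""
    computer_id: str
    date_time: datetime
    event_type: str


@dataclass(frozen=True)
class ConfigChangeRecord:
    """One row of the configuration change history table 143 (columns 701 through 704)."""
    computer_id: str
    change_date_time: datetime
    configuration_item: str
    change_type: str


def collect(per_computer_tables):
    """Merge the per-computer tables into one central table (assumed behavior)."""
    merged = []
    for records in per_computer_tables:
        merged.extend(records)
    return merged


# Example: each target computer holds only its own history; the analysis
# computer holds the union of all of them.
comp_001 = [ConfigChangeRecord("Comp-001", datetime(2008, 6, 4, 8, 20, 11),
                               "PRINTER DRIVER A", "Added")]
comp_002 = [ConfigChangeRecord("Comp-002", datetime(2008, 6, 3, 10, 0, 0),
                               "PATCH-2322", "Added")]
central_table = collect([comp_001, comp_002])
```

Because every record carries its computer ID, the central table can hold the data of all target computers with the same columns as the per-computer tables.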
Examples of configuration items include software, an application (add/remove), a patch (add/remove), a driver (add/remove), an OS configuration, processor scheduling (program/background service), memory usage (program/system cache), optional registry items, hardware, memory capacity, hard drive capacity, BIOS configuration, hyper-thread (ON/OFF), and virtualization technology (ON/OFF).
A record shows the time period from the last successful application invocation to the first failed application invocation for each analysis target computer 102. It is supposed that the value of the computer ID 901 of the analysis target computer 102 in
In Step 1403, the cause analysis program 121 treats the values of the computer ID 411 and the application name 412 as parameters, and calls the target period detector 131. The result is stored in the causal configuration changes temporary table 144. The records (911 through 914) are stored in the causal configuration changes temporary table 144 at the time of this step. Therefore, the configuration change that caused the application invocation to fail is deemed to be one of these configuration changes (911 through 914).
In Step 1404, the cause analysis program 121 treats the values of the computer ID 411 and the application name 412 as parameters, and calls the causal configuration changes analyzer 132. The result is stored in the causal configuration changes table 146. The records (1011 through 1014) are stored in the causal configuration changes table 146 at the time of this step.
In Step 1405, the cause analysis program 121 treats the value of the application name 412 as a parameter, and calls the fixing configuration changes analyzer 133. The result is stored in the fixing configuration changes table 147. The records (1211 through 1213) are stored in the fixing configuration changes table 147 at the time of this step. In Step 1406, the cause analysis program 121 displays the results on the display 117.
In Step 1502, the target period detector 131 selects the records of the same computer ID as the computer ID that was received in Step 1501 from the configuration change history table 143. The detector 131 also sorts the records in descending order in accordance with the change date time 702. The records of the “Comp-001” of the computer ID 701 are selected at the time of this step (711 through 716 in the configuration change history table 143).
In Step 1503, the target period detector 131 checks whether or not the records were selected in Step 1502. In the case of a YES, processing proceeds to Step 1504. In the case of a NO, processing ends.
In Step 1504, the target period detector 131 fetches one record from the top, and reads the values of the change date time 702, the configuration item 703, and the change type 704. At the initial execution of this loop, the value of the change date time 702 is “Jun. 4, 2008 08:20:11”, the configuration item 703 is “PRINTER DRIVER A”, and the change type 704 is “Added”.
In Step 1505, the target period detector 131 treats the computer ID (Step 1501), the application name (Step 1501) and the change date time (Step 1504) as parameters, and calls the invocation result checker 134. In the initial execution of this loop, these parameters are “Comp-001”, “DOC EDITOR”, and “Jun. 4, 2008 08:20:11”.
In Step 1506, the target period detector 131 receives the values of the Invocation-Before and Invocation-After variables as the results of Step 1505. The results show whether or not the application was able to be invoked before and after the configuration change without any errors. In the initial execution of this loop, the Invocation-Before result value is “success” and the Invocation-After result value is “failure”.
In Step 1507, the target period detector 131 checks whether or not the post-configuration change invocation result is success. In the case of a YES, the processing ends. In the case of a NO, the processing proceeds to Step 1508.
In Step 1508, the target period detector 131 creates a record for the computer ID 901, the change date time 902, the configuration item 903, the change type 904, the invocation-before 905 and the invocation-after 906. The detector 131 also inserts this record in the causal configuration changes temporary table 144. The record 911 is stored in the causal configuration changes temporary table 144 at the time of the initial loop of this step.
In Step 1509, the target period detector 131 checks whether or not all the records selected in Step 1502 have undergone processing. In the case of a YES, the processing ends. In the case of a NO, the processing returns to Step 1504. In this embodiment, the records (911 through 914) are stored in the causal configuration changes temporary table 144 subsequent to execution of the target period detector 131.
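The loop of Steps 1502 through 1509 can be sketched as follows. This is a minimal illustration and not the implementation of the embodiment: the `Change` tuple, the function name `detect_target_period`, and the `check_invocation` callback (standing in for the invocation result checker 134) are hypothetical names, and the tables are modeled as plain Python lists.

```python
from datetime import datetime
from typing import NamedTuple


class Change(NamedTuple):
    """One row of the configuration change history (hypothetical layout)."""
    computer_id: str
    changed_at: datetime
    item: str
    change_type: str


def detect_target_period(history, computer_id, app_name, check_invocation):
    """Walk one computer's changes from newest to oldest (Steps 1502-1509)."""
    # Step 1502: select this computer's records, newest first.
    selected = sorted(
        (c for c in history if c.computer_id == computer_id),
        key=lambda c: c.changed_at,
        reverse=True,
    )
    candidates = []
    for change in selected:  # Steps 1504-1509
        # Steps 1505-1506: ask the checker about invocations around the change.
        before, after = check_invocation(computer_id, app_name, change.changed_at)
        if after == "success":  # Step 1507: the application still worked here
            break
        # Step 1508: keep the change as a causal candidate.
        candidates.append((*change, before, after))
    return candidates
```

If no record matches the computer ID, the function returns an empty list, which corresponds to the NO branch of Step 1503.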
In Step 1602, the invocation result checker 134 acquires the application invocation time immediately prior to the change date time (Step 1601) by referencing the application invocation history table 142 with respect to the computer ID (Step 1601) and the application name (Step 1601). When the received change date time is “Jun. 4, 2008 08:20:11”, the “DOC EDITOR” application invocation time immediately prior to “Jun. 4, 2008 08:20:11” can be found as “Jun. 2, 2008 14:26:03” (818) in the application invocation history table 142.
In Step 1603, the invocation result checker 134 counts the number of events within a specified period of time immediately after the application invocation time (Step 1602) by referencing the event log table 141 with respect to the computer ID (Step 1601). When the invocation time is “Jun. 2, 2008 14:26:03”, the number of events within a 10-second period is 0.
In Step 1604, the invocation result checker 134 checks the number of events counted in Step 1603. When this number is larger than 0, the processing jumps to Step 1606. Otherwise, the processing proceeds to Step 1605. In Step 1605, the invocation result checker 134 sets the Invocation-Before variable to success. The Invocation-Before variable is set to success because the number of events is 0. In Step 1606, the invocation result checker 134 sets the Invocation-Before variable to failure.
In Step 1607, the invocation result checker 134 acquires the application invocation time immediately after the change date time (Step 1601) by referencing the application invocation history table 142 with respect to the computer ID (Step 1601) and the application name (Step 1601). When the received change date time is “Jun. 4, 2008 08:20:11”, the “DOC EDITOR” application invocation time immediately after “Jun. 4, 2008 08:20:11” can be found in the application invocation history table 142 as “Jun. 4, 2008 08:29:23” (record 417).
In Step 1608, the invocation result checker 134 counts the number of events within a specified time period immediately after the application invocation time (Step 1607) by referencing the event log table 141 with respect to the computer ID (Step 1601). When the invocation time is “Jun. 4, 2008 08:29:23”, the number of events within a 10-second period is 1.
In Step 1609, the invocation result checker 134 checks the number of events counted in Step 1608. When this number is larger than 0, the processing jumps to Step 1611. Otherwise, the processing proceeds to Step 1610. In Step 1610, the invocation result checker 134 sets the Invocation-After variable to success. In Step 1611, the invocation result checker 134 sets the Invocation-After variable to failure. The Invocation-After variable is set to failure because the number of events is 1.
In Step 1612, the invocation result checker 134 returns the values of the Invocation-Before and Invocation-After variables. In this explanation, the Invocation-Before return value is success, and the Invocation-After value is failure.
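The check performed in Steps 1602 through 1612 can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the invocation times and event times are assumed to be already filtered to one computer ID and one application, the 10-second window is taken from the example above, and returning `None` when no invocation exists on one side of the change is an assumption that the description does not address.

```python
from datetime import datetime, timedelta

# The "specified period"; 10 seconds is the value used in the example.
WINDOW = timedelta(seconds=10)


def _result(invoked_at, event_times):
    """Classify one invocation by the events logged shortly after it."""
    if invoked_at is None:
        return None  # assumption: no invocation on that side of the change
    # Whether the window boundaries are inclusive is also an assumption.
    count = sum(1 for t in event_times if invoked_at <= t <= invoked_at + WINDOW)
    return "failure" if count > 0 else "success"


def check_invocation(invocations, event_times, changed_at):
    """Return (Invocation-Before, Invocation-After) for one change."""
    earlier = [t for t in invocations if t < changed_at]  # Step 1602
    later = [t for t in invocations if t > changed_at]    # Step 1607
    before = _result(max(earlier, default=None), event_times)  # Steps 1603-1606
    after = _result(min(later, default=None), event_times)     # Steps 1608-1611
    return before, after  # Step 1612
```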
In Step 1704, the causal configuration changes analyzer 132 checks whether or not all of the items in the list received in Step 1703 have been processed. In the case of a YES, the processing ends. In the case of a NO, the processing proceeds to Step 1705. In Step 1705, the causal configuration changes analyzer 132 fetches one item from the list (Step 1703) and reads the configuration item, the change type and the change date time. In Step 1706, the causal configuration changes analyzer 132 references the causal configuration changes temporary table 144 and counts the number of records that satisfy a condition, such as (configuration item (Step 1705)=configuration item in the table 144), (change type (Step 1705)=change type in the table 144), (Invocation-Before=success in the table 144), and (Invocation-After=failure in the table 144). In Step 1707, the causal configuration changes analyzer 132 references the causal configuration changes temporary table 144 and counts the number of records that satisfy a condition, such as (configuration item (Step 1705)=configuration item in the table 144) and (change type (Step 1705)=change type in the table 144). In Step 1708, the causal configuration changes analyzer 132 inserts the result records in the causal configuration changes table 146. These records include the configuration item (Step 1705), the change type (Step 1705), the change date time (Step 1705), the number of failure records (result of Step 1706), and the number of all related records (result of Step 1707). The records (1011 through 1014) are stored in the causal configuration changes table 146 at the time of this step.
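The counting in Steps 1706 and 1707 can be sketched as follows. This is a minimal illustration with hypothetical names: the temporary table 144 is modeled as an iterable of `(configuration item, change type, Invocation-Before, Invocation-After)` tuples, and the result is returned as a dictionary rather than inserted into a table.

```python
from collections import Counter


def tally_causal_changes(temp_records):
    """Count failure records and all related records per (item, change type)."""
    failures = Counter()
    totals = Counter()
    for item, change_type, before, after in temp_records:
        key = (item, change_type)
        totals[key] += 1  # Step 1707: every record for this pair
        if before == "success" and after == "failure":
            failures[key] += 1  # Step 1706: worked before, failed after
    # Step 1708: one result per pair: (failure records, all related records)
    return {key: (failures[key], totals[key]) for key in totals}
```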
In Step 1804, this subroutine checks whether or not all the records selected in Step 1803 have been processed. In the case of a YES, the processing ends. In the case of a NO, the processing proceeds to Step 1805. In Step 1805, this subroutine fetches one record from the records selected in Step 1803, and reads the computer ID 901, the change date time 902, the configuration item 903, and the change type 904. In Step 1806, this subroutine calls the invocation result checker 134 together with the values of the computer ID (Step 1805), the application name (Step 1801) and the change date time (Step 1805). In Step 1807, this subroutine receives the values of the Invocation-Before and Invocation-After variables as the result of Step 1806. The result shows whether or not the application was able to be invoked without any errors before and after the configuration change. In Step 1808, this subroutine creates a record comprising the computer ID 901, the change date time 902, the configuration item 903, the change type 904, the invocation-before 905 and the invocation-after 906. The subroutine also inserts this record in the causal configuration changes temporary table 144.
In Step 1906, the fixing configuration changes analyzer 133 checks whether or not all the pairs selected in Step 1904 have been processed. In the case of a YES, the processing ends. In the case of a NO, the processing proceeds to Step 1907. In Step 1907, the fixing configuration changes analyzer 133 fetches one pair and reads the configuration item 1103 and the change type 1104. In Step 1908, the fixing configuration changes analyzer 133 references the fixing configuration changes temporary table 145 and counts the number of records that satisfy a condition, such as (configuration item (Step 1907)=configuration item in the table 145), (change type (Step 1907)=change type in the table 145), and (Invocation-After=success in the table 145). In Step 1909, the fixing configuration changes analyzer 133 references the fixing configuration changes temporary table 145 and counts the number of records that satisfy a condition, such as (configuration item (Step 1907)=configuration item in the table 145) and (change type (Step 1907)=change type in the table 145). In Step 1910, the fixing configuration changes analyzer 133 inserts the result records in the fixing configuration changes table 147. These records include the configuration item (Step 1907), the change type (Step 1907), the number of success records (result of Step 1908), and the number of all related records (result of Step 1909).
In Step 2003, this subroutine checks whether or not all the records selected in Step 2002 have been processed. In the case of a YES, the processing jumps to Step 2012. In the case of a NO, the processing proceeds to Step 2004. In Step 2004, this subroutine fetches one record from among the records selected in Step 2002, and reads the computer ID 901 and the change date time 902. In Step 2005, this subroutine selects from the configuration change history table 143 a record that satisfies a condition, such as (computer ID 701=computer ID (Step 2004)) and (change date time 702>change date time (Step 2004)).
In Step 2006, this subroutine checks whether or not all the records selected in Step 2005 have been processed. In the case of a YES, the processing moves to Step 2003. In the case of a NO, the processing proceeds to Step 2007. In Step 2007, this subroutine fetches one record from the records selected in Step 2005, and reads the computer ID and the change date time. In Step 2008, this subroutine calls the invocation result checker 134 together with the values of the computer ID (Step 2007), the application name (Step 2001) and the change date time (Step 2007). In Step 2009, this subroutine receives the values of the Invocation-Before and the Invocation-After variables as the result of Step 2008. In Step 2010, this subroutine creates a record comprising the computer ID 1101, the change date time 1102, the configuration item 1103, the change type 1104, the invocation-before 1105 and the invocation-after 1106. This subroutine also inserts this record in the fixing configuration changes temporary table 145. In Step 2011, this subroutine checks whether or not the Invocation-After value (Step 2009) is success. In the case of a YES, the processing returns to Step 2003. In the case of a NO, the processing returns to Step 2006. In Step 2012, this subroutine removes the duplicate of the record in the fixing configuration changes temporary table 145.
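The nested loop of Steps 2003 through 2012 can be sketched as follows. This is an illustrative sketch only: the function name and tuple layouts are hypothetical, `check_invocation` stands in for the invocation result checker 134, and processing the later changes in ascending order of change date time is an assumption, since the order is not stated above.

```python
def collect_fixing_candidates(causal_records, history, app_name, check_invocation):
    """causal_records: (computer_id, change_time) pairs from table 144.
    history: (computer_id, change_time, item, change_type) rows from table 143.
    """
    found = []
    for computer_id, changed_at in causal_records:  # Steps 2003-2004
        # Step 2005: later changes on the same computer (ascending: assumption).
        later = sorted(
            (h for h in history if h[0] == computer_id and h[1] > changed_at),
            key=lambda h: h[1],
        )
        for comp, when, item, change_type in later:  # Steps 2006-2007
            before, after = check_invocation(comp, app_name, when)  # Steps 2008-2009
            found.append((comp, when, item, change_type, before, after))  # Step 2010
            if after == "success":  # Step 2011: move on to the next causal record
                break
    # Step 2012: drop duplicate records while keeping first-seen order.
    return list(dict.fromkeys(found))
```

Duplicates arise because two causal records on the same computer can share the same later changes, which is why the final de-duplication step is needed.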
In Step 2402, the causal configuration changes analyzer 132 checks whether or not all of the items in the list created in Step 2401 have been processed. In the case of a YES, the processing ends. In the case of a NO, the processing proceeds to Step 2403. In Step 2403, the causal configuration changes analyzer 132 fetches one item from the list (Step 2401), and reads the configuration item, the change type, and the change date time. In Step 2404, the causal configuration changes analyzer 132 references the causal configuration changes temporary table 144 and counts the number of computers that satisfy a condition, such as (configuration item and change type pair (Step 2403)=configuration item and change type pair of computer in table 144), and (most recent Invocation-After value of this computer=failure in table 144). In Step 2405, the causal configuration changes analyzer 132 references the causal configuration changes temporary table 144 and counts the number of computers that satisfy a condition, such as (configuration item and change type pair (Step 2403)=configuration item and change type pair of computer in table 144). In Step 2406, the causal configuration changes analyzer 132 inserts the result record into the causal configuration changes table 146-23 (
Naturally, the system configuration illustrated in
In the explanation, many details are presented for the purpose of providing a complete understanding of the present invention. However, as should be clear to a person having ordinary skill in the art, not all of these specific details are necessary for carrying out the present invention. It should also be noted that the present invention may be described as a process that is generally illustrated as a flowchart, a flow diagram, a structural diagram or a block diagram. The flowchart may describe an operation as a consecutive process, but most operations are able to be executed either in parallel or simultaneously. In addition to this, the order of an operation may be rearranged.
As is widely known in this field, the above-described operations may be executed by hardware, software, or some combination of hardware and software. Various aspects of the embodiments of the present invention may be implemented using circuits and logical devices (hardware), but in a case where another aspect is stored on a machine-readable medium and executed by a processor, this other aspect of the present invention may be implemented by using an instruction (software) that causes the execution of a method for accomplishing the embodiment of the present invention in this processor. In addition, a number of embodiments of the present invention may be executed using only hardware, and another embodiment may be executed using only software. Furthermore, the various functions that have been described may be executed inside a single unit, or may be transferred to a large number of components and distributed using numerous methods. In a case where the present invention is executed using software, it is possible to execute a method by a processor of a general-purpose computer or the like based on an instruction stored on a computer-readable medium. In a preferred case, it is possible to store the instruction on a medium using compression and/or an encrypted format.
From the above, it should be clear that the present invention is a method, an apparatus, and a program stored on a computer-readable medium for finding a solution to an application failure by analyzing a configuration change without using a knowledge database. In addition to this, specific embodiments have been illustrated and described in this specification, but a person having ordinary skill in the art will recognize that an arbitrary combination calculated to achieve the same object can be used in place of the disclosed specific embodiments. This disclosure is intended to protect any and all adaptations or variations of the present invention, and it is assumed that it will be understood that the terminology used in the following claims should not be interpreted as limiting the present invention to the specific embodiments disclosed in the specification. Rather, the scope of the present invention is completely determined by the following claims, which should be interpreted together with the entire scope of equivalents that have the right of the claims in accordance with established claim interpretation theory.
In a fifth embodiment, an agent in a target computer monitors for application installation, and at the timing at which an installation is detected, notifies an analysis computer of an installation start event. A tabulation program is added to the analysis computer in this embodiment. In a case where an application name corresponds to the notified application in the causal configuration changes table, the tabulation program sends the relevant analysis result to the target computer.
The agent 161 has an application monitoring means 2602 for monitoring for and issuing a notification about an application installation, and analysis information management means 2603 for receiving, processing, and storing an analysis result, and performing outputting via a user interface 2606. The received information is saved as a trouble configuration changes table 2604 and a solution configuration changes table 2605.
The flow of processing of the agent program will be explained (not shown in the drawing). Upon detecting the start of an application installation by the invocation of an installer program, the agent program sends to the analysis computer a configuration change event comprising information such as the relevant application name and a change type such as addition.
The agent 161, upon receiving analysis result information from the analysis computer, creates and updates the trouble configuration changes table 2604 and/or the solution configuration changes table 2605 based on the analysis result information. Then, the agent 161 creates and outputs a result-based screen.
The agent may halt the application installation process at this point. In accordance with this, a user interface for notifying the user of the cancellation is provided. In a case where an installation is cancelled, the agent waits for the reception of an analysis result, and in a case where nothing is received within a predetermined period of time, resumes the installation process. When a received analysis result is outputted to the screen, together with the message “Continue installation?”, the agent provides a user interface that enables the user to choose to either continue or to cancel the installation.
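The wait-then-resume behavior described above can be sketched as follows. This is a hypothetical illustration: the function name, the use of a `queue.Queue` to deliver the analysis result, and the `ask_user` callback (standing in for the “Continue installation?” user interface) are all assumptions made for the sketch.

```python
import queue


def decide_installation(results, timeout_s, ask_user):
    """Return "resume", "continue", or "cancel" for a halted installation.

    results:  queue on which the analysis result is delivered (assumption)
    ask_user: callable shown the result; returns True to continue (assumption)
    """
    try:
        # Wait for the analysis result for at most the predetermined period.
        result = results.get(timeout=timeout_s)
    except queue.Empty:
        # Nothing was received in time, so the installation is resumed.
        return "resume"
    # A result arrived: show it and let the user choose.
    return "continue" if ask_user(result) else "cancel"
```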
Next, the trouble configuration changes table 2604 will be explained. The trouble configuration changes table is for showing a configuration change, which has been analyzed by the analysis computer 101 as being a potential problem. The trouble configuration changes table 2604 has the time at which the analysis computer carried out analysis processing for each installation-target (may also include a target that is to be installed in the future) application, and one or more records. Furthermore, the relevant record(s) has/have the following attribute values.
Next, the solution configuration changes table 2605 will be explained. The solution configuration changes table shows a configuration change, which has been analyzed by the analysis computer 101 as being a possible solution to the trouble. The solution configuration changes table has the time at which the analysis computer carried out analysis processing for each installation-target (may also include a target that is to be installed in the future) application, and one or more records. Furthermore, the relevant record(s) has/have the following attribute values.
A case in which the analysis result information includes a configuration change that constitutes trouble will be explained below. In this case, the execution of Steps 2706 through 2709 of
(Step 2701) The tabulation program receives a configuration change event.
(Step 2702) The tabulation program searches the causal configuration changes table for the received application name.
(Step 2703) The tabulation program determines whether or not there is a record with an application name that matches the received application name in accordance with the search of Step 2702.
(Step 2704) The tabulation program computes as the certainty the percentage of the number of failure cases with respect to the total number of cases for each configuration item record.
(Step 2705) The tabulation program next selects a record to be sent based on the certainty.
Then, the tabulation program includes the information of the selected record in the analysis result information, sends this result information to the target computer, which is the source of the configuration change event notification (Step 2710), and ends the processing.
Furthermore, as the selection method of Step 2705, there is a method that uses a threshold, and a method that is limited to a specified number.
(Method 1) In the case of the method that uses a threshold, the tabulation program receives a certainty threshold from either the administrator or the target computer user, and manages this threshold by recording the threshold in the memory 112. The tabulation program compares the certainty with the threshold, and only selects a configuration item record with a certainty that is equal to or greater than the threshold.
(Method 2) In a case that is limited to a specified number, the specified number (for example, 3) is defined beforehand. The tabulation program selects the top three records based on the computed certainty. In a case where the definitions of the threshold and the specified number are NULL, all the records are selected.
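The certainty computation of Step 2704 and the two selection methods of Step 2705 can be sketched as follows. This is a minimal sketch with hypothetical names; records are modeled as `(configuration item, change type, failure count, total count)` tuples, `None` plays the role of a NULL definition, and applying both the threshold and the specified number when both are defined is an assumption, since the description presents them as alternative methods.

```python
def certainty(failures, total):
    """Percentage of failure cases among all cases (Step 2704)."""
    return 100.0 * failures / total if total else 0.0


def select_records(records, threshold=None, top_n=None):
    """Select the records to send (Step 2705).

    With both threshold and top_n left as None, all records are selected.
    """
    scored = [(certainty(f, t), item, ct) for item, ct, f, t in records]
    scored.sort(key=lambda r: r[0], reverse=True)  # highest certainty first
    if threshold is not None:  # Method 1: keep certainty >= threshold
        scored = [r for r in scored if r[0] >= threshold]
    if top_n is not None:  # Method 2: keep only the specified number
        scored = scored[:top_n]
    return scored
```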
Furthermore, the same as in
(Step 2706) The tabulation program searches the fixing configuration changes table and selects the relevant application information.
(Step 2707) The tabulation program determines whether or not there is a record with an application name that matches the application name searched for in Step 2706.
(Step 2708) The tabulation program computes the percentage of the number of success cases with respect to the total number of cases as the certainty for each configuration item record.
(Step 2709) The tabulation program selects a record to be sent next based on the certainty, and includes this record in the analysis result information.
The agent 161 receives the analysis result information and uses the record included in the analysis result information to create and update the solution configuration changes table. In this embodiment, since the trouble has yet to occur in this application on the target computer, the solution information may be output at the same time as the trouble information, or may be output separately when there is a request from the user interface.
As a variation of the fifth embodiment, a method based on the configuration of the configuration change event source computer will be explained as the method for selecting the record to be sent instead of the method based on the certainty. The processing up to Step 2703 is the same, and the record that matches the application name in the causal configuration changes table is selected. Next, the tabulation program references the configuration change history table shown in
The selection of these records to be sent is carried out to provide the minimal amount of information required. It is possible to reduce the traffic to the target computer by deleting information related to a low certainty and a configuration item that is not related to the configuration of the relevant computer.
In addition, as another example of the fifth embodiment, a method for providing information not only when an application is installed, but also at the time of a configuration item change, such as the application of a patch or the removal of a piece of software will be explained. The agent not only monitors for an application installation, but also for configuration item changes such as the application of a patch or the updating or removal of a driver, and upon detection thereof, notifies the analysis computer of a configuration change event. Here, configuration items are included as the application name, and change types such as addition and removal are included in the configuration change event.
In the first embodiment, the addition of a patch or the removal of a software program is recorded in the causal configuration changes table as a configuration item that constitutes a causal candidate for another application invocation failure. Accordingly, the tabulation program receives a configuration change event, and in a case where a search of the causal configuration changes table for the received application name in the flow of processing shown in
The same concept may be applied to the fixing configuration changes table as well.
As another variation, a method for using the information of the causal configuration changes table in which configuration items are combined will be explained. The result of an analysis carried out based on a combination of configuration changes, which was shown in the third embodiment, is stored in the causal configuration changes table shown in
The point in the processing of the tabulation program in this example that differs from the flow of processing shown in
In the fifth embodiment, when a cause analysis request has previously been made with respect to the relevant application and an analysis has been carried out, the result thereof is provided to the end user of the target computer. To cover a case where no analysis has been carried out at this point, or to provide the latest analysis result, the method in this embodiment executes an analysis at the point in time at which a configuration change event is received, and provides the end user with this result.
Rather than an analysis subsequent to an application invocation failure in the analysis target computer, an analysis is performed as to whether there is a case in which trouble occurred when invoking an application in another computer in a case where the relevant application was installed.
After the cause analysis program 121(a) has ended, the tabulation program computes the percentage of the number of failure cases with respect to the total number of cases as the certainty for each configuration item (Step 2704) for the causal configuration changes table obtained as the result of the analysis, selects the record to be sent based on the certainty (Step 2705), and sends this record to the agent (Step 2710) the same as in the fifth embodiment.
Furthermore, the cause analysis program 121(a) may create a fixing configuration changes table the same as in the first embodiment. In accordance with this, the certainty is also computed, the record to be sent is determined based on the certainty, and the record is sent to the agent in the same way for the fixing configuration changes table obtained as an analysis result.
(Step 3201) The cause analysis program receives a computer ID and an application ID from the tabulation program.
(Step 3202) The cause analysis program initializes the temporary and result tables.
(Step 3203) Next, the cause analysis program reads the configuration change history table and selects the record of the same computer ID as the received computer ID. In a case where the data in the configuration change history table has been stored for a long period of time, the selection target may be limited to a record up to a specified period of time prior to this.
(Step 3204) The cause analysis program stores the selection result in the causal configuration changes temporary table. Since an invocation check is not carried out with respect to this computer ID, the Invocation-Before and Invocation-After columns of the stored record remain NULL.
(Step 3205) Next, the cause analysis program analyzes the cases of the other target computers. Since the analysis processing is the same as that of the first embodiment, the subroutine of
(Step 3206) The cause analysis program receives a configuration change list as the return value of the subroutine. The items in this list include the configuration item, the change type, and the change date time.
(Step 3207) The cause analysis program checks whether or not all the items in the list have been processed. In a case where not all the items have been processed, the program proceeds to Step 3208.
(Step 3208) The cause analysis program references the result of the subroutine (Step 3205) and the created causal configuration changes temporary table, and counts the number of records that satisfy the following conditions: the configuration item and change type of the configuration change list are the same as the configuration item and change type in the table, “Invocation-Before=success”, and “Invocation-After=failure”.
(Step 3209) The cause analysis program references the causal configuration changes temporary table, and, with the exception of the record with respect to the computer ID received in Step 3201, counts the number of records in which the configuration item and change type of the configuration change list are the same as the configuration item and change type of the table.
(Step 3210) The cause analysis program registers the result record in the causal configuration changes table. The causal configuration changes table has the table configuration shown in
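The counting of Steps 3208 and 3209 differs from the first embodiment in that the records of the requesting computer, whose Invocation-Before and Invocation-After values remain NULL, are left out. A minimal sketch follows, assuming hypothetical names, a list of `(computer ID, configuration item, change type, Invocation-Before, Invocation-After)` tuples for the temporary table, and `None` for NULL.

```python
def tally_excluding_requester(temp_records, requester_id):
    """Count (failure records, all related records) from other computers only."""
    # Every pair in the table starts at (0, 0), so a change seen only on the
    # requesting computer still yields a result with zero cases.
    counts = {(item, ct): (0, 0) for _, item, ct, _, _ in temp_records}
    for computer_id, item, change_type, before, after in temp_records:
        if computer_id == requester_id:
            continue  # Step 3209: skip the requester's own (NULL) records
        failed, total = counts[(item, change_type)]
        if before == "success" and after == "failure":  # Step 3208
            failed += 1
        counts[(item, change_type)] = (failed, total + 1)
    return counts
```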
As a variation of the sixth embodiment, in a case where a causal configuration changes table for the relevant application already exists, the information of this table is used without carrying out a new analysis, provided that, judging from the analysis date/time, this information is less than a specified period of time old. In accordance with this, after Step 2702 in the processing of the tabulation program shown in
As a variation of either the fifth or sixth embodiment, a method for outputting data held by the agent will be explained. The trouble configuration changes table and the solution configuration changes table, which are received from the analysis computer and saved, comprise either a causal configuration item or a fixing configuration item and the certainty therefor for each application. The agent provides a user interface for receiving an application name input, and when the user inputs this application name, searches the table for the information related to this application, and in a case where the information exists, outputs this information as in the examples shown in
Furthermore, as another variation of the sixth embodiment, there is a method for regularly invoking the tabulation program and updating information in the analysis computer. Unlike the processing of
There is a method for carrying out an analysis of an application invocation failure case affected by a user operation in cases other than a configuration change, such as the installation of an application or the application of a patch. In accordance with this, the log collection program of the analysis computer collects a user operation log, and manages this log using an operation history table. The agent uses configuration change monitoring means to monitor for a user operation, such as an OS setting change, and in a case where a specific operation has occurred, notifies the analysis computer. The tabulation program and the cause analysis program in the analysis computer search the operation history table, analyze a case in which an application invocation failure occurred subsequent to performing a similar operation, and tabulate a certainty. The result is sent to the agent, and the agent outputs this result to a screen.
In accordance with the above, it is possible to provide, at the time when the end user carries out a configuration change, an analysis result with respect to application trouble caused by a configuration change in the computer using an analysis computer without using a knowledge database, and to realize support for urging the end user to deal with the trouble.
In the above fifth and sixth embodiments, it was explained that in an application implementation method in the computer system comprising multiple target computers and an analysis computer, one or more first target computers, which are included in the multiple target computers and in which a predetermined application has been installed and invoked, send a log comprising information of multiple configuration changes that have been made prior to invoking the predetermined application to the analysis computer, and the analysis computer receives the log and computes, for each type of configuration change and based on the log, an invocation failure rate which is a percentage at which the invocation of the predetermined application fails subsequent to the configuration change.
Furthermore, it was explained that a second target computer, which is included in the multiple target computers and is a target computer other than the one or more first target computers: (1) receives, from the analysis computer, first information comprising an invocation failure rate for each type of configuration change related to the predetermined application, and (2) based on the invocation failure rate, displays the type of configuration change that is the cause of the failure of the predetermined application invocation.
Furthermore, it was explained that the invocation failure rate included in the first information is an invocation failure rate for each type in a part of all the types of configuration change that were detected by the analysis computer based on the log, and that this part of the types may be selected by the analysis computer based on the invocation failure rate.
Further, it was explained that the analysis computer computes, for each type of configuration change and based on the log, an invocation success rate which is a percentage at which the invocation of the predetermined application succeeds subsequent to the configuration change, and the second target computer (3) receives, from the analysis computer, second information comprising an invocation success rate for each type of configuration change related to the predetermined application, and (4) based on the invocation success rate, may display the type of configuration change that is the cause of the successful invocation of the predetermined application.
Furthermore, it was explained that the second target computer may have multiple applications installed besides the predetermined application, and the second target computer may select, from among the predetermined application and the multiple applications, an application for which the invocation could fail with respect to a predetermined type included in all the configuration change types, and display an identifier of the selected application.
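The selection of applications whose invocation could fail for a given type of configuration change can be sketched as follows. This assumes that the second target computer holds a failure rate table per installed application; the table layout, the threshold, the function name `applications_at_risk`, and the application identifiers are illustrative assumptions rather than part of the embodiments.

```python
def applications_at_risk(rates_by_app, change_type, threshold=50.0):
    """Return identifiers of applications whose invocation failure rate
    for `change_type` meets the threshold, in alphabetical order."""
    return sorted(
        app
        for app, rates in rates_by_app.items()
        # An application with no data for this change type is treated as not at risk.
        if rates.get(change_type, 0.0) >= threshold
    )


# Hypothetical per-application failure rates (in percent).
rates_by_app = {
    "mailer": {"update:driver": 10.0, "install:libfoo": 80.0},
    "editor": {"install:libfoo": 60.0},
    "browser": {"install:libfoo": 5.0},
}
at_risk = applications_at_risk(rates_by_app, "install:libfoo")
```

The resulting identifiers could be displayed before the configuration change of that type is applied, so that the user can see which installed applications might be affected.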
Furthermore, it was explained that a configuration item and/or a change type is an example of the type, but another example might be a configuration change operation type.
Furthermore, in the above explanation, the information of the present invention has been explained by using expressions such as “aaa table”, “aaa list”, “aaa DB”, and “aaa queue”, but this information may also be expressed using a data structure other than a table, list, DB, or queue. For this reason, “aaa table”, “aaa list”, “aaa DB”, and “aaa queue” may also be called “aaa information” to indicate that the information is not dependent on the data structure. In addition, the expressions “identification information”, “identifier”, “name” and “ID” have been used when explaining the content of the respective information, but these expressions are interchangeable.
Furthermore, the analysis computer may comprise multiple computers. The memory 152 and disk 153 of the target computer 102 may be lumped together as a storage resource (that is, a storage device) without making a distinction between the two. Similarly, the memory 112 and the disk 113 of the analysis computer 101 may also be lumped together as a storage resource without making a distinction between the two.
Furthermore, in the above explanation, there were cases in which the explanation was given with “program” as the subject, but since a process, which is determined by a program being executed by a processor, is carried out while using a memory and a communication port (a communication control device), the explanation may also be given by using the processor as the subject. Further, a process that has been disclosed as having a program as the subject may be a process that is carried out by a management server or other such computer, or an information processing apparatus. In addition, either all or a portion of a program may be realized using dedicated hardware.
Furthermore, the respective types of programs may be installed in the respective computers using a program delivery server or a computer-readable storage medium.
| Number | Date | Country | Kind |
|---|---|---|---|
| 2010-140104 | Jun 2010 | JP | national |

| Filing Document | Filing Date | Country | Kind | 371c Date |
|---|---|---|---|---|
| PCT/JP2010/061563 | 7/7/2010 | WO | 00 | 10/15/2010 |