The present disclosure relates to the field of analysis of event records and, more particularly, to the field of determining a root cause of anomalous events in a system.
Systems such as medical scanners generate a plurality of machine logs. Such logs are records of events that may have occurred in the system and include information associated with such events. The event records may also be indicators of anomalous events or defects that arise in the system. The event records are, however, generated in large numbers. Therefore, identifying a root cause of the anomalous events from the event records may be difficult and time-consuming. Additionally, the event records may include complex information that may not be readable or understandable by a user. For example, the event records may include technical keywords associated with the system that may be difficult for the user to understand. Therefore, identification of event records associated with the anomalous event may not be straightforward, which leads to difficulty in identifying the cause of a bug in the system. Further, the event records cannot practically be annotated manually to distinguish between normal and anomalous events occurring in the system.
There is a need for a method and a system to determine a root cause of anomalous events in a system by effectively managing event records and prioritizing the anomalous event for the event records.
The present embodiments may obviate one or more of the drawbacks or limitations in the related art. For example, a method and a system to determine a root cause of anomalous events in a system are provided.
A method, device, and system for determining a root cause associated with an anomalous event are disclosed. In one aspect, the method includes retrieving one or more event records associated with a device from a database, where the event records include data associated with a functioning of the device. At least one of the one or more event records is associated with the anomalous event. The method also includes determining a risk category associated with the one or more event records based on information present in the one or more event records, where the risk category indicates a risk associated with the functioning of the device. Additionally, the method includes determining a priority associated with each of the one or more event records based on a baseline associated with the event records, where the baseline is defined based on a set of events that occur during a normal functioning of the device. Further, the method includes determining the root cause associated with the anomalous event in the device based on the risk category and the priority associated with the event records.
In another aspect, a device for determining a root cause associated with an anomalous event includes a processing unit and a memory coupled to the processing unit. The memory includes a root cause identification module configured for retrieving one or more event records associated with the device from a database, where the event records include data associated with a functioning of the device. At least one of the one or more event records is associated with the anomalous event. The root cause identification module is further configured for determining a risk category associated with the one or more event records based on information present in the one or more event records. The risk category indicates a risk associated with the functioning of the device. Further, the root cause identification module is configured for determining a priority associated with each of the one or more event records based on a baseline associated with the event records. The baseline is defined based on a set of events that occur during a normal functioning of the device. Additionally, the root cause identification module is configured for determining the root cause associated with the anomalous event in the device based on the risk category and the priority associated with the event records.
In another aspect, a system for determining a root cause associated with an anomalous event includes one or more servers and a device communicatively coupled to the servers. The servers include one or more instructions that, when executed, cause the servers to retrieve one or more event records associated with the device from a database. The event records include data associated with a functioning of the device, where at least one of the one or more event records is associated with the anomalous event. The instructions further cause the servers to determine a risk category associated with the one or more event records based on information present in the one or more event records. The risk category indicates a risk associated with the functioning of the device. Further, the instructions cause the servers to compute a priority associated with each of the one or more event records based on a baseline associated with the event records. The baseline is defined based on a set of events that occur during a normal functioning of the device. Additionally, the instructions cause the servers to determine the root cause associated with the anomalous event in the device based on the risk category and the priority associated with the event records.
In yet another aspect, a non-transitory computer-readable storage medium having machine-readable instructions stored therein is provided. When executed by the server, the machine-readable instructions cause the server to perform the method acts as described above.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the following description. The summary is not intended to identify features or essential features of the claimed subject matter. Further, the claimed subject matter is not limited to implementations that solve any or all disadvantages noted in any part of this disclosure.
Hereinafter, embodiments for carrying out the present invention are described in detail. The various embodiments are described with reference to the drawings, where like reference numerals are used to refer to like elements throughout. In the following description, for purpose of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more embodiments. It may be evident that such embodiments may be practiced without these specific details. In other instances, well known materials or methods have not been described in detail in order to avoid unnecessarily obscuring embodiments of the present disclosure. While the disclosure is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. There is no intent to limit the disclosure to the particular forms disclosed, but on the contrary, the disclosure is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present disclosure.
The client devices 107A-N are user devices used by users (e.g., a technician, etc.). In an embodiment, the user devices 107A-N may be used by the user to receive data associated with the device 108. The data may be accessed by the user via a graphical user interface of an end user web application on the user devices 107A-N. In another embodiment, a request may be sent to the server 101 to access the data associated with the device 108 via the network 105. The device 108 may be connected to the server 101 through the network 105. The device 108 may be a medical imaging unit 108 capable of acquiring a plurality of medical images. The medical imaging unit 108 may be, for example, a scanner unit such as a computed tomography imaging unit, a molecular imaging unit, an X-ray fluoroscopy imaging unit, a magnetic resonance imaging unit, an ultrasound imaging unit, etc. Alternatively, the device 108 may be any equipment or apparatus configured to perform one or more functions as instructed.
The processing unit 201, as used herein, may be any type of computational circuit, such as, but not limited to, a microprocessor, microcontroller, complex instruction set computing microprocessor, reduced instruction set computing microprocessor, very long instruction word microprocessor, explicitly parallel instruction computing microprocessor, graphics processor, digital signal processor, or any other type of processing circuit. The processing unit 201 may also include embedded controllers, such as generic or programmable logic devices or arrays, application specific integrated circuits, single-chip computers, and the like. In general, a processing unit 201 may include hardware elements and software elements. The processing unit 201 may be configured for multithreading (e.g., the processing unit 201 may host different calculation processes at the same time, executing in parallel or switching between active and passive calculation processes).
The memory 202 may include volatile memory and non-volatile memory. The memory 202 may be coupled for communication with the processing unit 201. The processing unit 201 may execute instructions and/or code stored in the memory 202. A variety of computer-readable storage media may be stored in and accessed from the memory 202. The memory 202 may include any suitable elements for storing data and machine-readable instructions, such as read only memory, random access memory, erasable programmable read only memory, electrically erasable programmable read only memory, a hard drive, a removable media drive for handling compact disks, digital video disks, diskettes, magnetic tape cartridges, memory cards, and the like. In the present embodiment, the memory 202 includes a root cause identification module 103 stored in the form of machine-readable instructions on any of the above-mentioned storage media and may be in communication with and executed by the processing unit 201. When executed by the processing unit 201, the root cause identification module 103 causes the processing unit 201 to determine a root cause associated with an anomalous event. Method acts executed by the processing unit 201 to achieve the above-mentioned functionality are elaborated upon in detail in
The storage unit 203 may be a non-transitory storage medium that stores a database 102. The database 102 is a repository of information related to one or more events that may occur in the device 108. The input unit 205 may include one or more inputs such as, for example, a keypad, a touch-sensitive display, a camera (e.g., a camera receiving gesture-based inputs), etc. capable of receiving an input signal. The bus 207 acts as an interconnect between the processing unit 201, the memory 202, the storage unit 203, the network interface 104, the input unit 205, and the output unit 206.
Those of ordinary skill in the art will appreciate that the hardware depicted in
A device in accordance with an embodiment of the present disclosure includes an operating system employing a graphical user interface. The operating system permits multiple display windows to be presented in the graphical user interface simultaneously, with each display window providing an interface to a different application or to a different instance of the same application. A cursor in the graphical user interface may be manipulated by a user through a pointing device. The position of the cursor may be changed, and/or an event, such as clicking a mouse button, may be generated to actuate a desired response.
One of various commercial operating systems, such as a version of Microsoft Windows™, a product of Microsoft Corporation located in Redmond, Wash., may be employed if suitably modified. The operating system is modified or created in accordance with the present disclosure as described.
Disclosed embodiments provide systems and methods for analyzing event records. For example, the systems and methods may determine a root cause associated with an anomalous event.
Referring to
At act 402, at least one normal event occurring in the device 108 is identified. The event records generated for the device 108 may include information associated with events that are known to occur during a normal functioning of the device 108. Such events may be identified as normal events. Since the baseline is a depiction of normal events in the device 108, at least one normal event is identified in the device 108. The method acts for the identification of the at least one normal event are disclosed in further detail in
At act 404, a probability of occurrence of the event in the device 108 is determined based on the event ID. For example, the probability of the occurrence of the event in the device may be determined in machine days (e.g., in how many days the event may occur in the device 108). In a further embodiment, a statistical distribution associated with occurrence of the event in the device 108 is calculated. The statistical distribution may be, for example, a normal distribution, a lognormal distribution, or an exponential distribution. At act 405, an average number of occurrences of the event in the device 108 is determined based on the event ID. At act 406, a presence of a deviation in the occurrence of the event is identified based on the average number of occurrences of the event in the device 108. The presence of deviation or a low probability value may be an indication of an outlier in the probability of occurrence of the event in the device 108. The outlier may be an event that may not be a part of the normal functioning of the device 108 and therefore, may not be considered in the baseline. For example, a standard deviation and probability of occurrence value may be computed for the occurrence of the event in the device 108. If the presence of a deviation is identified or the probability value is very low, at act 407, one or more event records that may be outliers may be removed. In an embodiment, the outliers may be removed also based on the probability of occurrence of the event in the device 108. Further, at act 408, the baseline associated with the event records for the group of devices is computed based on the normal events identified for the group of devices.
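As a rough illustration of acts 404 to 408, the following Python sketch derives a per-event baseline from daily occurrence counts. The function name `compute_baseline`, the input layout (one count per machine day, pooled across the group of devices), the z-score cutoff, and the minimum probability threshold are all assumptions made for illustration; the disclosure does not prescribe them.

```python
from statistics import mean, pstdev


def compute_baseline(counts_by_event, z_cutoff=2.0, min_probability=0.1):
    """Return {event_id: {"mean", "std", "probability"}} for normal events.

    counts_by_event maps an event ID to a list of occurrence counts, one per
    machine day (hypothetical layout). Both thresholds are illustrative.
    """
    baseline = {}
    for event_id, counts in counts_by_event.items():
        if not counts:
            continue
        # Act 404 (sketch): fraction of machine days on which the event occurred.
        probability = sum(1 for c in counts if c > 0) / len(counts)
        if probability < min_probability:
            continue  # very low probability: treated as an outlier event
        # Act 405 (sketch): average number of occurrences per machine day.
        avg = mean(counts)
        std = pstdev(counts)
        # Acts 406-407 (sketch): drop days that deviate strongly from the average.
        if std > 0:
            kept = [c for c in counts if abs(c - avg) <= z_cutoff * std]
        else:
            kept = list(counts)
        if not kept:
            continue
        # Act 408 (sketch): baseline statistics from the remaining normal days.
        baseline[event_id] = {
            "mean": mean(kept),
            "std": pstdev(kept),
            "probability": probability,
        }
    return baseline
```

In this sketch, a single day with an unusually high count is removed before the baseline mean and standard deviation are computed, and an event seen on too few machine days is left out of the baseline entirely.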
Referring back to
(15−10)/2=2.5σ
(e.g., the average frequency of occurrence of the event in the device 108 in real-time is 2.5 standard deviations away from the baseline for the group of devices). This enables determination of priority associated with the event records for further investigation within a risk category. At act 305, the root cause associated with the anomalous event is determined based on the risk category and the priority associated with the event records.
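The deviation arithmetic above can be sketched in Python as follows. This is a minimal illustration that assumes the worked figure reads as an observed average frequency of 15, a baseline mean of 10, and a baseline standard deviation of 2; the function names and the (mean, standard deviation) tuple layout are hypothetical.

```python
def deviation_in_sigma(observed_mean, baseline_mean, baseline_std):
    """Distance of the observed frequency from the baseline, in standard deviations."""
    if baseline_std <= 0:
        raise ValueError("baseline_std must be positive")
    return (observed_mean - baseline_mean) / baseline_std


def rank_events(observed_means, baseline):
    """Order event IDs by absolute deviation from the baseline, largest first.

    observed_means maps event ID -> observed average frequency.
    baseline maps event ID -> (baseline mean, baseline standard deviation).
    """
    scored = {
        event_id: deviation_in_sigma(value, *baseline[event_id])
        for event_id, value in observed_means.items()
        if event_id in baseline
    }
    return sorted(scored.items(), key=lambda item: abs(item[1]), reverse=True)


print(deviation_in_sigma(15, 10, 2))  # 2.5
```

Ranking by absolute deviation is one possible way to order event records for investigation within a risk category; other orderings would fit the description equally well.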
At act 506, it is determined if the device 108 was under maintenance when the historical event records were generated. Such determination may be made based on the time stamp associated with the historical event records. If the device 108 was under maintenance when the historical event records were generated, one or more maintenance inputs are determined from the additional information associated with the historical event records. In an embodiment, the additional information may be derived from the historical event records using natural language processing. The maintenance inputs may include, for example, details associated with interaction of the user of the device 108 with a device maintenance executive. Additionally, the maintenance input may also include any action performed by the user of the device 108 on the device 108 after the occurrence of the historical event in the device 108. Further, a period of interaction with the user of the device 108 is determined at act 507, based on the maintenance inputs. All the historical event records that are generated during the period of interaction are discarded. Therefore, only normal events associated with the device 108 are collected. At act 508, normal event records associated with the normal events occurring in the device 108 are generated based on the identified normal historical records.
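The discarding of records generated during a period of interaction can be sketched as a simple time-stamp filter. The record layout (a dictionary with a `timestamp` key) and the representation of interaction periods as (start, end) pairs are assumptions for illustration; how the periods are derived from the maintenance inputs is not modeled here.

```python
from datetime import datetime


def drop_maintenance_records(records, interaction_periods):
    """Keep only records whose time stamp falls outside every interaction period.

    records: list of dicts with a "timestamp" datetime (hypothetical layout).
    interaction_periods: list of (start, end) datetime pairs, inclusive.
    """
    def during_interaction(ts):
        return any(start <= ts <= end for start, end in interaction_periods)

    return [r for r in records if not during_interaction(r["timestamp"])]


records = [
    {"event_id": "E1", "timestamp": datetime(2020, 1, 1, 9, 0)},
    {"event_id": "E2", "timestamp": datetime(2020, 1, 1, 12, 30)},
]
periods = [(datetime(2020, 1, 1, 12, 0), datetime(2020, 1, 1, 13, 0))]
normal_records = drop_maintenance_records(records, periods)  # only E1 remains
```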
A threshold value associated with the probability of occurrence may be defined based on a distribution curve associated with the real-time event records. Machine days may be the number of days of occurrence of the event in the device 108.
At act 603, a severity criteria associated with the real-time event records is determined based on the information associated with the real-time event records. The event records include information that may indicate a nature of the event associated with the event record. For example, the event record may include information such as ‘Error’, ‘Warning’ and/or ‘Information’, etc. Such information may be used to determine a severity level of the real-time event records. For each severity level, a severity score may be assigned. An embodiment of severity levels and associated severity scores is provided in the table below.
For example, ‘Miscellaneous’ level may include event records with information such as ‘Information’, ‘Success’, or event records with no information, etc. The severity score may be an indication of the severity of the real-time event record. For example, an event record with severity score 2 has greater severity than an event record with a severity score 1.
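A severity lookup of this kind might be sketched as below. The numeric scores and the keyword matching are hypothetical, since the table of severity levels and scores is not reproduced in this text; only the 'Miscellaneous' level and the rule that a higher score means greater severity are taken from the description.

```python
# Hypothetical scores; the actual table values are not available here.
SEVERITY_SCORES = {"Error": 3, "Warning": 2, "Miscellaneous": 1}


def severity_level(info):
    """Map the information field of an event record to a severity level."""
    text = (info or "").lower()
    if "error" in text:
        return "Error"
    if "warning" in text:
        return "Warning"
    # 'Information', 'Success', or no information at all.
    return "Miscellaneous"


def severity_score(info):
    """Return the (assumed) numeric score for an event record's information field."""
    return SEVERITY_SCORES[severity_level(info)]
```

For instance, `severity_score("Warning")` is greater than `severity_score("Success")` under these assumed scores, matching the ordering described above.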
At act 604, a risk matrix associated with the real-time event records is generated. In an embodiment, the risk matrix may be a combination of probability of occurrence of the real-time event and the severity criteria associated with the real-time event. An embodiment of a risk matrix is illustrated in
An advantage of the present embodiments is that the root cause associated with an event occurring in a device may be identified efficiently. The need for manually analyzing the event records to determine the root cause of the event is eliminated. Additionally, the method enables effective prioritization of the event records for systematic root cause analysis. Further, the method enables determination of not just fatal errors in the device 108 but also minor errors that may affect the functioning of the device 108. The baseline associated with the device 108 may be considered as a gold standard of events that are expected to occur in the device 108. This enables effective segregation of bad events that may occur in the device 108 from the normal events. The method also enables consideration of event records that may have been recorded for a group of devices, thereby enabling effective resolution of the error in the device 108.
The foregoing examples have been provided merely for the purpose of explanation and are in no way to be construed as limiting of the present invention disclosed herein. While the invention has been described with reference to various embodiments, the words, which have been used herein, are words of description and illustration, rather than words of limitation. Further, although the invention has been described herein with reference to particular means, materials, and embodiments, the invention is not intended to be limited to the particulars disclosed herein; rather, the invention extends to all functionally equivalent structures, methods, and uses, such as are within the scope of the appended claims. Those skilled in the art, having the benefit of the teachings of this specification, may effect numerous modifications thereto, and changes may be made without departing from the scope and spirit of the invention in its aspects.
The elements and features recited in the appended claims may be combined in different ways to produce new claims that likewise fall within the scope of the present invention. Thus, whereas the dependent claims appended below depend from only a single independent or dependent claim, it is to be understood that these dependent claims may, alternatively, be made to depend in the alternative from any preceding or following claim, whether independent or dependent. Such new combinations are to be understood as forming a part of the present specification.
While the present invention has been described above by reference to various embodiments, it should be understood that many changes and modifications can be made to the described embodiments. It is therefore intended that the foregoing description be regarded as illustrative rather than limiting, and that it be understood that all equivalents and/or combinations of embodiments are intended to be included in this description.