The present application relates generally to analysis of a cause of a data problem during the performance of a process. The application further relates to a method and a system to perform automated root cause analysis of data problems. The application further relates to the filing and processing of missing/bad data incident reports, and the leveraging of a historical store of such incidents.
In processes or process activities which are performed at least in part by computer applications, errors often occur owing to problems with data used in the performance of the process or process activity. Such data issues may be reported in a ticketing system that may assign incidents to particular persons or assignees to fix, also referred to herein as a data problem reporting system. In addition to fixing a particular instance of a data error, which may occasionally cause malperformance of a process activity, an assignee may wish to fix a cause of the data error that gave rise to malperformance of the associated activity, e.g., to fix a root cause of the problem.
Some embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings in which:
Example methods and systems to perform automated root cause analysis are described. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of example embodiments. It will be evident, however, to one skilled in the art that other embodiments may be practiced without these specific details.
According to an example embodiment, there is provided a system and method to perform root cause analysis, using one or more processors, of a data problem or data failure that may occur during the performance of a process.
The system includes at least one memory having stored thereon data dependency information that comprises, with respect to each of a plurality of entity attributes, information regarding process elements and/or process activities which contribute to the provisioning of data items which are instances of the respective entity attribute. In logical data modeling, an “entity” may be considered something of interest to an organization, and may thus be a type of thing being modeled, such as a person or a product. For example, a customer relations management (CRM) application may include the entity of “customer.” An “attribute” of an entity means something that further describes the entity, e.g., customer first name, customer last name, customer telephone number, etc.
A particular instance of an entity attribute, e.g., the name of a particular customer, stored in a memory or datastore during execution of a process, may be referred to herein as a data item. The data dependency information may thus indicate, with respect to at least a specific datastore, information regarding process elements and/or process activities which contribute to population of the specific datastore with data items which are instances of the respective entity attributes.
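By way of illustration only, the distinction between an entity attribute and a data item, and one possible shape of the data dependency information, may be sketched as follows. The class names, the datastore label "customer_datastore", and the contributor labels are hypothetical and are chosen only for this example; they do not form part of the described system.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class EntityAttribute:
    """An attribute of a modeled entity, e.g. Customer.email_address."""
    entity: str
    attribute: str


@dataclass(frozen=True)
class DataItem:
    """A concrete instance of an entity attribute stored in a datastore."""
    attribute: EntityAttribute
    datastore: str
    record_key: str
    value: Optional[str]  # None models an unavailable (missing) data item


# Hypothetical data dependency information: for each (datastore, entity
# attribute) pair, the process elements and activities that help populate it.
DataDependencies = Dict[Tuple[str, EntityAttribute], List[str]]

CUSTOMER_EMAIL = EntityAttribute("Customer", "email_address")

data_dependencies: DataDependencies = {
    ("customer_datastore", CUSTOMER_EMAIL): [
        "manual entry at new customer account provisioning",
        "CRM datastore synchronization",
    ],
}

# A missing e-mail address for one particular (hypothetical) customer record.
item = DataItem(CUSTOMER_EMAIL, "customer_datastore", "customer-001", None)
contributors = data_dependencies[(item.datastore, item.attribute)]
```

In this sketch the dependency information is keyed by the entity attribute rather than by the individual data item, so that a single entry serves every instance of that attribute in the datastore.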
The system may include an issue report module to receive a data problem report indicative of the occurrence of a data problem or failure during the performance of the process, the data problem comprising unavailability or incorrectness of a problematic data item, and the data problem report including at least one descriptor to identify the problematic data item. The data problem report may also include at least one activity descriptor identifying the particular process activity or activities in which the data problem was encountered. An error which manifests as a process failure or process activity failure (e.g., generation of an invoice or an e-mail with incorrect customer information) may be caused by an underlying data problem (e.g., a problematic data item that is an incorrect customer name entity attribute in a relevant datastore from which the data item is retrieved in the execution of the process activity). The system may include a computer including a root cause analysis engine to perform automated root cause analysis based at least in part on the data dependency information and on at least one descriptor of the problematic data item, to identify at least one potential cause of the data problem. In some embodiments, identification of potential causes of the data problem may comprise identifying the particular process elements and/or process activities that contribute to the provisioning of the problematic data item, as indicated by the data dependency information.
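The identification of potential causes from a data problem report may be sketched, under stated assumptions, as a lookup against the stored dependency information. The report fields, the two lookup tables, and all labels below are hypothetical stand-ins; the narrowing by activity is one possible use of the activity descriptor and is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DataProblemReport:
    entity_id: str
    attribute_id: str
    activity: Optional[str] = None  # activity in which the problem surfaced


# Hypothetical: the datastore read by each process activity.
ACTIVITY_DATASTORE: Dict[str, str] = {"shipment notification": "reference_store"}

# Hypothetical: contributors to each (datastore, entity, attribute) triple.
CONTRIBUTORS: Dict[Tuple[str, str, str], List[str]] = {
    ("reference_store", "Customer", "email_address"): [
        "user input at account provisioning",
        "synchronization from CRM datastore",
    ],
    ("billing_store", "Customer", "email_address"): ["billing import job"],
}


def potential_causes(report: DataProblemReport) -> List[str]:
    """Return process elements/activities that provision the problematic item."""
    datastore = ACTIVITY_DATASTORE.get(report.activity) if report.activity else None
    causes: List[str] = []
    for (store, entity, attribute), elements in CONTRIBUTORS.items():
        if (entity, attribute) != (report.entity_id, report.attribute_id):
            continue
        # Narrow to the datastore used by the reported activity, when known.
        if datastore is not None and store != datastore:
            continue
        causes.extend(elements)
    return causes
```

When the report carries no activity descriptor, or names an activity that is not in the table, the sketch falls back to listing contributors for every datastore holding the attribute.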
By “process element” is meant any element involved in the performance of an associated process, including IT hardware, IT applications, human resource components, datastores, physical elements, events, and the like. The term “data” as used herein refers to any information items that a process may depend upon or utilize and is to be interpreted broadly as including master data, reference data, transaction data, event data, analytical data, meta-data, text or binary content, and the like.
An Application Program Interface (API) server 114 and a web server 116 are coupled to, and provide programmatic and web interfaces respectively to, one or more application servers 118. The application servers 118 host one or more process management applications 120. In some examples, process management may be performed with respect to the totality of an organization's activities, in which case the process management system 102 may be an enterprise management system. The application server(s) 118 are, in turn, shown to be coupled to one or more database server(s) 124 that facilitate access to one or more database(s) 126.
The system 102 is also in communication with a process system or enterprise system 140 which supports a process that is managed by the process management system 102. Some activities of the process which are supported by the enterprise system 140 may be automated or semi-automated activities or processes executed by computers forming part of the enterprise system 140, as explained in further detail below. In some examples, the process management system 102 may provide process modeling functionality to model the process supported by the enterprise system 140, in which case the process management applications 120 may include process model application(s) (e.g., business process models (BPM)). The process management application(s) 120 may be in communication with components of an IT system of the enterprise, in particular being in communication with a number of process servers 142, 144 forming part of the IT infrastructure of the client enterprise system 140. Each of the process servers 142, 144 supports one or more process applications 146, 148, each process application 146, 148 providing functionalities employed in the performance of an associated activity or process supported by the enterprise system 140. It will be appreciated that the enterprise system 140 may typically comprise a greater number of process servers 142 and process databases 150 than those illustrated in
In the example illustrated with reference to
Each process server 142, 144 may be in communication with one or more associated database(s) or process datastore(s) 150, 152, to read and/or write associated process data to the respective process database(s) 150, 152. The FFS server 144, and hence the FFS application 148, may, for example, be in communication with a datastore in the form of a Global Reference Data System (GRDS) 152. Although shown in
If a retrieved e-mail address in such an example process activity is incorrect, or if the relevant data item is not present in the GRDS 152, then the required notification(s) might not be sent, or might be sent to an incorrect address, which would constitute malperformance of the process activity and would thus result in a process failure. The cause of this example process failure is the absence or incorrectness of the relevant data item or attribute value.
Data may be provided to the GRDS 152 by one or more process activities and/or datastores external to the GRDS 152. Such provisioning of data to the GRDS 152 may, for example, include data transfer, data update, and/or data synchronization between the GRDS 152 and other system database(s) 126 such as, for example, a customer relationship management (CRM) datastore, an accounting datastore, an asset management datastore, a human resources (HR) datastore, and the like. Each of these uploads, transfers, and/or synchronizations may be executed or managed by a respective process application 146.
Some data items may instead, or in addition, be provided to the GRDS 152 by one or more data gathering activities, in which attribute values or data items are entered into the system 140 by manual user input. A data problem with respect to the GRDS 152, such as explained above with respect to the sending of incorrect e-mail notifications, may be caused by non-performance of user input into the GRDS 152 at the time of new customer account provisioning, or may be caused by an error in one of the databases 160, 162 which provision the GRDS 152, or may be caused by malperformance or nonperformance of, for example, an updating, transferring, or synchronization activity performed by a respective application 164, 168. A user providing input at, for example, the time of new customer account provisioning, also constitutes a “process element” as used herein.
The process management application(s) 120 may provide a number of functions and services to users that access the process management system 102, for example providing analytics, diagnostic, predictive and management functionality relating to system architecture, processes, and activities of the enterprise supported by the enterprise system 140. Respective modules for providing these functionalities are discussed in further detail with reference to FIG. 2 below. In the present description, for clarity of describing mechanisms providing pertinent functionality, the mechanisms will be described in terms of various “modules.” These modules may be implemented in software, firmware or hardware, but the description of different modules does not mean or in any way suggest that the mechanisms that provide the described functionality are separate from one another in any way. For example, the various “modules” might all be implemented in software, through executable instructions stored in a single machine-readable mechanism, with no separation whatsoever as to the functionality provided by the separate instructions. While all of the functional modules, and therefore all of the process management application(s) 120 are shown in
Further, while the client-server system 100 shown in
The web client 106 accesses the process management application(s) 120 via the web interface supported by the web server 116. Similarly, the programmatic client 108 accesses the various services and functions provided by the process management application(s) 120 via the programmatic interface provided by the API server 114.
The process management system 102 may therefore provide a number of modules to facilitate automated root cause analysis of data problems or data failures during the performance of a process. The process management application(s) 120 may thus include an incident filing module or issue report module 204 to facilitate the filing and enhancement of data failure reports or data problem reports, also referred to herein as incident tickets. Each data problem report is indicative of the occurrence of a data problem during the performance of a process, such as, for example, the incorrectness or unavailability of a data item with respect to an e-mail address to be used by the FFS application 148 (
The data problem report may further include a failure type identifier to identify an associated type of data problem and thereby to specify the nature of the problem. The failure type identifier may, for example, indicate: that a data item, such as an attribute value or entity information, is missing; that an attribute value is outdated; or that an attribute value is incorrect. The data problem report may, in the case of an incorrect or outdated attribute value, include a suggested value for the incorrect or outdated data item.
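As a minimal sketch only, a data problem report carrying a failure type identifier and an optional suggested value might be represented as shown below. The field names are hypothetical. The check that rejects a suggested value for a missing data item is a stricter reading assumed for this example; the description itself states only that a suggested value may accompany an incorrect or outdated attribute value.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FailureType(Enum):
    MISSING = "missing"
    OUTDATED = "outdated"
    INCORRECT = "incorrect"


@dataclass
class DataProblemReport:
    entity_id: str
    attribute_id: str
    failure_type: FailureType
    suggested_value: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption of this sketch: a suggested replacement value is accepted
        # only when the reported value is incorrect or outdated.
        if (
            self.suggested_value is not None
            and self.failure_type is FailureType.MISSING
        ):
            raise ValueError(
                "suggested_value applies only to incorrect or outdated values"
            )


report = DataProblemReport(
    "Customer", "email_address", FailureType.OUTDATED, "new@example.com"
)
```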
The data problem report may yet further include at least one activity descriptor or activity identifier to identify a particular activity of the process during which the associated data problem occurred or was encountered. The activity descriptor may identify an activity explicitly, for example indicating that the data problem occurred during account generation, shipment notification, or the like. Instead, or in addition, the at least one activity indicator may indicate a particular application that encountered the data problem during its execution, for example identifying the FFS application 148. In a further embodiment, the at least one activity indicator may instead, or in addition, identify a particular employee, employee machine, or location at which the data problem was encountered. In such an embodiment, the system may include information mapping performance of particular activities to corresponding employees, employee machines, and/or locations, and the method may comprise automatically identifying the activity during which the data problem was encountered with reference to such information, based on the employee, employee machine, and/or location indicated in the data problem report. In some embodiments, physical infrastructure dependency information 307, HR dependency information 306 and the IT system dependency information 304 (see
The issue report module 204 may provide a user interface, typically a graphical user interface (GUI), to facilitate the filing of data problem reports. The issue report module 204 may effectively force a user who submits or generates a data problem report or incident ticket to provide information identifying the problematic data item. The data problem report may also include at least one activity descriptor to identify a particular process activity in which the associated data problem occurred or was encountered. To this end, the issue report module 204 may limit the entry of information with respect to the problematic data item to values selected by the user from a predetermined list of options, and/or by requiring the entry of information with respect to at least a minimum number or set of data fields in order for the data problem report to be lodged. In an example embodiment, the GUI provided by the issue report module 204 may include drop-down menus for respective data fields, the user's entry options with respect to such data fields being limited to the options provided in the drop-down menus. The drop-down menus provided with respect to the different data fields may be interrelated and may be dynamically context-sensitive. When, for example, the user selects an entity identifier for a particular entity from a drop-down menu with respect to entity identifiers, a drop-down menu with respect to an attribute identifier may display a list of options limited to the entity corresponding to the selected entity identifier. The issue report module 204 may populate such option lists or drop-down menus based on an enterprise data model (EDM) 340 (see
The process management application(s) 120 may further include a root cause analysis engine 208 to perform and/or facilitate automated root cause analysis of data problems or data failures. The root cause analysis engine 208 may perform root cause analysis to identify potential root causes of the data problem represented by the problematic data item identified in the data problem report, based on data dependency information 308 with respect to the relevant process activity managed by process management applications 120, and/or based on information regarding earlier data problem reports in the form of historical incident records 352 stored in a data issue repository 350 (see
The EDM 340 further includes an attribute list 344 in association with the entity list 342. The attribute list 344 provides a set of attributes associated with each of the entities listed in the entity list 342. It will be appreciated that different entities have different associated attributes. For example, the set of attributes which apply to the entity “Customer” will be different from a set of attributes which apply to an entity “Order.” Each entity in the entity list 342 may therefore have a corresponding set of attributes in the attribute list 344.
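The relationship between the entity list and the attribute list, and its use to restrict the options offered when a data problem report is filed, may be illustrated by the following sketch. The EDM content and function names are hypothetical and serve only as an example.

```python
from typing import Dict, List

# Hypothetical EDM content: each listed entity maps to its own attribute list.
EDM: Dict[str, List[str]] = {
    "Customer": ["first_name", "last_name", "telephone_number", "email_address"],
    "Order": ["order_number", "order_date", "shipping_address"],
}


def entity_options() -> List[str]:
    """Options for an entity identifier menu."""
    return sorted(EDM)


def attribute_options(selected_entity: str) -> List[str]:
    """Attribute choices limited to the selected entity; empty if unknown."""
    return list(EDM.get(selected_entity, []))


def is_valid_selection(entity: str, attribute: str) -> bool:
    """True only if the attribute belongs to the entity in the EDM."""
    return attribute in EDM.get(entity, [])
```

Because the attribute options are derived from the selected entity, a report cannot pair an attribute with an entity to which it does not apply.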
The data dependency information 308, and in particular the data flow dependency information 313, may be linked to the attribute list 344, thus providing a set or listing of process elements and/or process activities on which provisioning of the corresponding data items is dependent. The data flow dependency information 313 may include not only process elements and/or process activities which contribute directly to the flow or provisioning of the respective entity attribute into an associated datastore, but may also include process activities and/or process elements which contribute indirectly to the availability and/or correctness of the associated entity attribute.
By “process element” is meant any element of the process system, including IT hardware, IT applications, human resource components, datastores, physical elements, events, and the like. The data flow dependency information 313 may thus include a listing of not only hardware components such as databases, servers, software applications, communication links, and the like, that contribute to the flow of data items that are instances of the corresponding attribute in the attribute list 344, but may also include a listing of process activities or events that contribute to the flow of such data items into a specific datastore. With reference to the example embodiment illustrated in
Referring to
Referring back to
To this end, the root cause analysis engine 208 may include a data issue query module 209 to interrogate the data issue repository 350. The data issue query module 209 may compare one or more descriptors (such as an entity identifier and/or an attribute identifier) included in the data problem report to the data item descriptors 358, and may identify similar incident record(s) 352 based on similarity between the descriptors of the data problem report and the data item descriptor(s) 358 of the corresponding incident record(s) 352. In response to identifying an incident record 352 matching the data problem report, the data issue query module 209 may provide to the user the RCA results 354 and/or the remediation script 356 corresponding to the matching incident record 352. The data issue query module 209 further provides query services functionality that comprises a set of services which provide programmatic access to the EDM 340 and to the data issue repository 350. A user may thus access the EDM 340 and the data issue repository 350 to view, edit, and/or enter data therein.
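One simple way of comparing the descriptors of a new data problem report with those of historical incident records is sketched below. The scoring rule (2 for a match on both entity and attribute, 1 for a match on entity only) is an assumption made for illustration, as are the record fields and the sample repository contents; the description does not prescribe a particular similarity measure.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class IncidentRecord:
    entity_id: str
    attribute_id: str
    rca_results: str
    remediation_script: Optional[str] = None


def similarity(entity_id: str, attribute_id: str, record: IncidentRecord) -> int:
    """Score 2 for a match on entity and attribute, 1 for entity only, else 0."""
    if record.entity_id != entity_id:
        return 0
    return 2 if record.attribute_id == attribute_id else 1


def find_similar(
    entity_id: str,
    attribute_id: str,
    repository: List[IncidentRecord],
    min_score: int = 1,
) -> List[Tuple[int, IncidentRecord]]:
    """Return (score, record) pairs at or above min_score, best match first."""
    scored = [(similarity(entity_id, attribute_id, r), r) for r in repository]
    matches = [pair for pair in scored if pair[0] >= min_score]
    return sorted(matches, key=lambda pair: pair[0], reverse=True)


# Hypothetical historical incident records.
repository = [
    IncidentRecord("Customer", "telephone_number", "stale CRM export"),
    IncidentRecord(
        "Customer", "email_address", "missed synchronization", "resync_customer_email"
    ),
    IncidentRecord("Order", "order_date", "time zone conversion"),
]

best_score, best_record = find_similar("Customer", "email_address", repository)[0]
```

The RCA results and any remediation script of the best-scoring record can then be presented to the user who filed the report.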
The process management application(s) 120 may also include a remediation module 218 to retrieve a corresponding remediation script 356 from the data issue repository 350 in the event that the root cause analysis engine 208 identifies an incident record 352 matching the data problem report. The remediation module 218 may in such case execute the remediation script 356 to fix or alleviate the root cause of the data problem.
A GUI module 200 may be configured to provide a management console for the administration of the EDM 340 and the data issue repository 350. A user or administrator may thus add, modify, or delete entities or entity attributes in the EDM 340; create, update, or delete entity-attribute mappings in the EDM 340; and update the data issue repository 350 with previously identified root causes of data problems, together with corresponding remediation scripts. The process management application(s) 120 may administer the EDM 340 and/or the data issue repository 350 in an automated and dynamic fashion to enforce consistency in the EDM 340 in response to the addition, modification, or deletion of EDM entries, so that, for example, deletion of a particular entity from the EDM 340 automatically results in the deletion of entity attributes in the EDM 340 corresponding to the deleted entity. Such automatic and dynamic data management may also be enforced system-wide. For example, relationships between logical process model information 310 and dependency information 302 may automatically be harmonized. Changes to a logical process model 312, to physical infrastructure dependency information 307, to HR dependency information 306, and/or to IT system dependency information 304 may for example automatically result in corresponding changes in data dependency information 308 in general, and the data flow dependency information 313, for example, in particular. These components are described in greater detail below with reference to
A report module 224 may be provided to generate reports with respect to incidents recorded in the data issue repository 350. Such reports may be generated in response to user requests. The report module 224 may report and analyze incidents logged in the data issue repository 350, thereby assisting an enterprise in identifying underlying process issues that may cause data problems, rather than fixing incidents on an ad hoc basis only. Such reports may also assist in identifying and fixing shortcomings or design flaws in existing processes that may cause data problems.
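A report of this kind may, for example, surface root causes that recur across logged incidents. The following is a minimal sketch that assumes each logged incident carries a root cause label; the labels and the threshold parameter are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def recurring_root_causes(
    root_causes: Iterable[str], min_count: int = 2
) -> List[Tuple[str, int]]:
    """Return (root cause, incident count) pairs seen at least min_count times,
    most frequent first."""
    counts = Counter(root_causes)
    return [(cause, n) for cause, n in counts.most_common() if n >= min_count]


# Hypothetical root cause labels taken from logged incidents.
logged = [
    "missed CRM synchronization",
    "manual entry omitted",
    "missed CRM synchronization",
    "missed CRM synchronization",
    "manual entry omitted",
    "server outage",
]
summary = recurring_root_causes(logged)
```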
The system 102 may include process modeling functionality to build and/or edit the process model with respect to a process supported by the enterprise system 140. Process model information with respect to such a process may be used to provide the data dependency information that indicates, with respect to each entity attribute, process elements and/or process activities that contribute to the provisioning of data items. To this end, the application(s) 120 are shown to include at least one default process model module 216 to provide default process models. In instances where the process model is in respect to a business enterprise, the default process model module 216 may provide default business process models (BPMs) which are to serve as bases for a user to define a business process model specific to the enterprise system 140. The default BPMs may be predefined by a supplier of the business process management application(s) 120 and are in respect to generic business processes relating to a variety of types of businesses or types of business activities. A user may thus, as a starting point for defining an enterprise-specific BPM, select one or more default process models which most closely approximate the business processes performed by the enterprise system 140. The default process model module 216 may typically provide default logical process models indicating a series of activities, without specific operationalization information indicating particular process elements or support elements on which the activities are dependent. The term “logical process model” refers to the depiction, specification, or mapping of a series of activities of an associated process, excluding process operationalization elements, e.g., IT system components, human resource information, and data dependency information.
The term “process” as used herein comprises a series of activities to produce a product or to perform a service, and is to be interpreted broadly as including a process group, a sub-process, or any collection of processes. Therefore, the totality of activities and/or processes which may be performed in an enterprise may also be referred to as a process. In instances where the process model information is therefore with respect to an enterprise, such as a business enterprise, the process model information may thus be in the form of an enterprise model.
A model building/editing module 206 may be provided to enable a user or administrator to define an enterprise-specific process model, either by editing, adapting, or building on a selected default enterprise model, or by building an enterprise model from scratch. The model building/editing module 206 also enables the editing of the enterprise model in response to changes in the enterprise system 140 or the associated processes. As mentioned above, such an enterprise model is a process model which may represent sequences and relationships of business processes and business process activities, as well as relationships of such business process activities to information technology (IT) infrastructure, process applications 146, 148, and process data. The process model information may comprise, at least: a logical process model defining a plurality of activities forming part of the process, the logical process model specifying relationships between the respective activities; IT system dependency information indicative of dependency of respective activities on associated IT system elements, the IT system dependency information including datastore dependency information indicative of one or more datastores which may be accessed in execution of respective activities; and data dependency information indicative of dependency of process activities on data in the one or more datastores which may be accessed in execution of respective activities. As is explained in greater detail below with reference to
The process management application(s) 120 may include a data integration module 222 to integrate information with respect to data dependency contained in various databases 126. For example, the data integration module 222 may be configured to integrate the process model information with the EDM 340 and with the data dependency information 308. In one embodiment, the data integration module 222 may automatically compile some aspects of the data dependency information 308 based on relevant related aspects of the data dependency information 308 and the logical process model information 310.
The process management application(s) 120 may further include a data gathering module (not shown) to gather and collate information regarding the performance of respective processes and/or activities. To this end, the data gathering module may cooperate with monitoring applications (not shown) installed in each of the process servers 142, 144 and/or client machines (not shown) forming part of the enterprise system 140. The system 102 may thus gather and record information regarding activities performed by respective elements forming part of the enterprise system 140. A data event such as data synchronization, data collation, or data transfer between two data repositories may be logged or recorded to facilitate tracking or monitoring of performance of the associated business activities, and to facilitate the identification of exceptions in such process logs which may indicate non-performance of a process activity that potentially may give rise to a particular problematic data item or data problem. Further data which may be gathered may include error data generated in response to unscheduled unavailability of applications or infrastructure elements.
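The identification of exceptions in such a process log may be sketched as a search for gaps between logged data events that are longer than expected. The function name, the expected interval, the tolerance, and the timestamps below are all hypothetical and are used only to make the example concrete.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def find_sync_gaps(
    event_times: List[datetime],
    expected_interval: timedelta,
    tolerance: timedelta = timedelta(0),
) -> List[Tuple[datetime, datetime]]:
    """Return (previous, next) pairs of logged events that are further apart
    than the expected interval plus tolerance."""
    ordered = sorted(event_times)
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > expected_interval + tolerance:
            gaps.append((previous, current))
    return gaps


# Hypothetical log of hourly synchronization events with two runs missing.
sync_log = [
    datetime(2024, 1, 1, 0, 0),
    datetime(2024, 1, 1, 1, 0),
    datetime(2024, 1, 1, 4, 0),
    datetime(2024, 1, 1, 5, 0),
]
gaps = find_sync_gaps(sync_log, timedelta(hours=1), timedelta(minutes=5))
```

Each reported gap marks a period in which a scheduled data event appears not to have been performed, and is therefore a candidate explanation for a data item that was missing or outdated during that period.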
The logical process model 312 references failure definitions 314 which may include service-level agreements (SLAs) 316 and key performance indicators (KPIs) 318. The failure definitions 314, SLAs 316, and KPIs 318 may be user-specified.
It will be appreciated that the logical process model information 310 and the dependency information 302 together provide process model information (or enterprise model information) defining a process architecture for the enterprise system 140, the process architecture comprising, on the one hand, the processes and activities defined by the logical process model 312, and, on the other hand, information on the operationalization of the processes and activities as defined by the dependency information 302.
Thus, the databases 126 may include dependency information 302 in process dependency repositories, the dependency information 302 comprising structured information regarding dependencies of respective processes and/or process activities of the enterprise model. The dependency information 302 includes IT system dependency information 304 that comprises information regarding process dependency on IT system elements of the enterprise system 140. The IT system dependency information 304 may thus include information regarding dependency of processes or activities on software such as process applications 146, 148, as well as dependency on IT infrastructure. In this regard, IT infrastructure refers to the configuration and arrangement of hardware forming part of the enterprise system 140. IT infrastructure information may thus include the properties, statuses, configuration, and relationships of hardware components such as particular servers, machines, and/or interfaces in the enterprise system 140. The term IT system includes the IT infrastructure and software or process applications 146, 148 supported by the IT infrastructure. The IT system dependency information 304 also includes datastore dependency information indicative of relationships between respective activities and datastores which are accessed in performance of the respective activities. As used herein, datastore dependency is distinct from data dependency. Datastore dependency is concerned with whether or not a particular datastore is available and/or operational, while data dependency is concerned with the availability and/or quality of data in an operational and available datastore. In other words, data dependency relates to the availability and/or quality of data in a datastore, assuming that the datastore is fully operational. Thus, for example, the failure of a server on which a datastore is hosted, or the failure of a data link to a datastore, will be related to datastore dependency. 
In contrast, for example, the absence of particular required records or data fields in a datastore, even when the datastore is fully operational; the quality of data in the datastore; the failure of data transfers into the datastore; or the failure of data synchronization between the datastore and another datastore will be related to data dependency.
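The distinction drawn above between datastore dependency and data dependency may be expressed as a simple classification. This is a sketch under the assumption that the operational state of the datastore and the state of its data can each be observed as a boolean; the names are hypothetical.

```python
from enum import Enum
from typing import Optional


class DependencyKind(Enum):
    DATASTORE = "datastore dependency"
    DATA = "data dependency"


def classify_failure(
    datastore_operational: bool, data_available_and_correct: bool
) -> Optional[DependencyKind]:
    """Return the kind of dependency implicated by a failure, or None if the
    datastore is operational and its data is available and correct."""
    if not datastore_operational:
        # E.g. failure of the hosting server or of a data link to the datastore.
        return DependencyKind.DATASTORE
    if not data_available_and_correct:
        # E.g. missing records, poor data quality, failed transfer or sync.
        return DependencyKind.DATA
    return None
```

The order of the checks reflects the statement that data dependency is assessed on the assumption that the datastore itself is fully operational.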
The IT system dependency information 304 enables the generation of an interactive GUI displaying those process applications and process servers on which a selected process or process activity is dependent.
The dependency information 302 may further include human resources dependency information 306 in which is stored structured information regarding the dependency of respective processes or process activities on particular human resource components, such as people or personnel. The HR dependency information 306 may for example specify the job role or personnel department responsible for the performance of a particular process activity.
Physical infrastructure dependency information 307 may also be included in the dependency information 302 to indicate the dependency of respective process activities on physical infrastructure components. Such physical infrastructure components may include, for example, vehicles, machinery, supply-chain elements, buildings, and the like.
The dependency information 302 also includes data dependency information 308. The data dependency information 308 may include data quality dependency information 309 and data availability dependency information 311. The data quality dependency information 309 indicates dependency of process activities on quality of data in respective datastores, such as the databases 150 and 152 (
The data availability dependency information 311 is indicative of dependency of process activities on the availability of data in the one or more datastores which may be accessed in execution of respective activities. The data availability dependency information 311 may, for instance, include data flow dependency information 313 indicative of dependency of one or more direct datastores (that is, datastores which may be directly accessed during performance of the associated process activities) on associated process elements for data flow into the respective datastores. The data flow dependency information 313 may therefore be indicative of process elements contributing to the flow of data into one or more direct datastores, as well as dependency of respective process activities on the flow of data into the respective direct datastores. The data flow dependency information 313 may be with respect to process elements which contribute directly or indirectly to data flow into respective direct datastores, and may thus include information regarding data flow into indirect datastores. In other words, the data flow dependency information 313 may comprise information regarding process elements, such as process applications, process servers, personnel, and/or business processes or activities which contribute to the flow of data into respective datastores accessed during performance of associated activities/processes, and upon which such activities/processes are therefore dependent for the availability and/or quality of data. It is to be appreciated that explicit dependencies or datastore dependencies are defined as part of the IT system dependency information 304, while data flow dependencies are defined as part of the data dependency information 308. As used herein, “explicit dependency” or “direct dependency” of an activity means that an associated process element contributes directly to performance of the activity, and is to be distinguished from data dependency.
Consider, for example, an activity that is performed by an application which accesses a particular datastore during execution of the application, while data in the particular datastore is, e.g., periodically synchronized with a master datastore. In such case, the activity will have a direct or explicit dependency on the particular datastore which is accessed during execution of the application, and will be data dependent on the master datastore, in particular being data flow dependent on the master datastore. The term “data dependent” means that a particular process element contributes to the availability and/or quality of data in general or of a particular data element, such as an entity attribute, in a datastore, and that failure or absence of the particular process element may affect the availability and/or quality of data in the datastore. Likewise, the term “data flow dependent” means that a particular process element contributes to the flow of data into a particular datastore, and that failure or absence of the particular process element may affect the flow of data into the particular datastore. In this regard a process element may include, for example, a data source, an IT infrastructure component, a process application, a process event, a human resources component, or the like. The term “datastore” means any repository or memory on which data is stored, and may include internal memory forming part of a device contributing to performance of an activity, as well as external databases.
It is to be appreciated that, in the above example, the activity will not be datastore dependent on the master datastore, so that the relationship between the activity or application and the master datastore will not form part of datastore dependency information as a subset of IT system dependency information 304 with respect to the activity, but will be included in data dependency information 308 with respect to the activity. In particular, the relationship between the activity or application and the master datastore may in such case form part of the data flow dependency information 313, being a subset of the data availability dependency information 311.
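The distinction drawn in this example can be sketched as a small data structure. The following Python sketch is illustrative only: the class name `ActivityDependencies`, its field names, and the function `dependency_kind` are assumptions introduced here and do not appear in the described system.

```python
from dataclasses import dataclass, field


@dataclass
class ActivityDependencies:
    """Hypothetical per-activity dependency record (names are illustrative)."""
    activity: str
    # Explicit (datastore) dependencies, held with IT system dependency information 304.
    datastore_dependencies: set = field(default_factory=set)
    # Data flow dependencies, held with data flow dependency information 313.
    data_flow_dependencies: set = field(default_factory=set)


def dependency_kind(record: ActivityDependencies, element: str) -> str:
    """Classify how the activity depends on a given process element."""
    if element in record.datastore_dependencies:
        return "explicit"
    if element in record.data_flow_dependencies:
        return "data flow"
    return "none"


# The example above: the application reads one datastore directly, and that
# datastore is periodically synchronized with a master datastore.
example = ActivityDependencies(
    activity="example activity",
    datastore_dependencies={"particular datastore"},
    data_flow_dependencies={"master datastore"},
)
```

In this sketch, `dependency_kind(example, "master datastore")` yields `"data flow"`, reflecting that the master datastore is not part of the datastore dependency information for the activity.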
The provision of the data availability dependency information 311 permits the identification or prediction of the effect of failure or unavailability of a particular IT infrastructure element or process application not only on processes or process activities which are directly dependent on the failed IT infrastructure element or process application, but also on processes or activities which are not directly dependent on the failed element or application, but which are dependent on it for the flow of data into datastores which are accessed directly during execution of the process or activity.
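One way to picture this kind of impact prediction is as a traversal over data flow relationships. The sketch below is a minimal illustration under stated assumptions: the dictionaries `FEEDS` and `ACCESSED_BY`, the element names in them, and the function `impacted_activities` are hypothetical and serve only to show how directly and indirectly affected activities might be separated.

```python
from collections import deque

# Hypothetical data flow edges: element -> elements or datastores it feeds.
FEEDS = {
    "master datastore": ["synchronizing application"],
    "synchronizing application": ["local datastore"],
}

# Hypothetical direct access: datastore -> activities that read it directly.
ACCESSED_BY = {
    "master datastore": ["master data maintenance"],
    "local datastore": ["invoicing"],
}


def impacted_activities(failed_element):
    """Return (directly impacted, indirectly impacted) activity sets."""
    direct = set(ACCESSED_BY.get(failed_element, []))
    indirect = set()
    seen = {failed_element}
    queue = deque(FEEDS.get(failed_element, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        # Activities reading a downstream datastore depend on the failed
        # element only for the flow of data into that datastore.
        indirect.update(ACCESSED_BY.get(node, []))
        queue.extend(FEEDS.get(node, []))
    return direct, indirect - direct
```

With these assumed edges, a failure of the master datastore affects the hypothetical "invoicing" activity only indirectly, through the flow of data into the datastore that invoicing accesses.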
The data availability dependency information 311 may further include data element dependency information 315, which comprises information regarding dependency of respective activities on particular data elements in the one or more datastores which may be accessed in execution of respective activities. Such data element dependency information 315 may thus, for example, indicate particular data items, such as entity attributes, on which respective activities are dependent. The data element dependency information 315 may be in respect of dependency on a particular attribute for execution of a process activity in general. In an invoicing activity, for example, data element dependency information 315 may indicate that the process activity is dependent on the presence or availability in the associated datastore of a value for the client account code attribute.
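The invoicing example can be illustrated with a short check against data element dependency information. This is a sketch only: the mapping `REQUIRED_ATTRIBUTES`, the attribute key `client_account_code`, and the rule that `None` or an empty string counts as missing are assumptions made for illustration.

```python
# Hypothetical data element dependency information: activity -> attributes
# whose values must be present for the activity to execute.
REQUIRED_ATTRIBUTES = {
    "invoicing": ["client_account_code"],
}


def missing_attributes(activity, record):
    """List required attributes that have no usable value in the record."""
    required = REQUIRED_ATTRIBUTES.get(activity, [])
    # Assumption: None and the empty string both count as "no value".
    return [name for name in required if record.get(name) in (None, "")]
```

For example, `missing_attributes("invoicing", {"client_name": "Customer X"})` reports that the client account code is absent, whereas a record carrying a value for that attribute produces an empty list.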
Root cause analysis for a data problem encountered during the performance of a process activity may be performed based on data dependency information contained in dependency information 302 forming part of the process model information. The data dependency information 308 may be analyzed to identify which of the listed data dependencies was reported as the encountered problem. Once a match is found, root cause analysis may be performed based on the data flow dependency information 313 to identify the process elements whose failure may have caused the data problem.
It will be appreciated that the logical process model information 310 and the dependency information 302 together provide process model information (or enterprise model information) defining a process architecture for the enterprise system 140, the process architecture comprising, on the one hand, the processes and activities defined by the logical process model 312, and, on the other hand, information on the operationalization of the processes and activities as defined by the dependency information 302.
The process management system 102 further comprises historical data 320 indicative of past performance of processes defined in the logical process model 312, as well as being indicative of the latest state of process elements and data in respective datastores. The historical data 320 may preferably be gathered in real-time or near real-time, optionally being gathered upon performance of the respective processes or process activities. Instead, or in combination, the historical data 320 may be gathered at predefined times or intervals. Historical data 320 may include applications failure history 322 indicative of failure of process applications 146, 148, as well as IT infrastructure failure history 324 indicative of past failure of IT infrastructure elements, such as process servers 142, 144. The historical data 320 may further include physical infrastructure failure history 327 with respect to failure of physical infrastructure elements, such as vehicles, machinery, and the like. Human resource performance history 323 may also form part of the historical data 320 to provide information regarding historical performance of particular human resource components such as personnel, personnel departments, operational units, and the like. The historical data 320 may also include data flow history 332, which comprises historical information with respect to the flow of data elements into respective datastores forming part of the enterprise system 140. The data flow history 332 may, for example, include process activity logs for updating and/or synchronizing activities performed by the updating application 168 and the synchronizing application 164, respectively, of
As illustrated in
An exemplary method will now be described with reference to
The method 600 commences when an incident ticket is received at operation 604. Such an incident ticket is often filed by a user (but may also be automatically generated by a program) in response to a process failure manifested by the malperformance of a process or process activity. As used herein, malperformance of a process or process activity may include failure to perform the process or activity, as well as incorrect performance of the process or activity. Thus, for example, a client of a freight forwarding service provided by the enterprise system 140 may file an incident ticket, at operation 604, when the FFS application 148 (
The incident ticket may be analyzed, at operation 608, to identify a problematic data item associated with the process failure. Such analysis may be performed by a support analyst, who may identify that the process failure (manifested, for example, in the non-transmittal of a shipping notice) is caused by a data problem comprising an incorrect or missing data item. For example, the support analyst may identify that the failure to send the shipping notice, which is the subject of the incident ticket, was caused by failure of an account reference lookup by the FFS application 148 in the GRDS 152 (both of
The support analyst may thereafter enhance the incident ticket with one or more descriptors at operation 612 to generate a data problem report upon which automated root cause analysis may be based. The support analyst may, for example, associate with the data problem report an entity instance identifier that indicates a particular entity associated with the problematic data item, in this example being “Customer X.” An attribute identifier may further be included in the data problem report to indicate a particular attribute of which the problematic data item is an instance. Thus, in the present example, the data problem report may include an attribute identifier such as “customer.attr_account_ref.” The support analyst may yet further attach to the data problem report a failure type identifier to identify an associated type of data problem. In the current example, the failure type identifier may be “Missing Value.” The data problem report may also include an activity descriptor identifying a particular process activity in which the problematic data item resides or should have resided. The data problem report of the present example may thus include an activity identifier or descriptor indicating “shipment notification”.
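The descriptors listed above can be pictured as fields of a structured record. The following sketch is illustrative: the class name `DataProblemReport`, its field names, and the ticket identifier `"T-0001"` are hypothetical, while the descriptor values are those of the present example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProblemReport:
    """Hypothetical shape of an enhanced incident ticket."""
    ticket_id: str          # identifier of the underlying incident ticket (assumed)
    entity_instance: str    # entity instance identifier
    attribute: str          # attribute identifier
    failure_type: str       # failure type identifier
    activity: str           # activity identifier or descriptor


report = DataProblemReport(
    ticket_id="T-0001",  # hypothetical value
    entity_instance="Customer X",
    attribute="customer.attr_account_ref",
    failure_type="Missing Value",
    activity="shipment notification",
)
```

Holding the descriptors in named fields, rather than in free text, is what allows later steps to compare reports with one another and with dependency information.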
The enhancements to the incident ticket, at operation 612, to generate the data problem report, which includes the various descriptors, may be performed by the support analyst by means of a GUI generated by the GUI module 200 and/or the issue report module 204 (both of
Enhancement of the incident ticket may, however, include the provision of a suggested value for the problematic data item. Thus, for example, in instances where the data problem is caused by an incorrect value for a particular instance of an attribute, and where the support analyst (optionally, with guidance from the user) is aware of the correct value, the support analyst may enhance the ticket, at operation 612, by entering the correct value in the data problem report. Such a suggested value may be used in correction of the data problem, for example in the manual fixing of the problematic data item, at operation 640, as is described in greater detail below.
The issue report module 204 may enforce enhancement of the incident ticket with at least one descriptor by not allowing closing and lodging of the data problem report without user selection of at least one descriptor. In some embodiments, completion of the data problem report may be dependent on at least one mandatory descriptor, for example requiring the selection of an attribute identifier.
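Such enforcement might be expressed as a validation step run before the report can be closed. The sketch below assumes a dictionary of descriptors and a tuple `MANDATORY_DESCRIPTORS`; both names, and the error messages, are hypothetical.

```python
# Assumption: the attribute identifier is the only mandatory descriptor.
MANDATORY_DESCRIPTORS = ("attribute",)


def validation_errors(descriptors):
    """Return reasons why the report may not yet be closed (empty if none)."""
    # Only descriptors with a non-empty value count as selected.
    selected = {name: value for name, value in descriptors.items() if value}
    errors = []
    if not selected:
        errors.append("at least one descriptor must be selected")
    for name in MANDATORY_DESCRIPTORS:
        if name not in selected:
            errors.append("mandatory descriptor missing: " + name)
    return errors


def can_close(descriptors):
    """The report may be closed and lodged only when validation passes."""
    return not validation_errors(descriptors)
```

Under these assumptions, `can_close({"attribute": "customer.attr_account_ref"})` is true, while a report with no descriptors, or with only a failure type, is rejected.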
After the data problem report, also referred to herein as the enhanced incident ticket, is closed, the data issue query module 209 may automatically interrogate the data issue repository 350 to identify similar earlier data problems, at decision operation 616. To this end, the data issue query module 209 may compare the descriptors included in the data problem report to data item descriptors 358 in the data issue repository 350 to identify potentially matching incident records 352. If a similar earlier incident is identified, at decision operation 616, the method 600 proceeds to decision operation 636, as described further below.
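The comparison of descriptors can be sketched as a simple lookup. The description does not specify a matching rule, so the exact-equality rule on three descriptors used here is an assumption, as are the record layout, the function name `find_similar`, and the sample identifiers.

```python
# Assumption: two incidents are "similar" when these descriptors are equal.
MATCH_KEYS = ("attribute", "failure_type", "activity")


def find_similar(report, repository, keys=MATCH_KEYS):
    """Return stored incident records whose descriptors match the report."""
    return [
        record
        for record in repository
        if all(record["descriptors"].get(k) == report.get(k) for k in keys)
    ]


# Hypothetical repository content.
repository = [
    {
        "incident_id": "INC-1",
        "descriptors": {
            "attribute": "customer.attr_account_ref",
            "failure_type": "Missing Value",
            "activity": "shipment notification",
        },
    },
    {
        "incident_id": "INC-2",
        "descriptors": {
            "attribute": "customer.attr_phone",
            "failure_type": "Incorrect Value",
            "activity": "invoicing",
        },
    },
]

new_report = {
    "entity_instance": "Customer X",
    "attribute": "customer.attr_account_ref",
    "failure_type": "Missing Value",
    "activity": "shipment notification",
}
matches = find_similar(new_report, repository)
```

The entity instance is deliberately left out of the assumed match keys, so that an earlier incident for a different customer with the same attribute, failure type, and activity would still be found.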
If, however, no similar earlier incidents are identified at decision operation 616, automated root cause analysis is performed at operation 620. In this example, automated root cause analysis comprises automated identification of potential causes of the data problem. Identification of potential causes of the data problem may comprise identification of a set or listing of processes, process activities, and/or process elements which contribute to providing the problematic data item to the relevant datastore. In the present example, the root cause analysis provides a listing limited to potential problematic activities, being process activities that contribute to the providing of the problematic data item for shipment notifications that are data dependent on data items in GRDS 152. In other examples, however, the root cause analysis results may also include process elements, such as datastores, human resource components, IT hardware components, IT software components, and the like.
The root cause analysis may comprise extracting from the data flow dependency information 313 a listing of the process activities associated with the particular entity and attribute indicated by the data problem report. For the example data problem report, which reports a data problem owing to the absence from the GRDS 152 of the entity attribute “customer.attr_account_ref”, the extracted listing may therefore include an updating activity performed by the updating application 168, and a synchronizing activity performed by the synchronizing application 164. It will be seen that the data flow dependency information 313 includes not only process elements and/or activities which contribute directly to the provisioning of the relevant datastore, but also process elements and/or activities which contribute indirectly to providing the associated data item to the datastore. For example, the customer master datastore 162 and the synchronizing application 164, together with its associated synchronizing activity, do not directly deliver an account reference attribute to the GRDS 152, but contribute indirectly thereto by their involvement in the provision of the account reference attribute to the CRM database 160 by synchronization between the CRM database 160 and the customer master datastore 162.
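The extraction of direct and indirect contributors can be sketched as a walk backwards along data flows. The mapping `INFLOWS` below is a hypothetical encoding of data flow dependency information; in particular, the assumption that the updating activity carries the attribute from the CRM database 160 into the GRDS 152 is made for illustration only.

```python
ATTRIBUTE = "customer.attr_account_ref"

# Hypothetical encoding: (datastore, attribute) -> list of
# (contributing activity, source datastore) pairs.
INFLOWS = {
    ("GRDS 152", ATTRIBUTE): [("updating activity", "CRM database 160")],
    ("CRM database 160", ATTRIBUTE): [
        ("synchronizing activity", "customer master datastore 162")
    ],
}


def contributing_activities(datastore, attribute):
    """List activities that feed the attribute into the datastore,
    directly or through upstream datastores, nearest first."""
    found = []
    seen = set()
    stack = [datastore]
    while stack:
        store = stack.pop()
        if store in seen:
            continue  # guard against cyclic synchronization paths
        seen.add(store)
        for activity, source in INFLOWS.get((store, attribute), []):
            if activity not in found:
                found.append(activity)
            stack.append(source)
    return found
```

For the assumed flows, `contributing_activities("GRDS 152", ATTRIBUTE)` lists the updating activity first, as the direct contributor, followed by the synchronizing activity, which contributes only indirectly.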
The results of the root cause analysis are thereafter automatically assessed, at decision operation 624, to determine whether or not the listing of potential problematic activities includes more than one activity. If the activity count equals one, then the problematic process activity is recorded, at operation 632. If, however, the activity count is greater than one, the support analyst may analyze the data problem report and the RCA results, at operation 628, to identify and record a particular one of the process activities included in the RCA results which is the cause of the data problem, i.e., which is the problematic activity. In cases where none of the potential problematic activities suggested in the RCA results is the actual root cause of the data problem, the support analyst may analyze whether or not any of the existing processes or activities need to be enhanced, and whether or not new processes and/or activities need to be introduced to avoid similar data problems in future.
It is thereafter considered, at decision operation 636, whether or not the problematic activity is automated. If the problematic activity is not an automated activity, then the problematic data item may be fixed manually, at operation 640. If, for example, the problematic process activity is a manual data input activity in which data is provided by a user directly to the GRDS 152, the problematic data item having been inputted incorrectly or having been omitted from input, then the analyst may fix the problematic data item, at operation 640, by manual input or correction of the relevant data item into the GRDS 152.
If, however, the problematic activity is determined at decision operation 636 to be an automated activity, then process logs for the problematic activity may be parsed, at operation 644, to identify an exception that may indicate malperformance of an instance of the problematic activity potentially causing the data problem, such as a database exception indicated by an exception code in the corresponding process audit log. In the present example, parsing of process logs for the updating activity performed by the updating application 168 may, for example, identify an exception with respect to the updating of the account reference attribute of the customer to whom shipping notification was not sent. The correct attribute value (e.g., the value for “attr_account_ref”) may be retrieved from the logs and the problematic activity may be re-triggered, at operation 648. Re-triggering of the problematic activity may, in some examples, be performed automatically, while, in other examples, the re-triggering of the problematic activity may be an optional operation. Such re-triggering of the problematic activity achieves correct performance of the activity whose malperformance caused the data problem, and therefore promotes the presence in the GRDS 152 of data items provided by a malperformed or failed instance of the problematic activity.
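Parsing of process logs for exceptions might look like the following sketch. The log line format, the field names, and the exception codes are entirely hypothetical, since the description does not specify the layout of the process audit log.

```python
import re

# Hypothetical audit log format:
# <timestamp> <LEVEL> activity=<a> entity=<e> attribute=<attr> value=<v> [exception=<code>]
LOG_LINE = re.compile(
    r"^(?P<ts>\S+) (?P<level>[A-Z]+) activity=(?P<activity>\S+) "
    r"entity=(?P<entity>\S+) attribute=(?P<attribute>\S+) "
    r"value=(?P<value>\S+)(?: exception=(?P<exception>\S+))?$"
)


def find_exceptions(lines, activity, attribute):
    """Return parsed entries for the activity and attribute that carry an
    exception code; lines that do not fit the assumed format are skipped."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue
        if (
            match.group("activity") == activity
            and match.group("attribute") == attribute
            and match.group("exception") is not None
        ):
            hits.append(match.groupdict())
    return hits


SAMPLE_LOG = [
    "2020-01-01T10:00:00 INFO activity=updating entity=customer_w attribute=attr_account_ref value=ACC-001",
    "2020-01-01T10:00:05 ERROR activity=updating entity=customer_x attribute=attr_account_ref value=ACC-123 exception=DB-1205",
    "2020-01-01T10:00:09 ERROR activity=synchronizing entity=customer_y attribute=attr_phone value=555-0100 exception=DB-1001",
]
hits = find_exceptions(SAMPLE_LOG, "updating", "attr_account_ref")
```

Each returned entry carries the logged attribute value alongside the exception code, which corresponds to retrieving the correct value from the logs before the activity is re-triggered.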
A remediation script, or multiple remediation scripts, may be executed, at operation 652, to fix the process failure caused by the data problem. In the present example embodiment, the remediation script may effect the transmission of the shipping notification that was not sent owing to the incorrect account reference attribute in the GRDS 152. The remediation script(s) may be generated by the analyst. Alternatively, if a matching incident record 352 in the data issue repository 350 was identified, at decision operation 616, then the remediation script 356 corresponding to the matching incident record 352 may be retrieved from the data issue repository 350 and may be executed to remedy the process failure.
The data issue repository 350 is thereafter updated, at operation 656, to reflect the reported incident. An appropriate incident record 352 may thus be lodged in the data issue repository 350, together with corresponding RCA results 354, remediation script(s) 356, and data item descriptors 358 included in the data problem report.
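The lodging of an incident record can be sketched with an in-memory store. The class names, the keying of records by incident identifier, and the script name below are assumptions; the fields correspond to the RCA results 354, remediation script(s) 356, and data item descriptors 358.

```python
from dataclasses import dataclass


@dataclass
class IncidentRecord:
    """Hypothetical shape of an incident record 352."""
    incident_id: str
    rca_results: list          # RCA results 354
    remediation_scripts: list  # remediation script(s) 356
    descriptors: dict          # data item descriptors 358


class DataIssueRepository:
    """Minimal in-memory stand-in for the data issue repository 350."""

    def __init__(self):
        self._records = {}

    def lodge(self, record):
        # Assumption: a record with an existing identifier is overwritten.
        self._records[record.incident_id] = record

    def get(self, incident_id):
        return self._records.get(incident_id)

    def __len__(self):
        return len(self._records)


repo = DataIssueRepository()
repo.lodge(
    IncidentRecord(
        incident_id="INC-3",  # hypothetical identifier
        rca_results=["updating activity"],
        remediation_scripts=["resend_shipping_notice"],  # hypothetical script name
        descriptors={
            "attribute": "customer.attr_account_ref",
            "failure_type": "Missing Value",
            "activity": "shipment notification",
        },
    )
)
```

Storing the descriptors with the record is what makes the record discoverable when a later data problem report carries matching descriptors.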
Finally, the user may be notified, at 660, that the data problem reported in the incident ticket or data problem report has been fixed.
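The branching of the method 600 described above can be summarized as a sequence of operation numbers. The sketch below is an interpretation rather than a definitive statement of the flow: the function name `plan_operations` is hypothetical, operation 652 is assumed to follow both the manual and the automated branch, the optional re-triggering at operation 648 is always included, and a candidate count other than one is routed to analyst review at operation 628.

```python
def plan_operations(similar_found, candidate_count=0, automated=True):
    """Return the operation numbers visited for one incident ticket."""
    # Receive ticket, analyze it, enhance it, query for similar incidents.
    ops = [604, 608, 612, 616]
    if not similar_found:
        # Automated root cause analysis, then assessment of its results.
        ops += [620, 624]
        # One candidate is recorded directly; otherwise the analyst decides.
        ops.append(632 if candidate_count == 1 else 628)
    ops.append(636)  # is the problematic activity automated?
    if automated:
        ops += [644, 648]  # parse process logs, re-trigger the activity
    else:
        ops.append(640)  # fix the problematic data item manually
    # Run remediation script(s), update the repository, notify the user.
    ops += [652, 656, 660]
    return ops
```

For instance, a ticket with no similar earlier incident, a single candidate activity, and a manual problematic activity visits operations 620, 624, 632, 636, and 640 before remediation, repository update, and notification.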
The example method 600 described above thus facilitates and supports the fixing or remediation not only of a particular data problem, but also of an underlying process or activity that might have caused the particular data problem or data failure. In some embodiments, a data issue may be automatically fixed. The identification of a root cause of the data problem is facilitated by automated root cause analysis, which provides the support analyst with a list of potential problematic activities. Integration between the EDM 340 and the issue report module 204 promotes the reporting of data problems in a structured form that enforces terminological consistency.
The example computer system 700 includes a processor 702 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), or both), a main memory 704 and a static memory 706, which communicate with each other via a bus 708. The computer system 700 may further include a video display unit 710 (e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT)). The computer system 700 also includes an alphanumeric input device 712 (e.g., a keyboard), a cursor control device 714 (e.g., a mouse), a disk drive unit 716, a signal generation device 718 (e.g., a speaker) and a network interface device 720.
The disk drive unit 716 includes a machine-readable medium 722 on which is stored one or more sets of instructions 724 (e.g., software) embodying any one or more of the methodologies or functions described herein. The software or instructions 724 may also reside, completely or at least partially, within the main memory 704 and/or within the processor 702 during execution thereof by the computer system 700, the main memory 704 and the processor 702 also constituting machine-readable media.
The instructions 724 may further be transmitted or received over a network 726 via the network interface device 720.
While the machine-readable medium 722 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions 724. The term “machine-readable medium” shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies described herein. The term “machine-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical and magnetic media, and carrier wave signals.
Thus, a method and system to perform analysis of a process supported by a process system have been described. Although the system and method have been described with reference to specific example embodiments, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the method and/or system. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.
The Abstract of the Disclosure is provided to comply with 37 C.F.R. §1.72(b), requiring an abstract that will allow the reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing Detailed Description, it can be seen that various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment.