The present invention relates generally to risk management and, particularly, to a method and system for distributed elicitation and aggregation of risk information.
Organizations are increasingly aware of the need to manage risks and uncertainties affecting enterprises through risk management solutions. Modern business processes and systems are very complex and constantly growing; even seemingly local events may have global impacts. Additionally, organizations face an increase in government requirements and regulations to demonstrate a willingness to manage these risks responsibly.
Risks can be assessed and quantified using a risk model. Risk assessment utilizes estimates of the likelihood of risk events and risk impacts, in the form of probabilistic statements. For example, risk assessment could be used to analyze the likelihood of a hacker gaining access to an organization's computer network. Risk assessment could be further used to estimate the cost associated with such a security breach. The probabilistic statements are obtained from data analysis when historical data is available, and from expert opinion when historical data is unavailable or deemed not relevant.
Elicitation of expert opinion is very time consuming. Further, current methods for eliciting expert opinion are static, in that they do not allow the questions posed to the expert to adapt to the expert's responses. The elicitation process may also require significant guidance, such as face-to-face workshops or phone discussions led by a risk analyst. Further, the elicitation process is not collaborative: multiple experts may provide inconsistent or conflicting opinions regarding the assessment of a particular risk, which must be resolved by the analyst or decision maker to make the best use of the information gathered.
Thus, there is a need in the art for an improved method and system that elicits expert opinion, adapts based upon the information provided by the expert, and is capable of managing inconsistent or conflicting expert opinions.
A method and system for eliciting and aggregating risk information from one or several experts is disclosed. In one embodiment, the method comprises selecting a risk network, the risk network comprising one or more risk nodes having associated risk information; assigning a role to each risk node, said role indicating a type of user to evaluate the risk node; generating a customized survey to elicit risk information for a risk node based upon the role and the user, wherein an order of questions in the customized survey presented to the user is determined by an ordering criterion; publishing the customized survey to the user; collecting risk information for the risk node from the user's answers to the customized survey; and populating the risk nodes based on the collected risk information.
In one embodiment, the system comprises a processor operable to specify a risk model, the risk model comprising one or more risk nodes, assign a role to each risk node, assign a user to evaluate each risk node, generate a customized survey based upon the role and the user, publish the customized survey to the user, collect results of the customized survey from the user, and generate a risk analysis report based on the collected results.
A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform above-method steps for identifying and quantifying a risk is also provided.
Further features as well as the structure and operation of various embodiments are described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements.
A method and system for eliciting and aggregating risk information from an expert is disclosed. In one embodiment, the method comprises selecting a risk model, the risk model comprising one or more risk nodes, assigning a role to each risk node, assigning a user to evaluate each risk node, generating a customized survey based upon the role and the user, publishing the customized survey to the user, collecting results of the customized survey from the user, and generating a risk analysis report based on the collected results.
The following example applies the method and system for eliciting and aggregating risk information in the context of quantifying customer satisfaction. In the following examples, “elicitation of risk information” and “eliciting risk information” from an expert are achieved by questioning the expert about a risk event. An expert is a person who has special skill, knowledge, or experience in a particular field. The expert supplies a probability that the risk event will or will not occur based upon his experience, expertise, and personal knowledge. The supplied probabilities (termed parameters of the conditional probability table) characterize the risk variables, which are also known in the art as risk nodes. The terms risk variable and risk node are often used interchangeably, and for the present application it is understood that they are one and the same. There are occasions when a subset of risk variables taken from a larger set of risk variables is more important than the entire set in the evaluation of a risk event. Such a subset is sometimes known as the variables-of-interest.
Referring to
Referring back to
Referring back to
The present invention utilizes a novel system and method for eliciting information from an expert. The method efficiently gathers information from experts about various risk nodes, and the elicited information can be used in a Bayesian network, such as the one shown in
In one embodiment, the surveys are generated in accordance with the “Triple-S Survey Standard.” A complete description of the Triple-S Survey Standard is maintained at http://www.triple-s.org. The Triple-S survey standard facilitates the transfer of data and metadata between survey software packages. The standard defines two text files, a “definition file” and a “data file” that describe the survey data. The “definition file” includes general information about the survey and descriptions of the survey variables, such as, for example, variable metadata. The definition file is coded in XML syntax according to rules provided by the associated Triple-S XML Document Type Definition (DTD). The data file contains the actual case data for the survey.
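By way of illustration, a minimal Triple-S-style definition file could be produced as sketched below. The element names (sss, survey, record, variable) mirror the general structure described above, but the exact schema is governed by the Triple-S DTD, so this sketch should be treated as illustrative rather than normative.

```python
import xml.etree.ElementTree as ET

def build_definition_file(survey_title, variables):
    """Build a minimal Triple-S-style XML definition file.

    `variables` is a list of (name, label, type) tuples; the element
    names used here are illustrative of the standard's structure, not a
    verbatim rendering of the Triple-S DTD.
    """
    root = ET.Element("sss", version="2.0")
    survey = ET.SubElement(root, "survey")
    ET.SubElement(survey, "title").text = survey_title
    record = ET.SubElement(survey, "record", ident="A")
    for ident, (name, label, vtype) in enumerate(variables, start=1):
        var = ET.SubElement(record, "variable", ident=str(ident), type=vtype)
        ET.SubElement(var, "name").text = name
        ET.SubElement(var, "label").text = label
    return ET.tostring(root, encoding="unicode")

xml_text = build_definition_file(
    "Network breach likelihood",
    [("Q1", "Probability of an external security breach", "quantity")],
)
```

The case data itself would then travel in the companion data file, keyed to the variable identifiers declared here.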
The survey 400 comprises three different questions 402, 404 and 406 presented via a GUI (e.g., a web browser) on the client computing device. A “slider” 408 is manipulated by the expert answering the survey questions to select a probability value between 0 and 1. The expert is also able to select a “confidence level” 410 associated with the probability value via the GUI. Based on the expert's responses, a “probability wheel” 412 is generated that indicates to the expert the probability distribution of the selected probability in relation to the other possible answer choices. Other question formats (such as deterministic questions) and other screen snapshots summarizing the answers provided so far can be included in the survey whenever relevant.
In one embodiment, survey questions are presented to an expert in a predefined order, such as a sequential order. In another embodiment, the most important survey questions are presented to the expert at the beginning of the survey. In yet another embodiment, the survey questions are presented to the expert in a dynamic order, i.e., the response to one survey question influences which survey question will be presented next. The order of the questions is important because an expert may not answer every single question in a survey. The order is also important because certain questions may be more pertinent to evaluating a risk node; the variables addressed by the most pertinent questions are sometimes known as the “variables-of-interest.”
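The dynamic question order described above can be sketched as a branching rule that maps the expert's previous answer to the next question; the question identifiers and branching rules below are hypothetical.

```python
def next_question(question_id, answer, branching):
    """Return the id of the next question to present, or None when the
    survey ends; the branching table maps a question id to a rule that
    inspects the expert's answer."""
    rule = branching.get(question_id)
    return rule(answer) if rule else None

# Hypothetical branching: a high breach probability triggers impact
# questions, a low one triggers questions about mitigating controls.
branching = {
    "q_breach": lambda p: "q_impact" if p > 0.5 else "q_controls",
}
```

For example, an answer of 0.8 to “q_breach” would route the expert to “q_impact,” while an answer of 0.2 would route to “q_controls.”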
The expert opinions may be elicited through the use of a survey presented via a GUI, such as the one shown in
In an alternate embodiment, only a portion of the Bayesian network 200 is under evaluation. Therefore, only the variables relevant to the variables-of-interest need to be evaluated. However, an expert may fail to provide a value for each variable. In this instance, such an incomplete elicitation may result in a joint distribution that only approximates the actual joint distribution over the variables-of-interest.
In one embodiment, the selection of the risk node for elicitation is based upon a criterion that captures some measure of informativeness of a partially elicited risk network, as specified in the following equation:
i* = argmin_{i∉K} E[D(P_Z, Q_Z^{i∪K})]  (1)
where P_Z is the true joint distribution over the network variables Z, Q_Z^{i∪K} is the approximate joint distribution obtained once node i is elicited in addition to the already elicited nodes in K, and D is a distance metric between distributions.
Given a set of already elicited nodes K, the selected node i* is the node that minimizes the expected distance between the true joint distribution and the approximate joint distribution obtained once that node is elicited. Several distance metrics can be used for D, such as the Euclidean distance, the total variation distance, and the Kullback-Leibler divergence.
In one embodiment, the node i* selected is the node with the shortest Euclidean distance between the joint distributions P_Z and Q_Z^{i∪K}. In another embodiment, the square of the Euclidean distance is used to select the node i*. After elicitation of the node i*, the set K is updated to include i*.
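The greedy selection of equation (1) with the Euclidean distance can be sketched as follows; the `approx_joint` model of the partially elicited network is a caller-supplied assumption of this sketch, not part of the disclosed system.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two joint distributions expressed as
    probability vectors over the same ordering of outcomes."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def select_next_node(candidates, elicited, approx_joint, true_joint):
    """Pick the non-elicited node whose elicitation brings the
    approximate joint closest to the true joint (greedy rule of
    equation (1) under the Euclidean distance)."""
    remaining = [i for i in candidates if i not in elicited]
    return min(remaining,
               key=lambda i: euclidean(true_joint, approx_joint(elicited | {i})))

# Toy model: eliciting node "a" recovers the true joint exactly.
true_joint = [0.7, 0.3]
def approx_joint(K):
    return [0.7, 0.3] if "a" in K else [0.5, 0.5]

best = select_next_node({"a", "b"}, set(), approx_joint, true_joint)
# best == "a"
```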
In one embodiment, if the nodes in K have already been elicited, then the next node i is selected for elicitation according to equation 2:
where
Selecting a risk node for elicitation according to equation 2 may be computationally intensive for a large Bayesian network. However, Bayesian networks composed of risk nodes elicited from experts tend to be small.
In another embodiment, the node i is selected for elicitation according to the number of states “s” of the non-elicited nodes remaining in the Bayesian network, as given by equation 3:
where
Selecting nodes by their proximity to the joint distribution of the Bayesian network makes a tradeoff between the “spread” in the selected node and the “spread” from the combination of non-elicited nodes. Whether the node with the minimum or the maximum number of states is selected is determined by the degree to which selecting that node will reduce the spread.
In one embodiment, a Monte Carlo simulation is used to select the next node i to elicit. Given a Bayesian network with known variables Z, states and structure, variables-of-interest Y, prior distributions on all parameters, and all parameters for the CPTs of the elicited nodes in set K, the next node i to elicit (i*∉K) from the expert is selected according to the following steps:
7. Repeat steps 1 to 6 to find the next node to elicit.
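Since the steps above describe repeated sampling, the overall loop can be sketched as a Monte Carlo estimate of the expected distance for each candidate node; the parameter sampler `sample_joint` stands in for the prior distributions and is an assumption of this sketch.

```python
import math
import random

def mc_select(candidates, elicited, sample_joint, true_joint,
              n_samples=200, seed=0):
    """For each non-elicited node i, sample the still-unknown parameters
    n_samples times, form the joint that would result if i were elicited,
    and average its Euclidean distance to the true joint; return the node
    with the smallest average (the Monte Carlo analogue of equation (1))."""
    rng = random.Random(seed)
    def expected_distance(i):
        total = 0.0
        for _ in range(n_samples):
            q = sample_joint(elicited | {i}, rng)
            total += math.sqrt(sum((p - qq) ** 2
                                   for p, qq in zip(true_joint, q)))
        return total / n_samples
    remaining = sorted(set(candidates) - elicited)
    return min(remaining, key=expected_distance)

# Toy sampler: eliciting "a" leaves little residual uncertainty.
true_joint = [0.6, 0.4]
def sample_joint(K, rng):
    spread = 0.05 if "a" in K else 0.3
    p = min(max(0.6 + rng.uniform(-spread, spread), 0.0), 1.0)
    return [p, 1.0 - p]

chosen = mc_select(["a", "b"], set(), sample_joint, true_joint)
# chosen == "a"
```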
The outputs of each risk node in the network are capable of functioning as inputs to another risk node. In one embodiment, the output of each risk model is a probability distribution over the occurrence of a risk event for each risk node. The form of the outputs is consistent across the composite risk model network, and each risk node that relies on a parent risk node is consistent with the parent risk node. This consistency allows the outputs of different risk models to be combined.
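The composition of consistent outputs described above can be sketched for a pair of discrete risk nodes, where the parent's output distribution is combined with the child's conditional probability table; the node names and table values are hypothetical.

```python
def child_marginal(parent_dist, cpt):
    """Combine a parent risk node's output distribution with the child's
    conditional probability table to obtain the child's marginal:
    P(child=c) = sum over p of P(child=c | parent=p) * P(parent=p)."""
    states = next(iter(cpt.values())).keys()
    return {
        c: sum(cpt[p][c] * pp for p, pp in parent_dist.items())
        for c in states
    }

# Hypothetical example: a "breach" node feeds a "loss" node.
breach = {"yes": 0.2, "no": 0.8}
cpt = {
    "yes": {"high": 0.7, "low": 0.3},
    "no":  {"high": 0.1, "low": 0.9},
}
loss = child_marginal(breach, cpt)
# loss["high"] = 0.7*0.2 + 0.1*0.8 = 0.22
```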
Expert opinion for each risk node can be elicited and aggregated by the present invention as described below and further shown in
At step 606, an elicitation request is communicated to the risk network 501, e.g., via HTTP, initiating evaluation of all the risk nodes 500. At step 607, the elicitation request is passed on to an “elicitation administration” module 616. The “elicitation administration” module 616 generates surveys 618 that are published to the experts. A “survey generator” 612 combines a “question and answer template” 608 with “questions” 610 to generate a survey 618. In one embodiment, the survey complies with the “Triple-S Survey Standard” discussed above, and the “questions” 610 are stored in a Triple-S compliant data file. An example of a survey that may be generated in accordance with the present invention is provided in
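The combination of a question-and-answer template 608 with questions 610 by the survey generator 612 can be sketched as simple template filling; the plain-text template below is a hypothetical stand-in for a Triple-S survey.

```python
# Hypothetical plain-text question-and-answer template.
TEMPLATE = "Q{num}: {text}\n  Probability (0-1): ____  Confidence: ____"

def generate_survey(questions):
    """Combine the question-and-answer template with a list of question
    texts to produce a publishable survey."""
    return "\n".join(TEMPLATE.format(num=i, text=q)
                     for i, q in enumerate(questions, start=1))

survey = generate_survey([
    "What is the probability of a hacker gaining access to the network?",
])
```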
The surveys 618 are published to one or more experts at step 619. The “elicitation survey module” 624 publishes the appropriate survey to the expert assigned to a particular risk node at step 604. For example, the “elicitation survey module” 624 presents a computer security expert with a survey designed to evaluate risk node 5001. In one embodiment, the “elicitation survey module” 624 publishes the survey to the expert via e-mail: the expert receives an e-mail containing a hyperlink or uniform resource locator (URL) that links to an online version of the survey. The survey 618 may also be communicated to the expert in other ways. Once the online survey 618 is completed by the expert, the “survey results” 630 are communicated to an “elicitation aggregator” 628 at step 625.
The “elicitation aggregator” module 628 aggregates the expert opinions for each individual risk node. In one embodiment, the expert opinions are aggregated on the basis of the confidence levels assigned to each expert opinion. For example, an expert who is “highly confident” in his opinion will receive a greater weight for his opinion in the aggregation. In another embodiment, a greater weight is given to experts who are deemed influential through peer review. As one example, experts may be asked to rate the other experts in their field, and the most highly rated expert would be considered the most influential expert. As another example, the expert with the greatest frequency of citations in journal articles related to his field could be considered the most influential expert. Peer review may also be based upon an expert's answers in other assessment exercises. In one embodiment, the expert's timeliness of answers and credibility may be used as evaluation criteria.
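Confidence-based weighting of expert opinions can be sketched as a weighted average, assuming confidence is expressed on a 0-to-1 scale (an assumption of this sketch; peer-review or citation-based weights would simply replace the confidence values).

```python
def aggregate_opinions(opinions):
    """Aggregate (probability, confidence) pairs from several experts as
    a confidence-weighted average; confidence is assumed to lie in (0, 1]."""
    total_weight = sum(w for _, w in opinions)
    return sum(p * w for p, w in opinions) / total_weight

# A highly confident expert (weight 1.0) dominates a tentative one (0.25).
agg = aggregate_opinions([(0.9, 1.0), (0.3, 0.25)])
# agg is approximately 0.78
```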
Referring back to
In one embodiment, the “elicitation aggregator” module 628 checks for consistency among the collected expert opinions. If the answers, i.e., the probabilities supplied by the experts, differ beyond a threshold value, then additional information may be collected from the experts. For example, if two different experts were polled on the probability of rain occurring the next day, and one expert answered with a 0% probability of rain occurring while the other answered with a 100% probability, the “elicitation aggregator” module 628 would note that these answers are inconsistent with each other. In one embodiment, answers are deemed consistent when the probabilities supplied by the experts do not deviate from one another beyond a threshold value. The threshold value may be set by the user requesting the risk analysis or by the risk network builder.
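The threshold-based consistency check can be sketched as a comparison of the spread of the experts' probabilities against the configured threshold; the default threshold below is illustrative only.

```python
def inconsistent(probabilities, threshold=0.5):
    """Flag a set of expert probabilities for follow-up when their spread
    exceeds the threshold; the threshold value is set by the user
    requesting the risk analysis or by the risk network builder."""
    return max(probabilities) - min(probabilities) > threshold

# The rain example from the text: 0% versus 100% is flagged.
flagged = inconsistent([0.0, 1.0])
# flagged == True
```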
In another embodiment, the “elicitation aggregator” module 628 checks for possible inconsistencies among the experts' answers. For example, if the senior and junior computer security experts evaluate risk node 5001 and their opinions conflict on the probability of a security breach, the “elicitation aggregator” module 628 may further query the experts providing the expert opinion for a confidence level on their opinion. In another embodiment, the “elicitation aggregator” 628 may also determine whether a minimum number of questions were answered in a survey by the expert to properly quantify the risk for a particular risk node. If enough questions were not answered by the expert, then the “elicitation aggregator” 628 may query the expert with additional questions, or elicit risk information from additional experts. In one embodiment, the “elicitation aggregator” 628 requires at least one high-confidence answer from one of the experts to be able to assign a value to the risk node. In another embodiment, regardless of expert confidence, the experts' answers are aggregated following a linear pool or a logarithmic pool with all experts having the same weight, and additional experts are queried only if no answer is available for a given risk node.
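The equal-weight linear and logarithmic pools mentioned above can be sketched for a binary risk event as follows; the logarithmic pool is implemented as the renormalized geometric mean of the experts' probabilities.

```python
import math

def linear_pool(probs):
    """Equal-weight linear opinion pool: the arithmetic mean."""
    return sum(probs) / len(probs)

def logarithmic_pool(probs):
    """Equal-weight logarithmic pool for a binary event: the geometric
    means of p and 1-p, renormalized to sum to one."""
    g1 = math.prod(probs) ** (1.0 / len(probs))
    g0 = math.prod(1.0 - p for p in probs) ** (1.0 / len(probs))
    return g1 / (g1 + g0)

# Three experts with equal weight:
lin = linear_pool([0.6, 0.7, 0.8])        # approximately 0.7
log = logarithmic_pool([0.6, 0.7, 0.8])   # slightly above 0.7
```

The logarithmic pool tends to be more decisive than the linear pool: a single near-zero opinion pulls it strongly toward zero, which is one reason the choice of pool matters when expert confidence is ignored.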
Referring again to
In one embodiment, the “elicitation analyzer” module 634 checks for consistency among the collected expert opinions. For example, consistency between answers may be measured by the difference between two experts' answers on some output of the risk network (for instance, specific inference queries). If that difference is above a given threshold, the experts may be asked to confirm or revise their answers. The threshold value may be set by the risk network builder.
At step 638, a risk quantification analysis is provided to the user. The risk quantification analysis is based upon evaluation of the entire risk network. A result or risk analysis report is provided to the user by the risk “elicitation analyzer” 634. The provided result may be a discrete value, a table of probability distributions, or any other output that allows the user to evaluate risk.
In one embodiment, the memory 706 of the client computer stores the “risk network tooling” 656 which is used to create the risk network and assign roles to the individual risk nodes in the risk network.
The risk server 650 comprises a processor (CPU) 712, support circuits 714, and a memory 716. The CPU 712 is interconnected to the memory 716 via the support circuits 714. The support circuits 714 include cache, power supplies, clocks, input/output interface circuitry, and the like.
The memory 716 may include random access memory, read only memory, removable disk memory, flash memory, and various combinations of these types of memory. The memory 716 is sometimes referred to as a main memory and may in part be used as cache memory. The memory 716 stores an operating system (OS) 718, an “elicitation administration” module 616, an “elicitation survey module” 624, an aggregation module 628, an “elicitation analyzer” module 634 and survey results 632. The risk server 650 is a general computing device that becomes a specific computing device when the processor 712 operates any one of the modules 616, 624, 628 and 634.
The function of each of the modules 616, 624, 628 and 634 is discussed above with respect to the method for eliciting and aggregating expert opinion. The “elicitation administration” module 616 comprises a “survey generator” 612, “survey questions” 610 and “question and answer templates” 608. The “survey generator” 612 is responsible for generating surveys 618 from “questions” 610 and “question and answer templates” 608. The “elicitation administration” module 616 provides surveys to the “elicitation survey module” 624.
The “elicitation survey module” 624 publishes surveys 618 to experts, tracks responses to surveys, and collects “survey results” 630. The “survey results” 630 are then passed to the aggregation module 628.
The aggregation module 628 aggregates the “survey results” 630 collected from the different experts for each risk node. The aggregation module 628 may use a weighting system or a set of aggregation rules to place a greater importance on a particular expert's opinion. The “aggregated survey results” 632 are passed to the “elicitation analyzer” module 634. The “elicitation aggregator” module 628 checks for possible inconsistent answers among the experts assigned to evaluate a risk node, and if necessary elicits additional information from the experts.
The “elicitation analyzer” module 634 checks the “aggregated survey results” 632 for possible inconsistent answers among the experts assigned to evaluate a risk node, and if necessary elicits additional information from the experts.
The risk server 650 and the various modules 616, 624, 628 and 634 provide for distributed risk elicitation by publishing surveys and collecting survey answers from experts assigned to a risk node. The risk server 650 also aggregates and analyzes the collected survey answers, thus allowing risk analysis on a distributed basis.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may operate entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which operate via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which operate on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
Referring now to
While the present invention has been particularly shown and described with respect to preferred embodiments thereof, it will be understood by those skilled in the art that the foregoing and other changes in forms and details may be made without departing from the spirit and scope of the present invention. It is therefore intended that the present invention not be limited to the exact forms and details described and illustrated, but fall within the scope of the appended claims.