This disclosure relates generally to decision making, and particularly to posing a minimal number of elicitation questions to a decision maker in order to solve a decision problem.
An influence diagram is a directed and acyclic graph for use in decision making under uncertainty. The influence diagram includes chance or random variables which specify the uncertain decision environment, decision variables which specify possible decisions to be made in a corresponding decision problem, and a utility function which represents preferences of a decision maker. Each chance variable is associated with a parent set (possibly empty) in a graph which together with that chance variable define a conditional probability distribution. A product of conditional probability distributions defines a joint probability distribution over all possible outcomes in the decision problem. Each decision variable has a parent set (possibly empty) including one or more variables whose values will be known at the time of making of corresponding decisions and may affect directly the decisions. The decision variables are typically assumed to be temporally ordered. A strategy or policy for an influence diagram is a list of decision rules including one rule for each decision variable specifying which decision to make for each value instantiation of the variables in its parent set. Solving an influence diagram is to find an optimal policy that maximizes an expected utility, i.e., achieves a goal of the decision maker.
A probabilistic decision tree (PDT) refers to a model of a decision problem that represents all choices, outcomes and paths that a decision maker may have. A main objective of building and solving a PDT is to find choices that satisfy the decision maker's situation and preference.
As an example, currently, in a health care domain, a patient considers attributes of treatment options before deciding which treatment option to undertake. Attributes include, but are not limited to: pain, disability, side effects, resulting state after a treatment, cost of a treatment, life expectancy. In this multi-attribute decision making, the patient is required to fully elicit his/her preference among all the attributes. The full elicitation from the patient can be time-consuming and cognitively difficult.
There is provided a method, a system and a computer program product for supporting a decision making process. The system receives a decision model from a decision maker. The decision model is used for determining a solution to a decision problem based on attributes and uncertainties of the decision problem. The decision problem includes information about a plurality of outcome vectors that represents all possible outcomes and the uncertainties associated with the decision problem. The decision model does not include any preference information of the attributes and the outcome vectors. The system determines whether the received decision model can be solved without receiving any preference information from the decision maker. The system receives partially specified preference information from the decision maker if the received decision model cannot be solved without any preference information. The system solves the decision model with the partially specified preference information. The system recommends, based on the solution, one or more decisions to the decision maker.
In order to receive the partially specified preference information from the decision maker, the system identifies N pairs of outcome vectors to present to the decision maker for preference assessment. The system presents N elicitation questions, based on the identified N pairs of outcome vectors, to the decision maker. An elicitation question asks the decision maker which outcome vector the decision maker prefers between two outcome vectors. The decision maker provides answers to the N elicitation questions. The decision maker's answers to the N elicitation questions represent the partially specified preference information.
These and other objects, features and advantages of the present invention will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings, in which:
This disclosure describes a system, method, and computer program product for minimizing trade-off information (i.e., preference information) to be elicited from decision makers in multi-attribute decision problems with uncertainty. An example of a multi-attribute decision problem includes, but is not limited to: deciding a treatment option in a healthcare industry, where a patient needs to decide (with a help of a healthcare practitioner) which treatment option (s)he prefers while considering several attributes: pain, disability, side effects, a resulting state after applying a treatment option, death, cost, etc. In one embodiment, full elicitation of preferences over all the attributes is avoided. Full elicitation of preference over all the attributes refers to associating a cardinal value to each possible outcome vector that appears in a decision problem.
A decision problem includes information about a plurality of outcome vectors that represent all possible outcomes of the decision problem and further includes information about uncertainties associated with the decision problem. An outcome vector lists all the possible outcomes of a decision problem, e.g., pain, disability, side effects, a resulting state after applying a treatment option, death, cost, etc. Uncertainties are represented by random variables described by states and probability distributions on those states. In one embodiment, a partial elicitation of preference information is used, e.g., by partially specifying preferences over some or all of the attributes and/or by minimizing the elicitation of preference information from the decision maker. In one embodiment, minimizing the preference information elicited from a decision maker can also reduce the difficulty of questions (e.g., by using outcome vectors and comparison queries). Difficulty of a question represents one or more of: user-friendliness of the question, how difficult the question is to answer, how accurately the decision maker can provide an answer to the question, etc. A comparison query refers to a query for determining a preference between two attributes.
At steps 110-115 in
Returning to
At steps 135-145, if the received decision model cannot be solved without any preference information, the computing system receives partially specified preference information from the decision maker. In one embodiment, the partially specified preference information is a minimum amount of preference information. For example, the minimum amount of preference information may specify that a user prefers longer life expectancy over a cheaper cost of a treatment option. In order to receive the minimum amount of preference information from the decision maker, the computing system identifies, based on the decision problem, N pairs of outcome vectors to present to the decision maker for preference assessment. The computing system presents N elicitation questions, based on the identified N pairs of outcome vectors, to the decision maker. An elicitation question asks the decision maker which outcome vector the decision maker prefers between two outcome vectors. The decision maker provides answers to the N elicitation questions. The decision maker's answers to the N elicitation questions represent the partially specified preference information.
In order to identify N pairs of outcome vectors, the computing system selects the N pairs of outcome vectors that maximize an expected information gain, e.g., by using programmed method steps of an algorithm “Algorithm 3” described in greater detail herein below. The expected information gain can be measured, for example, as a reduction in an expected number of undominated strategies. The computing system selects the N pairs of outcome vectors that balance user-friendliness and the expected information gain. User-friendliness can be modeled based on, including but not limited to: how quickly a user can answer a question, whether a question is qualitative or quantitative, whether an outcome vector is one of the existing outcome vectors of the decision problem, whether a user wants to skip answering a question, etc. At step 145, the computing system elicits the decision maker's preference based on the identified N pairs of outcome vectors. The decision maker provides the partially specified preference information by indicating, for each pair of outcome vectors presented, which outcome vector(s) (s)he prefers. At step 150, the computing system solves, e.g., by using programmed method steps of an algorithm “Algorithm 2” described in greater detail herein below, the decision model with the received partially specified preference information (i.e., the identified N pairs of outcome vectors).
The computing system repeats steps 115, 120, 135, 145 and 150 until a set of recommended decisions becomes manageable by the decision maker so that the number of the recommended decisions is small enough or so that the recommended decisions are contrasted enough for the decision maker to choose.
By running steps 110, 115, 120, 135, 145 and 150, the computing system facilitates decision making for a multi-attribute decision problem without implicit extrapolation or inference from preference information provided by a user. Preference information elicited from a user is derived from comparison queries rather than quantitative queries that ask for precise trade-off values. The computing system allows for a “cannot compare” option, i.e., an option that allows the decision maker not to answer one or several of the comparison questions presented. Method steps in
In one embodiment, in order to solve a decision problem, the computing system optimizes a number of comparison queries to be asked to a user (i.e., a decision maker, etc.). The computing system elicits, based on the optimized number of comparison queries, a minimal amount of preference information. The computing system solves, based on the elicited minimal amount of preference information, the decision problem within a pre-determined time period. The computing system outputs, as the solution of the decision problem, a set of recommended actions, which includes a list of alternatives that are not dominated, e.g., radiotherapy, active surveillance, etc. A user makes a decision based on the set of recommended actions.
In one embodiment, the decision maker inputs a maximum number of decision strategies that (s)he can handle, i.e., from which (s)he is able to choose a preferred strategy without specifying more information about his/her preferences with respect to the multiple attributes. The decision maker also inputs an upper bound on the predetermined amount of time. The decision maker's preferences are elicited in the form of comparison questions, i.e., questions asking a user which of two outcome vectors (s)he prefers.
At step 210, the computing system enumerates all pairs (u, v) of outcome vectors in L. The computing system assumes that the decision maker prefers an outcome vector u over an outcome vector v and then sets a score of the pair of outcome vectors (u, v) to be the number of undominated strategies obtained by solving the decision problem under the assumption that the outcome vector u is preferred to the outcome vector v. At step 220, the computing system additionally assumes that the decision maker prefers the outcome vector v over the outcome vector u and then sets a score of the pair of outcome vectors (v, u) to be the number of undominated strategies obtained by solving the decision problem under the additional assumption that the outcome vector v is preferred to outcome vector u. At steps 225-230, the computing system sorts all the pairs of outcome vectors according to an ascending order of their scores. The computing system stores the sorted pairs of outcome vectors in a set called “CandidatePairs.”
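The scoring and sorting of steps 210-230 can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: "solving the decision problem" is replaced by filtering a hypothetical table of precomputed expected-utility vectors, one per strategy; each score is computed under a single assumed preference; and the names `dominates`, `count_undominated`, `rank_pairs` and the example data are invented for illustration.

```python
from fractions import Fraction
from itertools import permutations

def dominates(a, b, pref):
    """True if vector a is at least as good as b, given one assumed preference
    pref = (u, v): some q >= 0 satisfies a - b >= q * (u - v) componentwise."""
    u, v = pref
    lo, hi = Fraction(0), None            # feasible interval [lo, hi] for q
    for ai, bi, ui, vi in zip(a, b, u, v):
        diff, d = Fraction(ai - bi), Fraction(ui - vi)
        if d == 0:
            if diff < 0:                  # no q can repair this component
                return False
        elif d > 0:
            hi = diff / d if hi is None else min(hi, diff / d)
        else:
            lo = max(lo, diff / d)
    return hi is None or lo <= hi

def count_undominated(strategies, pref):
    """Number of strategies not strictly dominated under the assumed preference."""
    vecs = list(strategies.values())
    return sum(
        not any(dominates(t, s, pref) and not dominates(s, t, pref) for t in vecs)
        for s in vecs
    )

def rank_pairs(outcome_vectors, strategies):
    """Score every ordered pair (u, v) and sort in ascending order of score."""
    scored = [((u, v), count_undominated(strategies, (u, v)))
              for u, v in permutations(outcome_vectors, 2)]
    return sorted(scored, key=lambda item: item[1])

# Hypothetical two-attribute data; larger values are better in both attributes.
strategies = {"surgery": (10, 2), "radiotherapy": (6, 5), "surveillance": (3, 6)}
candidate_pairs = rank_pairs([(1, 0), (0, 1)], strategies)
```

In this toy data, assuming (1, 0) is preferred to (0, 1) leaves one undominated strategy, while the reverse assumption leaves all three, so the first pair is ranked ahead of the second.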
At steps 235-240, for each pair of outcome vectors (u, v) in CandidatePairs, if the decision maker confirms that the outcome vector u is preferred over the outcome vector v, the computing system adds the pair of outcome vectors (u, v) to Cone. Preferences represented by the pairs of outcome vectors in Cone are consistent if and only if, for every pair (u, v) in Cone, it is not possible to infer that vector v is preferred to vector u. If consistency is violated, the computing system notifies the decision maker, e.g., by sending a text, email, etc., and moves to the next candidate pair (step 235) or generates a new pair (step 260). At step 245, the computing system solves the decision problem with the current Cone. At step 250, the decision maker evaluates whether a solution from step 245 satisfies the decision maker. If so, the computing system stops processing. Otherwise, the computing system proceeds to step 255.
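The construction of Cone in steps 235-240 can be sketched as below. This is only an outline under assumptions: `confirms` and `is_consistent` are hypothetical callbacks standing in for the decision maker's answer and for the consistency check, and the stand-in check used in the example (rejecting only a directly reversed pair) is far weaker than the inference-based check described above.

```python
def build_cone(candidate_pairs, confirms, is_consistent):
    """Add each confirmed pair to Cone if the result stays consistent.

    Returns (cone, rejected); rejected pairs are the ones about which the
    decision maker would be notified.
    """
    cone, rejected = [], []
    for u, v in candidate_pairs:
        if not confirms(u, v):            # decision maker did not confirm u over v
            continue
        if is_consistent(cone + [(u, v)]):
            cone.append((u, v))
        else:
            rejected.append((u, v))       # consistency violated: skip this pair
    return cone, rejected

# Hypothetical stand-ins, for illustration only.
def always_confirms(u, v):
    return True

def no_reversed_pair(cone):
    return not any((v, u) in cone for u, v in cone)

cone, rejected = build_cone(
    [((1, 0), (0, 1)), ((0, 1), (1, 0))], always_confirms, no_reversed_pair
)
```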
At steps 255-260, if the solution from step 245 does not satisfy the decision maker, the computing system generates outcome vectors u and v on a boundary of the current Cone, e.g., by generating a linear combination of vectors already in the current Cone. At step 265, if the decision maker confirms that an outcome vector u is preferred to an outcome vector v (or that an outcome vector v is preferred to an outcome vector u), the computing system adds the pair of outcome vectors (u, v) (or (v, u)) to Cone. At step 270, the computing system solves the decision problem with the current Cone. At step 250, the decision maker evaluates whether a solution from step 270 satisfies the decision maker. If so, the computing system stops processing. Otherwise, the computing system returns to step 255.
In one exemplary embodiment, a relatively small number of pairs of outcome vectors (e.g., 10-12 pairs of outcome vectors for a decision problem that includes ten attributes) is enough to reduce a set of undominated strategies to a few strategies (in most cases, being a singleton) so that the decision maker can actually choose a decision strategy.
In one embodiment,
Algorithm 1 method 400 performs at 420 processing of each bucket, top-down from the last to the first, invoking a variable elimination procedure that computes new probability (denoted by φ) and utility (denoted by ψ) components which are then placed in corresponding lower buckets. For a chance variable Yl (step 430 of Algorithm 1), the φ-component is generated by multiplying all probability components in a corresponding bucket and eliminating Yl by summation (at 440 of Algorithm 1). The ψ-component is computed as an average utility in that bucket, normalized by the bucket's compiled φ. For a decision variable Yl (step 450 of Algorithm 1), the computing system computes the φ and ψ components in a similar manner and eliminates Yl by maximization. In this case, a product of probability components in the bucket is a constant when viewed as a function of the bucket's decision variable and therefore the compiled φ-component is a constant as well. In
In a bottom-up step, Algorithm 1 generates an optimal policy (at 460 of Algorithm 1). Buckets are processed in reversed order, from the first variable to the last. For each decision variable, a corresponding decision rule is generated by taking the argument of a maximization operator applied over the combination of probability and utility components in the respective bucket (as indicated at 470), for each combination of values assigned to variables in the bucket's scope, while recalling values assigned to earlier decisions. An optimal strategy, denoted by Δ, is then obtained by taking a union of the decision rules generated at step 460.
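The two kinds of elimination can be illustrated on a deliberately tiny, single-attribute example. The sketch below is not Algorithm 1 itself: it assumes one decision variable D with no parents and one chance variable Y with a hypothetical conditional distribution P(Y | D), so there are only two buckets, and all numbers are made up.

```python
from fractions import Fraction as F

decisions, outcomes = ["treat", "wait"], ["good", "bad"]

# Hypothetical inputs: P(Y = y | D = d) and a scalar utility u(d, y).
prob = {
    ("treat", "good"): F(4, 5), ("treat", "bad"): F(1, 5),
    ("wait", "good"): F(1, 2), ("wait", "bad"): F(1, 2),
}
utility = {
    ("treat", "good"): 8, ("treat", "bad"): 2,
    ("wait", "good"): 10, ("wait", "bad"): 0,
}

# Bucket of the chance variable Y: eliminate Y by summation.
# phi is the compiled probability component; psi is the utility component,
# i.e. the average utility normalized by phi.
phi = {d: sum(prob[d, y] for y in outcomes) for d in decisions}
psi = {d: sum(prob[d, y] * utility[d, y] for y in outcomes) / phi[d]
       for d in decisions}

# Bucket of the decision variable D: eliminate D by maximization.
best_value = max(psi.values())

# Policy generation: the decision rule is the argument of the maximization.
policy = max(decisions, key=lambda d: psi[d])
```

Here `phi` is 1 for both decisions, the expected utilities are 34/5 for "treat" and 5 for "wait", and the resulting rule selects "treat".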
The following establishes correctness of the variable elimination procedures, for example, Algorithms 1 and 2. Let $\vec{u}, \vec{v} \in \mathbb{R}^p$ be two utility vectors, each having $p \geq 1$ real-valued components, such that $\vec{u} = (u_1, \ldots, u_p)$ and $\vec{v} = (v_1, \ldots, v_p)$. These utility vectors may be generated by a utility function that represents preferences of a decision maker. Define a binary relation $\geq$ on $\mathbb{R}^p$ (called Pareto ordering) by $\vec{u} \geq \vec{v} \iff \forall i \in \{1, \ldots, p\},\ u_i \geq v_i$. For finite sets $U, V \subseteq \mathbb{R}^p$, write $U \succcurlyeq V$ if, for every element of $V$, some element of $U$ is at least as preferred. Define an equivalence relation $\approx$ between two finite sets $U, V$ in $\mathbb{R}^p$ by $U \approx V$ if and only if $U \succcurlyeq V$ and $V \succcurlyeq U$. Given a finite set $U \subseteq \mathbb{R}^p$, define its convex closure $C(U)$ to include every element of the form $\sum_{j=1}^{k} (q_j \times \vec{u}_j)$, where $k$ is an arbitrary natural number, each $\vec{u}_j$ is in $U$, each $q_j \geq 0$, and $\sum_{j=1}^{k} q_j = 1$. Given finite sets $U, V \subseteq \mathbb{R}^p$, define an equivalence relation $\equiv$ by $U \equiv V$ if and only if $C(U) \approx C(V)$. Therefore, two sets of multi-attribute utility vectors are considered equivalent if, for every convex combination of elements of one, there is a convex combination of elements of the other which is at least as good (with respect to the partial order on $\mathbb{R}^p$). Assume the Scale-Invariance and Independence properties hold (where $\vec{u}, \vec{v}, \vec{w}$ are arbitrary vectors in $\mathbb{R}^p$):
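[Independence]: If $\vec{u} \succcurlyeq \vec{v}$ then $\vec{u} + \vec{w} \succcurlyeq \vec{v} + \vec{w}$.
[Scale-Invariance]: If $\vec{u} \succcurlyeq \vec{v}$ and $q \in \mathbb{R}$, $q \geq 0$, then $q \times \vec{u} \succcurlyeq q \times \vec{v}$.
The following result also holds. Theorem 1: Let $\succcurlyeq$ be a partial order on $\mathbb{R}^p$ satisfying Independence and Scale-Invariance. Then, for all $q, q_1, q_2 \geq 0$ and for all finite sets $U, V, W \subseteq \mathbb{R}^p$:
$q \times (U + V) = q \times U + q \times V$; (i)
$(q_1 + q_2) \times U \equiv (q_1 \times U) + (q_2 \times U)$; (ii)
$q_1 \times (q_2 \times U) = (q_1 \times q_2) \times U$; (iii)
$\max(q \times U, q \times V) = q \times \max(U, V)$; (iv)
$\max(U + W, V + W) = \max(U, V) + W$. (v)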
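As a concrete illustration of the set operations used above, the following sketch checks properties (i) and (iv) on a small integer example. It assumes the partial order is the plain Pareto ordering and that max(U, V) denotes the undominated elements of the union of U and V; the helper names are hypothetical, and a numerical check on one example is of course not a proof.

```python
def pareto_geq(u, v):
    """Pareto ordering: u >= v in every component."""
    return all(a >= b for a, b in zip(u, v))

def set_sum(U, V):
    """U + V: all pairwise vector sums."""
    return {tuple(a + b for a, b in zip(u, v)) for u in U for v in V}

def scale(q, U):
    """q x U: every vector of U multiplied by the scalar q."""
    return {tuple(q * a for a in u) for u in U}

def maximal(U):
    """Undominated elements of U with respect to the Pareto ordering."""
    return {u for u in U if not any(w != u and pareto_geq(w, u) for w in U)}

def set_max(U, V):
    """max(U, V), taken here as the undominated elements of the union."""
    return maximal(U | V)

U = {(3, 1), (1, 3), (1, 1)}
V = {(2, 2), (0, 0)}
q = 2

property_i = scale(q, set_sum(U, V)) == set_sum(scale(q, U), scale(q, V))
property_iv = set_max(scale(q, U), scale(q, V)) == scale(q, set_max(U, V))
```

For these sets, `set_max(U, V)` keeps (3, 1), (1, 3) and (2, 2), and both checks evaluate to true.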
The following describes Algorithm 2. A decision maker may allow some trade-offs between attributes of a multi-attribute decision problem. For example, in a two-attribute situation, the decision maker may want to gain three units of the first attribute at the cost of losing one unit of the second, and hence prefer (3, −1) to (0, 0). Such tradeoffs may be elicited using a structured method, or in a more ad hoc way. Consider some set $\Theta$ of vector pairs of the form $(\vec{u}, \vec{v})$, where $\vec{u}, \vec{v} \in \mathbb{R}^p$. The set $\Theta$ may include elicited preferences of the decision maker. A binary relation $\succcurlyeq$ on $\mathbb{R}^p$ extends the set $\Theta$ if $\vec{u} \succcurlyeq \vec{v}$ for all $(\vec{u}, \vec{v}) \in \Theta$. Similarly, $\succcurlyeq$ extends the Pareto ordering if $\vec{u} \geq \vec{v} \Rightarrow \vec{u} \succcurlyeq \vec{v}$.
Consider that the decision maker has a partial order $\succcurlyeq$ over $\mathbb{R}^p$, and that the decision maker specifies a set of preferences $\Theta$. The input preferences $\Theta$ (if consistent) give rise to a relation $\succcurlyeq_\Theta$ which specifies deduced preferences. A vector pair $(\vec{u}, \vec{v})$ can be deduced from $\Theta$ if $\vec{u} \succcurlyeq \vec{v}$ holds for all partial orders $\succcurlyeq$ that extend $\Theta$, extend Pareto, and satisfy Scale-Invariance and Independence. This case is denoted $\vec{u} \succcurlyeq_\Theta \vec{v}$. This definition easily implies the following:
PROPOSITION 1: If $\Theta$ is consistent then $\succcurlyeq_\Theta$ is a partial order extending $\Theta$ and Pareto, and satisfying Scale-Invariance and Independence.
Proposition 1 shows that this dominance relation $\succcurlyeq_\Theta$ satisfies Scale-Invariance and Independence, giving the properties (Theorem 1) needed for a variable elimination algorithm, e.g., Algorithm 2, to be correct. Suppose that a decision maker has an additional preference of (50, 12) over (0, 0), and that hence $\Theta$ includes a pair ((50, 12), (0, 0)). This may then imply, for example, that (11, 12.78) is dominated with respect to $\succcurlyeq_\Theta$ by (20, 14.2), i.e., (20, 14.2) is preferred over (11, 12.78). Theorem 2 below gives a characterization of the partial order $\succcurlyeq_\Theta$. Let $W$ be some subset of $\mathbb{R}^p$. Define $C(W)$ to be the set of all vectors $\vec{u}$ such that there exist $k \geq 0$, non-negative real scalars $q_1, \ldots, q_k$, and vectors $\vec{w}_1, \ldots, \vec{w}_k \in W$ with $\vec{u} \geq \sum_{i=1}^{k} q_i \vec{w}_i$, where $\geq$ is the weak Pareto relation (and an empty summation is taken to be equal to 0). $C(W)$ is the set of vectors that the decision maker prefers over some (finite) positive linear combination of elements of $W$.
Theorem 2: Let $\Theta$ be a consistent set of pairs of vectors in $\mathbb{R}^p$. Then $\vec{u} \succcurlyeq_\Theta \vec{v}$ if and only if $\vec{u} - \vec{v} \in C(\{\vec{u}' - \vec{v}' : (\vec{u}', \vec{v}') \in \Theta\})$.
Let $\Theta$ be a finite set of input preferences $\{(\vec{u}'_i, \vec{v}'_i) : i = 1, \ldots, k\}$. Theorem 2 shows that, to perform a dominance test (i.e., a preference test) $\vec{u} \succcurlyeq_\Theta \vec{v}$, it is sufficient to check whether there exist, for $i = 1, \ldots, k$, non-negative real scalars $q_i$ such that $\vec{u} - \vec{v} \geq \sum_{i=1}^{k} q_i (\vec{u}'_i - \vec{v}'_i)$.
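As a worked illustration of this test, take the two-attribute trade-off mentioned earlier, in which (3, −1) is preferred to (0, 0), as the only input preference. The test vectors below are hypothetical, and $\succcurlyeq_\Theta$ is used here as notation for the deduced preference relation.

```latex
\begin{align*}
\Theta &= \{((3,-1),(0,0))\}, \qquad \vec{u}'_1 - \vec{v}'_1 = (3,-1). \\[4pt]
\text{Test } (5,1) \succcurlyeq_\Theta (1,2):\quad
  (5,1) - (1,2) = (4,-1) &\geq q_1\,(3,-1) \\
  &\iff 4 \geq 3q_1 \ \text{ and } \ -1 \geq -q_1 \\
  &\iff 1 \leq q_1 \leq \tfrac{4}{3}
  \quad\text{(feasible, e.g. } q_1 = 1\text{)}. \\[4pt]
\text{Test } (2,1) \succcurlyeq_\Theta (1,2):\quad
  (2,1) - (1,2) = (1,-1) &\geq q_1\,(3,-1) \\
  &\iff q_1 \leq \tfrac{1}{3} \ \text{ and } \ q_1 \geq 1
  \quad\text{(infeasible)}.
\end{align*}
```

So the first preference can be deduced from the single trade-off, even though neither vector Pareto-dominates the other, while the second cannot.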
Alternatively, we can use the fact that the dominance test corresponds to checking whether $\vec{u} - \vec{v}$ is in the set generated by $\{\vec{u}'_i - \vec{v}'_i : i = 1, \ldots, k\}$ plus the $p$ unit vectors in $\mathbb{R}^p$. Therefore, Algorithm 2, which exploits tradeoffs, is obtained from Algorithm 1 by replacing the $+$ and $\max$ operators with $+_\Theta$ and $\max_\Theta$, respectively, where $\max_\Theta(U, V) = \max_\Theta(U \cup V)$, $U +_\Theta V = \max_\Theta(U + V)$, and $\max_\Theta(U)$ is the set of undominated (i.e., preferred) elements of a finite set $U \subseteq \mathbb{R}^p$ with respect to $\succcurlyeq_\Theta$.
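The dominance test and the two replaced operators can be sketched in a few lines. The sketch is an assumption-laden illustration rather than Algorithm 2: the existence of suitable non-negative scalars is decided by Fourier-Motzkin elimination with exact fractions (a technique chosen here only to keep the example dependency-free, and practical only for small inputs), and the function names and example data are hypothetical.

```python
from fractions import Fraction
from itertools import product

def feasible(rows):
    """Fourier-Motzkin elimination. Each row (coeffs, bound) means
    sum(coeffs[j] * q[j]) <= bound. True if some real q satisfies all rows."""
    n = len(rows[0][0])
    for j in range(n):
        pos = [r for r in rows if r[0][j] > 0]
        neg = [r for r in rows if r[0][j] < 0]
        rows = [r for r in rows if r[0][j] == 0]
        for (ap, bp), (an, bn) in product(pos, neg):
            sp, sn = 1 / ap[j], 1 / -an[j]      # positive scale factors
            rows.append(([sp * x + sn * y for x, y in zip(ap, an)],
                         sp * bp + sn * bn))    # coefficient of q[j] cancels
    return all(bound >= 0 for _, bound in rows)

def dominates(u, v, theta):
    """Deduced preference test: is there q_i >= 0 with
    u - v >= sum_i q_i * (u_i - v_i) componentwise, for the pairs in theta?"""
    diffs = [[Fraction(a) - Fraction(b) for a, b in zip(ui, vi)]
             for ui, vi in theta]
    k = len(diffs)
    rows = [([d[c] for d in diffs], Fraction(u[c]) - Fraction(v[c]))
            for c in range(len(u))]
    rows += [([Fraction(-1 if j == i else 0) for j in range(k)], Fraction(0))
             for i in range(k)]                 # -q_i <= 0
    return feasible(rows)

def max_theta(U, theta):
    """Undominated elements of U with respect to the deduced preferences."""
    return {u for u in U
            if not any(dominates(w, u, theta) and not dominates(u, w, theta)
                       for w in U)}

def plus_theta(U, V, theta):
    """U +_Theta V: undominated elements of all pairwise sums."""
    sums = {tuple(a + b for a, b in zip(u, v)) for u in U for v in V}
    return max_theta(sums, theta)

theta = [((3, -1), (0, 0))]                     # hypothetical elicited trade-off
U = {(7, 0), (0, 2), (1, 1)}
pareto_only = max_theta(U, [])                  # no trade-offs: Pareto ordering
with_tradeoff = max_theta(U, theta)
```

With no elicited preferences all three vectors are undominated; with the single trade-off only (7, 0) remains, which illustrates how a small amount of preference information can shrink the undominated set.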
One or more of these exemplary computing systems shown in
While the invention has been particularly shown and described with respect to illustrative and preferred embodiments thereof, it will be understood by those skilled in the art that the foregoing and other changes in form and details may be made therein without departing from the spirit and scope of the invention, which should be limited only by the scope of the appended claims.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with a system, apparatus, or device running an instruction.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with a system, apparatus, or device running an instruction.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may run entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which run via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which run on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more operable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be run substantially concurrently, or the blocks may sometimes be run in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.