1. Technical Field
The invention relates to the organization and viewing of information. More particularly, the invention relates to a methodology for efficiently transforming large or complex decision trees into compact, optimized representations to ease viewing and interaction by a user.
2. Discussion of the Related Art
A decision tree is a structure composed of nodes and links in which the end nodes, also called leaf nodes, represent actions to be taken, the interior nodes represent conditions to be tested on variables, and the branches represent conjunctions of conditions that lead to actions.
A decision tree can represent a series of decisions that divide a population into subsets. The first decision divides the population into two or more segments (i.e., partitions). For each of these segments, a second decision divides the segment into smaller segments. The second decision depends on the choice for the first decision, and the third decision depends on the choice for the second decision, and so on. In other words, a decision tree can be used to segment a dataset in such a way that an action node of the tree corresponds to that segment of the dataset for which all the conditions along the branch from root to that action node are satisfied.
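By way of illustration only, the following Python sketch shows how the conditions along a branch route a single record to an action node; the class and function names (Leaf, Node, evaluate) and the record layout are assumptions made for this example and are not part of the invention.

```python
# Illustrative only: interior nodes test a variable, leaf nodes hold actions,
# and the branch whose conditions a record satisfies selects its action.

class Leaf:
    def __init__(self, action):
        self.action = action

class Node:
    def __init__(self, variable, branches):
        # branches: (predicate, subtree) pairs; the predicates on a node are
        # assumed mutually exclusive and exhaustive over the variable's values.
        self.variable = variable
        self.branches = branches

def evaluate(tree, record):
    """Follow the branch whose condition the record satisfies, until a leaf."""
    while isinstance(tree, Node):
        value = record[tree.variable]
        tree = next(sub for pred, sub in tree.branches if pred(value))
    return tree.action

# A record falls into the segment defined by all conditions along its branch.
tree = Node("Income", [
    (lambda v: v < 30000, Leaf("decline")),
    (lambda v: v >= 30000, Node("Assets", [
        (lambda v: v < 10000, Leaf("standard card")),
        (lambda v: v >= 10000, Leaf("gold card")),
    ])),
])
print(evaluate(tree, {"Income": 50000, "Assets": 20000}))  # -> gold card
```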
A leveled decision tree is one where the variables appearing along each branch are always in the same order. A decision tree is said to be ‘read once’ when no variable appears twice along any branch.
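Continuing the same illustrative sketch, and reusing the Leaf and Node classes defined above, both properties can be checked mechanically; the subsequence test below assumes the longest branch tests every variable.

```python
def branch_orders(tree, prefix=()):
    """Yield the tuple of variables tested along each root-to-leaf branch."""
    if isinstance(tree, Leaf):
        yield prefix
    else:
        for _, sub in tree.branches:
            yield from branch_orders(sub, prefix + (tree.variable,))

def is_read_once(tree):
    # No variable appears twice along any branch.
    return all(len(set(order)) == len(order) for order in branch_orders(tree))

def is_leveled(tree):
    # Variables appear in the same relative order along every branch; this
    # simple check assumes the longest branch tests every variable.
    orders = list(branch_orders(tree))
    reference = max(orders, key=len)
    def is_subsequence(order):
        it = iter(reference)
        return all(v in it for v in order)  # membership consumes the iterator
    return all(is_subsequence(order) for order in orders)

print(is_read_once(tree), is_leveled(tree))  # -> True True
```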
Depending on the decision process being modeled, a decision tree can be extremely complex, having many variables, many values for those variables, and many outcomes depending on the various combinations of the variables and their associated values.
A sample leveled decision tree is shown in the accompanying drawings.
Information presented in the form of a decision tree becomes difficult to comprehend and visualize when the tree is large. This invention relates to a method for finding an optimal ordering of the variables of the decision tree, and using that ordering to convert the tree to an optimal Directed Acyclic Graph, i.e., a “DAG,” such that the DAG represents the same information found in the tree but with the smallest possible number of nodes compared to any other ordering of variables.
A DAG is a Directed Graph with no cycles or loops. A Directed Graph is a set of nodes and a set of directed edges, also known as arcs or links, connecting the nodes. The edges have arrows indicating directionality of the edge.
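As a standard illustration (not a procedure taken from the invention), a directed graph can be represented as an adjacency map, and acyclicity verified with Kahn's topological-sort algorithm:

```python
from collections import deque

def is_dag(adjacency):
    """Kahn's algorithm: a directed graph is acyclic iff a topological
    ordering consumes every node. adjacency maps node -> list of successors."""
    indegree = {n: 0 for n in adjacency}
    for successors in adjacency.values():
        for s in successors:
            indegree[s] = indegree.get(s, 0) + 1
    queue = deque(n for n, d in indegree.items() if d == 0)
    visited = 0
    while queue:
        n = queue.popleft()
        visited += 1
        for s in adjacency.get(n, []):
            indegree[s] -= 1
            if indegree[s] == 0:
                queue.append(s)
    return visited == len(indegree)

print(is_dag({"a": ["b", "c"], "b": ["c"], "c": []}))  # True: no cycle
print(is_dag({"a": ["b"], "b": ["a"]}))                # False: a <-> b cycle
```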
Tree representations are comprehensible ‘knowledge structures’ when they are small, but become more and more incomprehensible as they grow in size. Comprehensibility of the knowledge structure is a critical issue since, ultimately, humans must work with and maintain such structures.
The reason why trees often become large is the repeated occurrence of identical subsets of conditions interspersed throughout the tree. This phenomenon is called “sub-tree replication.”
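Sub-tree replication can be detected mechanically by assigning every subtree a canonical, hashable key so that identical subtrees collide; the tuple encoding of trees below is an assumption made for this illustration.

```python
# Illustrative only: a leaf is an action string; an interior node is
# (variable, ((condition_label, subtree), ...)).

from collections import Counter

def subtree_key(tree, counts):
    """Return a canonical key for the subtree and count every key seen."""
    if isinstance(tree, str):
        key = ("leaf", tree)
    else:
        variable, branches = tree
        key = (variable, tuple((c, subtree_key(s, counts)) for c, s in branches))
    counts[key] += 1
    return key

# The identical "Income" sub-tree appears under both Job branches.
tree = ("Job", (
    ("salaried", ("Income", (("<30k", "decline"), (">=30k", "accept")))),
    ("self-employed", ("Income", (("<30k", "decline"), (">=30k", "accept")))),
))
counts = Counter()
subtree_key(tree, counts)
replicated = [k for k, n in counts.items() if n > 1 and k[0] != "leaf"]
print(len(replicated))  # -> 1 replicated interior sub-tree
```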
Others have attempted to resolve the problems associated with sub-tree replication. Ron Kohavi, in his research paper, “Bottom-up Induction of Oblivious Read-Once Decision Graphs,” published in the “European Conference on Machine Learning,” 1994, introduced a new representation for a decision tree, called the “Oblivious read-once Decision Graph,” i.e., the “OODG.” He also described a method to convert a tree representation to the OODG representation. However, Kohavi's method chooses the ordering of variables in an ad hoc manner, which fails to ensure that the resulting representation will have the least number of nodes.
Brian R. Gaines, in his research paper, “Transforming Rules and Trees into Comprehensible Knowledge Structures,” suggests an alternative representation for general decision trees called the “Exception-based Directed Acyclic Graph,” i.e., an “EDAG.” An “exception” is a clause attached to a condition such that, if the condition is not satisfied, the clause is taken as the conclusion for the decision tree. Gaines, however, also fails to address the issue of variable ordering.
Steven J. Friedman and Kenneth J. Supowit, in their research paper, “Finding the Optimal Variable Ordering for Binary Decision Diagrams,” published in “IEEE Transactions on Computers,” Vol. 39, No. 5, May 1990, discuss a method for finding the optimal variable ordering where the variables of the decision tree are restricted to two values, namely true or false. The representation thus restricts the segmentation of a branch into at most two branches by testing whether a given condition is true or false. This method is intended for use in the design of electronic circuits, where the outcome of a “decision” is binary. It cannot be applied directly to a general decision tree, whose variables cannot be restricted to the binary values of true and false.
Given the limitations of the prior art, there exists a need for a method to create a representation of complex decision trees that is both comprehensible to a human user and computable in a practically feasible amount of time.
The invention comprises a method for transforming a large decision tree into a more compact representation to ease use and interaction by a human user. More particularly, the invention comprises a method for transforming an input decision tree into an optimal compact representation by computing a particular ordering of variables in the decision tree that first leads to a Directed Acyclic Graph, or “DAG,” with a minimum number of nodes. The method then converts the DAG into a comprehensible exception-based DAG, or “EDAG,” with exactly one exception. The optimized EDAG presents the same information and knowledge structure provided in the original input decision tree, but in a more compact representation.
The method of the invention includes several novel features, which are set forth in the detailed description below.
The method of the invention is described via an illustrative example dealing with a decision tree used to determine whether to issue a credit card to an applicant and, if so, what type of credit card to issue.
The exemplary input decision tree is shown in the accompanying drawings.
The following detailed description of the invention follows the steps of the method shown in the accompanying flow diagram.
Although shown in relation to a decision process for determining the type of credit card to issue to an applicant, the present invention is applicable to creating optimal compact representations of other types of tree structures used for other purposes. For example, the present invention can be used to optimize or analyze, among others: dynamic decision trees for predicting outcomes; trees for determining optimal paths associated with a project plan consisting of various predetermined procedures; trees for assessing the true and proximate causes of a particular chain of historical events; and other knowledge structures which may lend themselves to initial representation in a tree structure.
As shown in the accompanying flow diagram, the method of the invention proceeds according to the following steps.
Step 1: Convert the input decision tree to a collection of independent Decision Chains having the same length: This entire Step 1 corresponds to Box 2 in the accompanying flowchart.
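The precise construction follows the flowchart; purely as a rough sketch of what flattening a tree into equal-length, read-once decision chains can look like, the Python code below pads each branch with an always-true “ANY” condition for any variable the branch never tests. The encoding and the ANY marker are assumptions, not the invention's notation.

```python
# Illustrative only: a leaf is an action string; an interior node is
# (variable, ((condition_label, subtree), ...)).

ANY = "ANY"  # assumed pad marker: an always-true condition

def to_chains(tree, variables):
    """Return one (conditions-by-variable, action) pair per root-to-leaf
    branch, so every chain tests every variable exactly once."""
    chains = []
    def walk(node, conds):
        if isinstance(node, str):  # leaf: record the completed chain
            chains.append(({v: conds.get(v, ANY) for v in variables}, node))
        else:
            var, branches = node
            for label, sub in branches:
                walk(sub, dict(conds, **{var: label}))
    walk(tree, {})
    return chains

tree = ("Job", (
    ("salaried", ("Income", (("<30k", "decline"), (">=30k", "accept")))),
    ("unemployed", "decline"),
))
for conds, action in to_chains(tree, ["Job", "Income", "Assets"]):
    print(conds, "->", action)
```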
The method of the invention also “normalizes” all the conditions of each variable level used in the decision chains so that every condition is as simple and as unambiguous as possible. The “normalization” procedure may significantly increase the number of decision chains, which in turn increases memory requirements and computational complexity. The normalization procedure, which preserves mutual exclusivity, is described in more detail below. An additional embodiment of the invention, described later, provides an alternative method that minimizes memory requirements and computational complexity.
An example of the resulting normalized decision chains is shown in the accompanying drawings.
The conditions for a variable level are “normalized,” i.e., split so that mutual exclusivity is preserved; a rough illustration follows.
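For numeric conditions, one plausible normalization, offered only as an illustration (the half-open interval encoding and the function name are assumptions), splits each condition at every breakpoint used at that variable level, yielding mutually exclusive atomic intervals:

```python
def normalize_intervals(conditions):
    """conditions: list of (lo, hi) half-open intervals [lo, hi).
    Returns a dict mapping each original interval to the atomic,
    mutually exclusive pieces that exactly cover it."""
    # Every endpoint used at this level becomes a breakpoint.
    points = sorted({p for lo, hi in conditions for p in (lo, hi)})
    atoms = list(zip(points, points[1:]))
    return {
        (lo, hi): [(a, b) for a, b in atoms if lo <= a and b <= hi]
        for lo, hi in conditions
    }

# Overlapping conditions such as "Income in [0, 40k)" and "Income in
# [30k, 60k)" are split into disjoint atoms.
print(normalize_intervals([(0, 40), (30, 60)]))
# {(0, 40): [(0, 30), (30, 40)], (30, 60): [(30, 40), (40, 60)]}
```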
Step 2: Construct an ordered power set: This Step 2 corresponds to Box 3 in the accompanying flowchart.
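Assuming S denotes the power set of the set of variables, ordered by cardinality so that smaller subsets are available before the supersets built from them, a minimal Python sketch is:

```python
from itertools import combinations

def ordered_power_set(variables):
    """Non-empty subsets of the variable set, smallest cardinality first."""
    return [frozenset(c)
            for k in range(1, len(variables) + 1)
            for c in combinations(variables, k)]

for element in ordered_power_set(["Income", "Job", "Assets"]):
    print(sorted(element))
# singletons first, then the pairs, then the full three-variable set
```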
Step 3: Compute an Optimal Model for each element from S having cardinality of one: This Step 3 is shown as Boxes 6 and 7 in the accompanying flowchart.
This newly created structure is the Optimal Model for element C, having a cardinality of one, which is placed into memory storage for further use. The method then computes the Optimal Model for all other elements of cardinality one, and stores each result in memory. For the example case, optimal models for {Income}, {Job}, and {Assets} are each placed into memory storage.
Description of the Merging Procedures
Redundant nodes of variable v are merged to keep the resulting graph structure as compact as possible, according to two primary merging procedures.
Merging also takes place in the decision chains above nodes of variable v. Multiple occurrences of exactly identical decision chains, i.e., the upper, still-unmerged parts, are removed, and outgoing arcs are drawn to all of the children of the removed parts.
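As an illustration of this kind of merging, nodes at a level that test the same variable can be merged whenever their outgoing condition-to-child maps are identical; this is the standard reduction used in decision graphs and binary decision diagrams, offered here as a sketch rather than as the invention's exact procedure.

```python
def merge_level(nodes):
    """nodes: list of (variable, {condition: child_id}) tuples for one level.
    Returns (unique_nodes, remap), where remap sends old index -> new index."""
    unique, remap, seen = [], {}, {}
    for i, (var, arcs) in enumerate(nodes):
        # Two nodes are redundant when they test the same variable and route
        # every condition to the same child.
        key = (var, tuple(sorted(arcs.items())))
        if key not in seen:
            seen[key] = len(unique)
            unique.append((var, arcs))
        remap[i] = seen[key]
    return unique, remap

level = [("Assets", {"<10k": 0, ">=10k": 1}),
         ("Assets", {"<10k": 0, ">=10k": 1}),   # duplicate of node 0
         ("Assets", {"<10k": 1, ">=10k": 1})]
unique, remap = merge_level(level)
print(len(unique), remap)  # -> 2 {0: 0, 1: 0, 2: 1}
```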
Step 4: Compute an Optimal Model for all element sets with cardinality greater than one: This Step 4 is shown as Boxes 9 to 17 in the accompanying flowchart.
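Steps 2 through 4 together have the shape of a dynamic program over the ordered power set: the best model for a set C is obtained by trying each variable v in C as the level placed just above the stored best model for C − {v}, keeping the candidate with the fewest nodes. The helpers in the sketch below are illustrative stubs, not the invention's actual move-and-merge construction.

```python
from itertools import combinations

def best_models(variables, place_above, node_count):
    """best[C] holds the stored Optimal Model for each subset C of variables."""
    best = {frozenset(): None}
    for k in range(1, len(variables) + 1):
        for combo in combinations(variables, k):
            subset = frozenset(combo)
            best[subset] = min(
                (place_above(best[subset - {v}], v) for v in subset),
                key=node_count)
    return best[frozenset(variables)]

# Trivial stubs so the skeleton runs: a "model" here is just the chosen
# ordering; a real implementation would build the merged DAG and count nodes.
def place_above(model, v):
    return (v,) + (model or ())

print(best_models(["Income", "Job", "Assets"], place_above, len))
```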
Step 5: Select Optimal Model and restore single root node: Step 5 is shown as Box 18 in the accompanying flowchart.
Step 6: Select the Global Exception and create the EDAG: This Step 6 is shown as Box 19 in the accompanying flowchart.
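One plausible way to choose the global exception, offered only as an illustration (the invention's exact selection rule follows its flowchart), is to take the action that terminates the most paths through the DAG, declare it the single exception, and let every pruned path fall through to it:

```python
from collections import Counter

def pick_exception(leaf_actions):
    """leaf_actions: the action reached at the end of each path through the
    DAG. The most common action becomes the EDAG's single exception; leaves
    carrying it can then be pruned, with those paths defaulting to it."""
    (action, _), = Counter(leaf_actions).most_common(1)
    return action

print(pick_exception(["decline", "gold card", "decline",
                      "standard card", "decline"]))  # -> decline
```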
Consequently, the original input decision tree has been converted to an optimal, compact representation in the form of an optimized and minimized exception-based directed acyclic graph where the entire knowledge structure of the original input decision tree has been retained. Having thus developed an optimized, compact representation, a user can more easily comprehend and interact with the knowledge structure.
Low Memory Method: The method described in Steps 1 through 6 uses a decision chain set structure such that the merged variables are represented as a DAG and un-merged variable levels are represented as decision chains. Because of the normalization procedure described earlier, the decision chains can have many redundant nodes and edges, leading to high memory requirements. Consequently, in an additional embodiment of the invention, a low memory method is described that represents un-merged variable levels as a DAG instead of as decision chains. This method is similar to the method described above in Steps 1 through 6, except for the following changes:
In Step 1, the input decision tree is converted directly to a directed acyclic graph, i.e., a “DAG,” rather than to a collection of decision chains.
The DAG structure reduces memory consumption, but adds the additional requirement of performing “local” normalization and simplification steps during each level manipulation. In Step 3, when moving nodes of variable v to just above the leaf nodes, and in Step 4, when moving nodes of variable v to just above the nodes of the variables in C′(v), the movement must take place one level at a time, rather than directly across multiple levels as in the first embodiment of the invention described above. Local normalization and simplification are performed as part of each such single-level move.
When applied in practice, the invention significantly simplifies the graphic structure displayed to a user, as illustrated by the example in the accompanying drawings.
As the size and complexity of the input decision tree increases, the impact of applying the invention is even more dramatic, as the accompanying drawings further illustrate.
As can be seen from the above description and the tangible results associated with the invention, the invention provides a useful and computationally tractable means by which complex and confusing decision trees can be transformed into a smaller, simpler, and more comprehensible format, easing understanding and use of the knowledge represented by the initial input decision tree.
Although the invention is described herein with reference to the preferred embodiment, one skilled in the art will readily appreciate that other applications may be substituted for those set forth herein without departing from the spirit and scope of the present invention. Accordingly, the invention should only be limited by the Claims included below.
The present application claims priority to related provisional patent application Ser. No. 60/823,618, filed Aug. 25, 2006, entitled “Strategy and Ruleset Visualization,” by inventors Stuart Crawford, Gaurav Chhaparwal, Kashyap KBR, Navin Doshi, and Sergei Tolmanov, which is not admitted to be prior art with respect to the present invention by its mention in the background. That application is incorporated herein in its entirety by this reference thereto.
Related U.S. Application Data

Number | Date | Country
---|---|---
60/823,618 | Aug 2006 | US