This invention relates to a multi-dimensional, long-term behavior, computerized decision method and system. More particularly, the invention relates to providing computer operations to perform routine decisions based on the historical performance of experts in the decision process.
There are many routine decision tasks where a person receives input data from a computer and analyzes the data to come to a decision. In some cases, these decision tasks are multi-dimensional in input and cannot easily be grouped into a relatively small number of classes.
Because the decisions cannot be so grouped, the decision process does not lend itself to the automated or expert processes used in traditional recognition or classification problems.
Further for many of these decision problems, it is difficult to measure the quality of each individual decision made by a person, but it is possible to measure the integral quality of a plurality of decisions as a whole made by the person over a period of time. This integral quality can be compared against other persons making similar decisions over a period of time and the relative expertise of each decision maker can be measured.
One example of routine decision tasks based on input data as discussed above is in retail goods allocation tasks. In such tasks a distribution or allocation expert reviews input data on a computer display screen describing a quantity of retail goods to be allocated in various quantities from warehouses to multiple retail stores selling the goods. Where there are 10 to 1000+ stores in the business and the quantity of goods to be allocated to each store varies from 0 to 100+, the number of possible allocation outcomes can easily exceed several thousand. Such a decision problem is so multi-dimensional that it does not lend itself to automated solution based on recognition and classification systems.
Further, to measure the quality of an allocation by examining a specific allocation is not meaningful. For example, if a specific set of goods such as swimsuits is allocated to certain stores and turns out not to be profitable for those stores, this result may be due to weather conditions rather than lack of experience by the allocator. On the other hand, if over an entire season all the goods allocated by this same allocator generate the highest total profit or other metric, this same person might be recognized as an expert allocator. In other words, there is no absolutely right or wrong decision for each decision problem, but there are the best (expert) and the poorest decision makers.
Another problem in computerizing routine decision tasks of the above type is that best practices in the environment of the decision problem may change over time. For example, in the allocation of retail goods, business practices may change over time because of changes to the competitive environment or changes in the goals of the business entity.
In accordance with this invention an expert decision-making method is trained to emulate expert behavior based on a history of behaviors by experts in a variety of observed situations. A history of behaviors is built up from observations of actions taken by experts in analyzing a plurality of situations. The observations are captured, and behaviors from the observations are constructed. The behaviors indicate an association between situation features and methods with parameter values for solving the situations.
A training method captures observations of behavior by experts. The observations include situation data about multiple situations and actions by the experts. The actions are associated with the situations. Subject knowledge information is loaded from the observations; the subject knowledge information has a features library, a method library and a parameters library. Behavior information is constructed from the observations and from the subject knowledge information; the behavior information includes situation features and strategies associated with the behaviors for solving the situation. A behavior profile is learned from the behaviors. The behavior profile is used in emulating the behavior of the experts during a decision-making process.
The construction of behaviors begins with extracting situation features from the situation data. Strategy information including behavior methods and parameters for solving situations is extracted from the expert actions in the observations information and from the methods library and the parameters library in the subject knowledge information. The situation features are associated with the strategies as the behavior information. The extraction of strategy information begins by comparing a situation/actions combination from the observations information with method/parameters combinations from the subject knowledge information. A method/parameters combination previously associated with the situation/actions combination is selected and provided as a strategy for solving the situation.
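The compare-and-select step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the comparison is made by replaying each candidate method/parameters combination against the situation and choosing the one whose output is closest to the observed expert action. The function names, the `equal_share` method, and the mismatch measure are hypothetical and are not taken from the description.

```python
def equal_share(situation, cap):
    """Hypothetical library method: give each store the same quantity, up to a cap."""
    per_store = min(situation["quantity"] // len(situation["stores"]), cap)
    return {store: per_store for store in situation["stores"]}


def extract_strategy_step(situation, action, candidates):
    """Select the (method, params) candidate whose replayed output
    differs least from the observed expert action (assumed criterion)."""
    def mismatch(candidate):
        method, params = candidate
        predicted = method(situation, **params)
        stores = set(predicted) | set(action)
        return sum(abs(predicted.get(s, 0) - action.get(s, 0)) for s in stores)

    method, params = min(candidates, key=mismatch)
    return method.__name__, params


# Observed situation and the expert's action on it (illustrative data only).
situation = {"quantity": 12, "stores": ["A", "B", "C"]}
action = {"A": 4, "B": 4, "C": 4}

# Method/parameters combinations drawn from the (hypothetical) libraries.
candidates = [(equal_share, {"cap": 2}), (equal_share, {"cap": 5})]

step = extract_strategy_step(situation, action, candidates)
print(step)
```

Each selected step would correspond to one strategy iteration; repeating the selection for every observed action would yield the full strategy.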
The invention may be implemented as a computer process, a computing system or as an article of manufacture such as a computer program product or computer readable media. The computer program product may be a computer storage media readable by a computer system and encoding a computer program of instructions for executing a computer process. The computer program product may also be a propagated signal on a carrier readable by a computing system and encoding a computer program of instructions for executing a computer process.
These and various other features as well as advantages, which characterize the present invention, will be apparent from a reading of the following detailed description and a review of the associated drawings.
In the present invention, expertise is derived from past experience of experts making decisions on the type of situation data being processed. The decision-making process itself is empirical, as the experts have learned what method choices applied to the situation data representing a situation provide a best plan of action to solve the situation. The choice of method to solve the situation may or may not include a choice of parameters to be used with that method. The expert, in working with situation data that has multiple dimensions as discussed herein, has become an expert heuristically, i.e. by trial and error, and has developed an extensive personal knowledge base and historical background to analyze and solve problems provided by the situation data.
For example, in merchandise allocation an allocator may spend six months or a year learning how to effectively allocate retail goods to a chain of stores. As the merchandise allocator becomes more expert, the profitable allocation of retail goods improves for the stores. Further, some merchandise allocators seem to have more talent for doing the allocations than other allocators. Accordingly, an expert emulating computer system emulating the behavior of the most talented allocators can be extremely valuable to a large retail company. Such a computing system provides expert guidance to new merchandise allocators who are not yet expert, or provides expert guidance to lower performing allocators.
In the example of a merchandise allocation system the usage cycle runs on demand when an allocator requests allocation of situation data, such as goods to be allocated to a set of stores in a chain of retail outlets. The complete loop of the training cycle in this example might run only once a week or even once a year or more. The profile, which contains allocation methods and other allocation data, is normally adjusted for long term changes in the allocation process. When a profile is first created by the training cycle, it can be expected that adjustments and new allocation observations will be frequent and the training cycle will run frequently, i.e. daily or even hourly. After the profile has been updated for several weeks, the adjustments to a solution and new observation information regarding behavior of an expert will be minimal. The profile has settled down, and the updates to the profile by the training cycle will become weekly, monthly or even quarterly on the calendar.
The usage cycle begins with retrieval operation 110 which retrieves the situation data selected by the user and provides the situation data to the expert behavior emulator 112. The expert emulator 112 analyzes the situation data and makes a solution recommendation to the user.
The user reviews the solution recommendation, and adjust/save operation 114 may or may not adjust the solution based on input from the user. After a final solution is reached and approved by the user, the solution data is saved by adjust/save operation 114 to the customer database as a solution for the situation data being analyzed. The usage cycle is described in more detail hereinafter with reference to
The training cycle begins with retrieval operation 116 which retrieves observations regarding actions taken by the expert when dealing with particular situation data. This situation data and actions information are used by the behavior construct operation 118 to construct expert behaviors. From these expert behaviors, profile create operation 120 creates the profiles for use by the expert behavior emulator 112 in the usage cycle. The training cycle is described in more detail hereinafter with reference to
Alternatively, retrieval operation 116 and behavior construct operation 118 might be performed for all users, and decisions about which users are experts might be made later at profile create operation 120. This allows for better flexibility in case of a need to change a user assignment to expert or non-expert. Because all behaviors are collected in this alternative embodiment, it is not necessary to start retrieval operation 116 and behavior construct operation 118 from the beginning when an assignment changes.
The customer site 202 also has the customer's server 210, which maintains the customer's database of situation data. A customer network 209 at the customer site interconnects the customer's server with supplier's dynamic data server 208 and the customer's workstations 212. The supplier also has an administrative laptop 214 connected to the customer network. Thus a user works at the workstations 212, and the user's work is assisted in the methodology of recommending situation solutions in accordance with the usage cycle discussed above for
Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by the computing devices in
Communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, and other wireless media. Combinations of any of the above should also be included within the scope of computer-readable media. Computer-readable media may also be referred to as computer program product.
The logical operations of the various embodiments of the present invention are implemented (1) as a sequence of computer implemented acts or program modules running on a computing device and/or (2) as interconnected machine logic circuits or circuit modules within the computing device. The implementation is a matter of choice dependent on the performance requirements of the system implementing the invention. Accordingly, the logical operations making up the embodiments of the present invention described herein may be referred to variously as operations, structural devices, acts or modules. It will be recognized by one skilled in the art that these operations, structural devices, acts and modules may be implemented in software, in firmware, in special purpose digital circuit logic, and any combination thereof without deviating from the spirit and scope of the present invention as recited within the claims attached hereto.
The drawings in
The behavior emulator module 308 then works with the situation data 306, the profile 310, and the subject knowledge information 312 to recommend a situation solution to the user at the workstation. The profile is a generalization of observed decisions made by experts, where a single decision is a sequence of methods with or without parameter values that have in the past been used by experts as solutions in observed situations. The subject knowledge includes a methods library, a parameters library and a features library. The methods library includes a set of expert behavior methods used by the experts in the past. A method includes a method name, a list of required input parameters and a method execution code, i.e. the method operations. The parameters library includes the parameter lists and limitations for the parameters in the methods; it also includes parameter calculation rules. The features library includes the features of the situation data and the feature calculation rules. The solution recommended by the behavior emulator will be solution data indicating a plan of action and remainder data indicative of situation data remaining to be processed. The sequence of methods producing the recommended solution along with parameter values for the methods is referred to herein as a strategy.
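One possible in-memory layout for the subject knowledge and a strategy is sketched below. It is a minimal illustration under stated assumptions: the class names, field names, the callable signatures and the `equal_split` method are all hypothetical and only mirror the three libraries and the method name, parameter list and execution code described above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Method:
    """A methods-library entry: name, required parameters, execution code."""
    name: str
    required_params: List[str]
    # (situation, params) -> (solution data, remainder data); assumed signature
    execute: Callable[[dict, dict], Tuple[dict, int]]


@dataclass
class SubjectKnowledge:
    """Methods, parameters and features libraries (hypothetical layout)."""
    methods: Dict[str, Method] = field(default_factory=dict)
    parameter_rules: Dict[str, Callable[[dict], float]] = field(default_factory=dict)
    feature_rules: Dict[str, Callable[[dict], float]] = field(default_factory=dict)


@dataclass
class StrategyStep:
    """One strategy iteration: a method plus its parameter values."""
    method_name: str
    params: Dict[str, float]


def equal_split(situation: dict, params: dict) -> Tuple[dict, int]:
    """Illustrative method: same quantity per store, capped by max_per_store."""
    stores = situation["stores"]
    per_store = min(situation["quantity"] // len(stores), int(params["max_per_store"]))
    solution = {store: per_store for store in stores}
    remainder = situation["quantity"] - per_store * len(stores)
    return solution, remainder


knowledge = SubjectKnowledge()
knowledge.methods["equal_split"] = Method("equal_split", ["max_per_store"], equal_split)
knowledge.feature_rules["store_count"] = lambda s: len(s["stores"])

# A strategy is an ordered sequence of steps.
strategy = [StrategyStep("equal_split", {"max_per_store": 5})]
solution, remainder = knowledge.methods["equal_split"].execute(
    {"quantity": 10, "stores": ["A", "B", "C"]}, strategy[0].params
)
print(solution, remainder)
```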
The recommended solution from and the strategy used by the behavior emulator are both displayed by display operation 316 to the user at the workstation computer screen. A typical display would identify each method used along with its parameter values, the situation solution data and remainder data at each stage or iteration of the strategy, i.e. a strategy iteration refers to the execution of one of the methods with its parameters in a sequence of methods making up a strategy. The next iteration works on the remainder data with the next method in the sequence. The strategy is completely executed when the remainder data represents a value within a predetermined range deemed acceptable for a solution to the situation.
For example in the case of merchandise allocation, the display of each iteration of the strategy would identify a method used along with the parameter values used, the solution data, i.e. the number of goods allocated to each store, and the remainder data, i.e. the number of goods unallocated. In merchandise allocation, the strategy will not be complete usually until the remainder is zero.
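The iteration over a strategy in the merchandise allocation case can be sketched as follows. This is a minimal illustration, not the described implementation: the two allocation methods, their parameters (`cap`, `order`) and the log format are assumptions made for the example. Each iteration applies one method to the current remainder and records the method, its parameter values, the solution so far and the new remainder.

```python
def equal_share(remainder, stores, cap):
    """Hypothetical method: same quantity to every store, up to a cap."""
    per_store = min(remainder // len(stores), cap)
    return {store: per_store for store in stores}


def top_up(remainder, stores, order):
    """Hypothetical method: one extra unit to stores in priority order."""
    allocation = {store: 0 for store in stores}
    for store in order[:remainder]:
        allocation[store] += 1
    return allocation


def run_strategy(quantity, stores, steps, acceptable_remainder=0):
    """Execute strategy steps until the remainder is acceptable."""
    allocation = {store: 0 for store in stores}
    remainder = quantity
    log = []  # one entry per strategy iteration, as it might be displayed
    for method, params in steps:
        if remainder <= acceptable_remainder:
            break
        step_allocation = method(remainder, stores, **params)
        for store, units in step_allocation.items():
            allocation[store] += units
        remainder -= sum(step_allocation.values())
        log.append((method.__name__, params, dict(allocation), remainder))
    return allocation, remainder, log


steps = [
    (equal_share, {"cap": 3}),
    (top_up, {"order": ["C", "A", "B"]}),
]
allocation, remainder, log = run_strategy(11, ["A", "B", "C"], steps)
for entry in log:
    print(entry)
```

With 11 units and three stores, the first iteration allocates 3 units per store and leaves a remainder of 2; the second iteration allocates the last 2 units, so the strategy completes with a remainder of zero.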
When the recommended solution is displayed, the user is given an opportunity to adjust the strategy or accept the strategy. If the strategy is accepted, the accept test detects the indication from the user, and the operational flow branches to transfer operation 320. Transfer operation 320 always transfers the recommended solution into the customer's database. Normally, transfer operation 320 does not transfer the entire strategy to the customer system, because a strategy includes behavior steps that lead to the solution. Alternatively, the transfer operation could transfer the strategy as well as the solution. In the case of merchandise allocation, the allocation for goods, i.e. the solution, would be transferred into the customer's database and the goods would be distributed accordingly.
If the user decides to adjust the strategy, the accept test detects that adjustments are made and the operation flow passes to receive adjustments operation 322. Adjust strategy operation receives the adjustment input from the user and uses that input to modify the strategy being used by the behavior emulator. The receive operation 322 also generates a corrections log 324 to record the strategy adjustment for subsequent use in a training cycle. After the adjust strategy operation, the operation flow returns to the behavior emulator to re-execute the strategy which has now been modified. The modified strategy 314 is used by the behavior emulator module 308 along with the subject knowledge and the profile to recommend a new situation solution. Depending on input from the user, the behavior emulator module 308 may just change one method in the strategy and execute the other methods in the strategy unchanged. Additionally, the behavior emulator may be instructed by the user that it is allowed to adopt other methods in the strategy subsequent to the method changed by the user.
Recognize method module 404 receives the situation features from feature extract operation 402, and it receives the method choices from the subject knowledge 312 and features/methods separation data from the profile 310. Using pattern recognition techniques in multi-dimensional space for all the situation features, and using the features/methods separation data, the recognize method module selects a method to be executed against the situation data.
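The description does not name a specific pattern recognition technique. As one possible illustration only, the sketch below uses a nearest-centroid rule: it assumes the features/methods separation data in the profile can be represented as one centroid of feature values per method, and selects the method whose centroid is closest to the situation features. The centroid values, the feature names and the `replenishment` method are hypothetical.

```python
import math


def recognize_method(features, centroids):
    """Return the method whose centroid is nearest to the situation features
    (Euclidean distance over the centroid's feature names)."""
    def distance(method_name):
        centroid = centroids[method_name]
        return math.sqrt(sum((features[k] - centroid[k]) ** 2 for k in centroid))

    return min(centroids, key=distance)


# Assumed stand-in for the features/methods separation data in the profile.
centroids = {
    "initial_allocation": {"quantity": 1000.0, "store_count": 50.0},
    "replenishment": {"quantity": 100.0, "store_count": 10.0},
}

selected = recognize_method({"quantity": 900.0, "store_count": 40.0}, centroids)
print(selected)
```

In practice the features would likely need to be scaled to comparable ranges before distances are computed, since an unscaled feature such as quantity dominates the result here.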
The operation flow for the recognize method module 404 is shown in
In the example of merchandise allocation, the selected method might be an initial allocation method, and the initial quantity of retail goods and retail stores would be the situation features. Save method operation 506 retrieves the operations for the selected method and saves the method operations 405 in working storage for subsequent use by execute method module 408 in
Recognize parameters module 406 in
Get operation 602 in
Save operation 606 saves the detected parameters calculation rule. A rules algorithm might require more parameters calculation rules for its execution, but some rules algorithms do not require any more parameters calculation rules and can be executed without any additional rules. If the last saved rule requires additional parameters calculation rules, then additional rules test operation 608 branches the flow to detect operation 604, and operations 604 and 606 repeat to detect and save the next parameters calculation rule for this rules algorithm. If the last saved rule does not require additional parameters calculation rules, the operation flow branches to the execute operation 610. The execute operation executes all saved parameters calculation rules. Save operation 612 saves the parameter values obtained in the previous step, and the operation flow returns to execute method module 408 in
Execute method module 408 in
The method retrieve operation 702 in
In the merchandise allocation example the solution data will be the quantity and type of goods allocated to each retail store, and the remainder data will be the remaining quantity and type of goods not allocated by the selected method. Update operation 710 saves the current method, its parameters, the solution data and the remainder data, and the operation flow returns to remainder test operation 410 in
In
If the remainder is not in an acceptable target range, the operation flow returns to recognize method module 404. Recognize method module 404, recognize parameters module 406 and execute method module 408 operate as described above except that now the situation features are limited to those represented by the remainder data. The operational flow stays in an operational loop through modules 404, 406, 408 and test operation 410 until R is in an acceptable target range. This completes the description of the operation flow in the usage cycle shown in
The training cycle of
Load operation 806 receives the observations 804, extracts the unique methods used, the parameter lists and limitations, and the features from those observations, and loads the extracted information into the subject knowledge information 312. The load operation could be manual at the beginning and made semi-automated as methods, parameters and features become familiar. The subject knowledge information 312 as described in the usage operational flow includes the methods library, the parameters library, and the features library. These libraries are used by the behavior construction module 810 to construct behavior information 812. Behavior information includes features and strategies as described above in reference to the usage cycle. The behavior construction module 810 is described in more detail hereinafter with reference to
After the behavior information has been constructed it is provided to the learning module 814. The behaviors include situation features and strategies. The learning module works with the features and strategies to create the profile 310 of expert decisions for use by the behavior emulator. After the learning module creates the profile 310, the training cycle is complete. The operation of the learning module is described hereinafter with reference to
Feature extraction operation 902 receives the situation data from observations 804 and the feature names and features calculation rules from the features library in the subject knowledge 312. The feature extract operation 902 uses the feature names and feature calculation rules to extract the situation features from the situation data.
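Feature extraction of this kind can be sketched as applying each named calculation rule to the situation data. The example below is a minimal illustration; the feature names, the rules and the shape of the situation data are assumptions chosen for the merchandise allocation example and do not come from the description.

```python
def extract_features(situation, feature_rules):
    """Apply every feature calculation rule to the situation data and
    return the resulting feature values keyed by feature name."""
    return {name: rule(situation) for name, rule in feature_rules.items()}


# Hypothetical features library: feature name -> calculation rule.
feature_rules = {
    "total_quantity": lambda s: s["quantity"],
    "store_count": lambda s: len(s["stores"]),
    "quantity_per_store": lambda s: s["quantity"] / len(s["stores"]),
}

situation = {"quantity": 120, "stores": ["A", "B", "C", "D"]}
features = extract_features(situation, feature_rules)
print(features)
```

Keeping the rules in a dictionary keyed by feature name lets new features be added to the library without changing the extraction operation itself.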
The strategy extraction module 904 receives the actions data from the observations information 804 and it receives the method choices from the methods library and the parameter limitations for parameters in the methods from the parameters library in the subject knowledge 312. From this information the strategy extraction operation determines a strategy associated with an action. The strategy is defined by methods and their parameter values. The strategy extraction module 904 is described in more detail hereinafter with reference to
In
The compare operation 1006 compares the situation/actions information with the method/parameters information and passes the results of the comparison onto the select operation 1008. Based on the historical association between the situation/actions and a method/parameters combination, the selection operation 1008 selects a behavior method along with its parameters to be used with the situation/actions combination by the user. The method and parameter values selected in operation 1008 are then saved as one strategy iteration (method and parameter values) as strategy information 907 (
After the method/parameter selection is saved, actions ended test 1012 detects whether there are more actions in the processing of the situation that need to be evaluated. If there are more actions, then the operation flow returns to retrieve operation 1002, which retrieves the next situation/actions combination from the observations information. Compare operation 1006 then finds the best behavior method/parameter combination to be used with the situation/actions combination. Select operation 1008 then selects the best method and parameters for the situation/actions combination and save operation 1010 saves that as the next method/parameters iteration in the strategy information 907. This cyclic loop continues until all actions have been handled by the strategy extraction operational flow. When all actions have been processed, the operational flow returns to the load behaviors operation 906 in
In
While the invention has been particularly shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that various other changes in the form and details may be made therein without departing from the spirit and scope of the invention.
This application is related to commonly assigned, U.S. application Ser. No. ______, entitled “MULTI-DIMENSIONAL, EXPERT BEHAVIOR-EMULATION SYSTEM” by ______ and concurrently filed herewith.