The present invention relates broadly to processor-based negotiation methods, and more specifically to computer readable media and systems for automated negotiation.
Market transactions typically include negotiations between buyers and sellers. Such negotiation may involve reiterative offers and counteroffers relating to the terms of the sale, e.g., price, delivery, warranty, payment options, and product features. Given the relative complexity of such negotiations, trying to automate such a negotiation process presents a significant challenge.
In the gaming world, many contests between parties have been successfully automated. In fact, an entire industry of automated games has been developed. Even traditional games have been successfully automated, e.g., chess, checkers, tic-tac-toe, etc. One technique for automating games is to use computer gaming strategy.
Computer gaming strategy often involves building a “game tree” that allows a computer to process the options presented as the game progresses in order to select options at each stage that will most likely result in a win. Other attempts to automate games have included heuristic approaches that attempt to emulate human behaviour. The game tree strategy, however, uses available computing power to produce comparable or better results than the heuristic models.
A game tree generally includes nodes representing the various states, stages, or positions that may occur during the game. A root node represents the current state. Under the root node, each child node represents an option or move that is available from the root node. When there are no further options available from a node, that node is a “terminal node”. Expanding the game tree to all its terminal nodes allows all possible options to be represented and considered.
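The structure described above can be sketched in a few lines. This is a minimal illustration, not the disclosed implementation: the `Node` class, the `expand` and `count_terminals` helpers, and the two discrete terms (`style`, `color`) with their options are hypothetical names chosen only for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One state of the game; `state` maps each term to its value (None if still open)."""
    state: Dict[str, Optional[str]]
    move: str = ""                                   # label of the move leading here
    children: List["Node"] = field(default_factory=list)

    def is_terminal(self) -> bool:
        # Terminal node: no further options, i.e. no term is still open.
        return all(value is not None for value in self.state.values())


def expand(node: Node, options: Dict[str, List[str]]) -> None:
    """Add one child per (open term, option) and recurse until terminal nodes are reached."""
    for term, value in node.state.items():
        if value is not None:
            continue
        for choice in options[term]:
            child_state = dict(node.state)
            child_state[term] = choice
            child = Node(child_state, move=f"{term}={choice}")
            node.children.append(child)
            expand(child, options)


def count_terminals(node: Node) -> int:
    """Count the terminal nodes below (and including) `node`."""
    if node.is_terminal():
        return 1
    return sum(count_terminals(child) for child in node.children)


# Hypothetical example with two discrete terms, each having two options.
root = Node({"style": None, "color": None})
expand(root, {"style": ["wingtip", "loafer"], "color": ["black", "brown"]})
```

Full expansion is only possible here because every term has a finite list of options; a continuously variable term such as price has no such list.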
Negotiations can present several complexities not normally found in a standard game or contest. For example, several variables or terms can be all negotiated at once, such as price, payment systems, delivery date, product specifications, etc. Some such variables may not present finite, discrete options or moves, e.g., as in price, volume, time of delivery, etc. Considering all such continuously variable options would require an infinite number of branches on a game tree.
Such complexities make using computers difficult in automating business negotiations or other contests having similar continuously variable attributes. For example, discretization to too fine a granularity makes the game trees too bushy because there are too many leaves. It can become impractical to evaluate each leaf. Discretization to too coarse a level can result in an optimum being missed.
Briefly, a processor-based method for automated negotiation with continuous game moves includes constructing a game tree with a root node that represents a current state of a negotiation; defining a range-term based on continuous game moves; treating the range-term as a single continuous variable in said game tree; halting expansion of a branch of said game tree at a range-term node for which only the range-term was changed in a move leading to said range-term node; and evaluating said game tree to select a next offer with an optimization process routine to determine an optimum payoff value for each range-term node.
In typical negotiations, the goal is to negotiate favorable variables, and correspondingly, not to enter into a bad deal. A negotiation may be viewed conceptually as a filling in of the blanks of a contract. Each blank corresponds to a variable of the deal. As each such variable is agreed upon, the corresponding blanks in the contract are filled. When all variables have been collected, the contract is complete and a deal has been made. If an automated process of negotiation is to proceed efficiently, variables cannot be re-negotiated once agreed to. Thus, certain constraints on the negotiation may be imposed in order to make the automated process efficient.
In a step 106, the offer validation begins and may include a determination of how the offer should be categorized and ultimately how to respond appropriately. For example, such validation may include identifying the opponent making the offer, and the subject matter of the offer, to ensure the offer relates to the current deal being negotiated. The process may further determine whether the offer constitutes a legal offer, as defined by the specific automated negotiation process. For example, step 108 determines whether the offer is a legal offer or if the offer violates some constraint. For purposes of automating the negotiation process, certain constraints may be placed on the negotiation process.
If the offer does not violate any of the constraints and thus qualifies as a legal offer, method 100 continues to step 110. In step 110, a game tree is constructed based on the offer received. Such constitutes the root node, from which the game tree is expanded via different levels of parent/child nodes. The game tree generally includes nodes representing the various states of the negotiation. The construction of a game tree is discussed in more detail in reference to
In step 114, a counteroffer is selected. Generally, the counteroffer will be based on the new state of the negotiation represented by the child node under the current root node that has the highest payoff of the available child nodes. The selected counteroffer can then be presented as a response to the opponent in step 116. Once the counteroffer has been presented to the opponent, method 100 may loop back to step 104 to await the receipt of another offer from the opponent in response to the counteroffer. Otherwise, the process may end at step 120. The negotiation process continues reiteratively until all variables of the deal have been completed, e.g., there are no more counteroffers available.
In method 100, a legal counteroffer is one that narrows the range of potential deals by eliminating or narrowing a variable of the deal. Such guards against getting trapped in a never-ending loop during the automated negotiation process. If an offer does not narrow or eliminate a variable, the overall negotiation is not advanced to a conclusion. There can be no valid counteroffer, and so the offer is rejected in a step 118. When an offer has been rejected, a response is presented to the opponent in the negotiation in step 116. In this case, the response would be a rejection of the illegal offer. Once a response has been presented to the opponent, method 100 may continue to step 104 to await the receipt of another offer, or the process may end at step 120.
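One way to read the legality rule above is as a check that an offer strictly narrows at least one variable and widens none. The sketch below assumes that reading; the `Domain` representation (a set of remaining options for a discrete variable, a `(low, high)` interval for a continuous one) and the function names `narrows` and `is_legal` are hypothetical and are not taken from the disclosure.

```python
from typing import Dict, FrozenSet, Tuple, Union

# Assumed representation: remaining options (discrete) or a closed interval (continuous).
Domain = Union[FrozenSet[str], Tuple[float, float]]


def narrows(old: Domain, new: Domain) -> bool:
    """Return True if `new` lies strictly inside `old`."""
    if isinstance(old, frozenset) and isinstance(new, frozenset):
        return len(new) > 0 and new < old            # non-empty proper subset
    if isinstance(old, tuple) and isinstance(new, tuple):
        (old_lo, old_hi), (new_lo, new_hi) = old, new
        inside = old_lo <= new_lo <= new_hi <= old_hi
        return inside and (new_lo, new_hi) != (old_lo, old_hi)
    return False                                     # mismatched kinds never narrow


def is_legal(current: Dict[str, Domain], offer: Dict[str, Domain]) -> bool:
    """Legal if no variable is widened and at least one variable is narrowed."""
    changed = False
    for term, old in current.items():
        new = offer.get(term, old)                   # untouched variables keep their domain
        if new == old:
            continue
        if not narrows(old, new):
            return False                             # widening or re-opening is rejected
        changed = True
    return changed                                   # an offer that changes nothing is illegal


# Hypothetical current state of a negotiation.
current = {"price": (100.0, 200.0), "style": frozenset({"wingtip", "loafer"})}
```

Because every legal offer shrinks at least one domain, a sequence of legal offers cannot revisit an earlier state, which is the loop-avoidance property the constraint is meant to provide.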
In the example embodiment, nodes that involve changing only range terms are marked and the process of constructing that portion of the game tree omitted. In particular, as the creation of the child and sibling nodes progresses in the game tree construction 200, nodes will develop in which only the range term is being modified in all subsequent nodes or moves. When this occurs, the pattern of nodes from the child is essentially identical to the pattern from the parent node. It has been recognized that since there is no way to predict how many rounds of offers may occur after that point, there is no benefit or need to construct an unknown quantity of repetitive nodes under the child node. Rather, the child node is marked as a range-term node and the game tree construction progresses with the creation of nodes in other branches. An optimization process routine is then used during the later evaluation process to determine the payoff value to be assigned to such range-term nodes.
The negotiation for which the game tree 200 is constructed and shown in
The construction of game tree 200 in
To illustrate this point, a further node branch labelled “P” and a node 206 are shown in an area 207. Node 206 represents a price offer from the other party to the negotiation further modifying the range term “P” on the branch from node 204.
In varying embodiments, any repeating branches are truncated, e.g., nodes 203 and 205 are simply marked as range-term nodes. So area 207 and node 206 would not exist. A further branch-expansion attempt however may be needed that produces node 206 in order to see that it should not have been expanded.
A node 210 can be identified and created as a child node to the root node, node 201. Child node 210 represents the combined moves of selecting wingtips and modification of the range term, e.g., the price in the example embodiment. Such combined move is represented on the branch to node 210 as a flag “W,P”. A node 211 is created as a child beneath node 210 since the price can continue to be changed. Accordingly, a flag “P” is shown on the branch between nodes 210 and 211. From this point, the range-term can continue to be modified by the parties indefinitely. Conventionally, an indefinite number of nodes could be created beneath node 211 which would all essentially represent the same state as node 211. So, rather than repeating nodes beneath node 211, the construction of this branch of the game tree 200 is halted. Node 211 is simply marked as a range-term node.
From root node 201, only the price counteroffer could be responsible for a move, as to a node 212. Such represents a counteroffer where only the price term has been modified, as indicated by the “P” flag. Price is a range-term in this example embodiment, so multiple paths can be taken from node 212. These options are illustrated in an area 213 in
Game tree 200 is expanded during an evaluation process for optimization process routines to assign payoff values to each of the nodes. Further construction in each branch of the tree 200 is abbreviated whenever range-term nodes result in identical child patterns. Such pruning strategy can be formulated by stopping expansion of the tree once a node is reached for which in the prior move leading to that particular node only the range-term was changed.
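The stopping rule stated above can be illustrated with a small construction routine. This is a simplified sketch under stated assumptions: each discrete term is fixed by a single move (its individual options are ignored), the range-term is labelled `P`, and the `Node`, `build`, and `leaves` names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List

RANGE = "P"  # flag for the range-term (the price in the example embodiment)


@dataclass
class Node:
    open_terms: FrozenSet[str]            # discrete terms not yet agreed
    move: str = ""                        # flag on the branch leading to this node
    range_term_node: bool = False         # marked instead of being expanded
    children: List["Node"] = field(default_factory=list)


def build(node: Node) -> None:
    """Expand `node`, halting where the incoming move changed only the range-term."""
    if node.move == RANGE:
        node.range_term_node = True       # mark and stop; payoff is assigned at evaluation
        return
    for term in sorted(node.open_terms):
        rest = node.open_terms - {term}
        # Fix the discrete term alone, or fix it together with a range-term change.
        for flag in (term, f"{term},{RANGE}"):
            child = Node(rest, move=flag)
            node.children.append(child)
            build(child)
    # A move that changes only the range-term is always available.
    range_only = Node(node.open_terms, move=RANGE)
    node.children.append(range_only)
    build(range_only)


def leaves(node: Node) -> List[Node]:
    """Collect the leaf nodes below `node`."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


# One discrete term "W" (e.g. wingtips) plus the range-term.
root = Node(frozenset({"W"}))
build(root)
```

In this sketch the root has three branches flagged `W`, `W,P`, and `P`, and every leaf of the finished tree is a marked range-term node, so construction terminates without enumerating price increments.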
Importantly, game tree 200 does not include any terminal nodes. Moreover, the game tree 200 has only range-term nodes as its leaves. In prior art approaches, such game tree construction is only finished when all of the terminal nodes have been completed. But such is open-ended because there is an infinite number of possible increments for the continuous variables.
An optimization process routine is used in the evaluation of game tree 200. The range-term nodes are extended for different assumed changes of the range-term in the next and subsequent moves until terminal nodes are reached and the payoff at the terminal nodes can be computed. Such payoffs are propagated back to the range-term node where the result is used to determine a new assumed change in the range-term as part of the optimization process.
For each of the range-term nodes 302, 304, 306, 308 and 309, the evaluation process involves the logical extension of the game tree under the respective range term nodes, assuming an initial change in the range-term. The logical extension proceeds until terminal nodes have been reached, and the payoff value is computed for each terminal node, where all variables, including the range-term, are now fixed. The payoff value is determined in the example embodiment using utility functions of the parties involved. Once the payoff value for each terminal node has been calculated, the value for each non-terminal node can be determined. The value for each non-terminal node is that of the child with the largest value for the party that can move to it. In that way, the values of the terminal nodes are propagated to the range-term nodes. At the range-term nodes, the payoff is used to calculate a new value to try for the change in the range-term, as part of the optimization process.
Once optimum payoffs have been determined for each of the range-term nodes, their respective payoff values are then propagated up the game tree 300 to the root node 301. The value for each non-terminal node during the propagation is that of the child with the largest value for the party that can move to it. It is noted that, therefore, if a particular party is not making the next move from a particular non-terminal node, the value of that non-terminal node is not the child with the largest value for that party, but rather the value of the child which has the largest value for the other party, which is the one that can move to it.
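The propagation rule can be sketched as a short recursive function. The example assumes two parties labelled `seller` and `buyer`, payoffs stored per party in a dictionary, and leaf payoffs that have already been computed; the `Node` class, the `propagate` function, and the numeric payoffs are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Payoff = Dict[str, float]  # payoff to each party, e.g. {"seller": 3.0, "buyer": 1.0}


@dataclass
class Node:
    mover: str = ""                        # party making the next move from this node
    payoff: Optional[Payoff] = None        # preset on leaves, filled in by propagate()
    children: List["Node"] = field(default_factory=list)


def propagate(node: Node) -> Payoff:
    """Assign each non-leaf node the payoff of the child its mover prefers."""
    if not node.children:
        if node.payoff is None:
            raise ValueError("leaf node has no payoff")
        return node.payoff
    child_payoffs = [propagate(child) for child in node.children]
    # The party that moves from this node picks the child with its own largest value.
    node.payoff = max(child_payoffs, key=lambda p: p[node.mover])
    return node.payoff


def leaf(seller: float, buyer: float) -> Node:
    return Node(payoff={"seller": seller, "buyer": buyer})


# The seller moves from the root; the buyer moves from each child.
a = Node(mover="buyer", children=[leaf(5.0, 1.0), leaf(2.0, 4.0)])
b = Node(mover="buyer", children=[leaf(3.0, 3.0), leaf(4.0, 2.0)])
root = Node(mover="seller", children=[a, b])
propagate(root)
```

Here the buyer's choices give node `a` a seller payoff of 2.0 and node `b` a seller payoff of 3.0, so the seller's best move from the root is toward `b`, even though the single best seller outcome (5.0) sits under `a`.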
Once all the payoff values have been so propagated, and the child nodes beneath the root node 301 are all assigned a payoff value, the child node of the root node 301 with the highest available payoff is usually selected as the best option to move, e.g., to make a counteroffer in the context of the example embodiment. It is noted that the move from the root node 301 is, in the context of the example embodiment, always a move made by the party evaluating the game tree at that particular time.
For example, in
In the evaluation in relation to range-term node 312, its payoff value to the party assessing the game tree, in the example embodiment the seller, is that of its child, one of nodes 314 to 318, which has the highest payoff for the buyer, since it is the buyer that makes the next move from node 312. Conversely, the payoff value to the seller of node 318, for example, is that of its child, one of nodes 321 to 322, with the highest payoff value to the seller, who is making the next move from node 318.
In the example embodiment, the evaluation involving a logical extension of the game tree under node 312 (dotted area 313) is repeated for different assumed initial changes in the range-term from node 312, until an optimum payoff for range-term node 312 has been identified using a chosen optimization process routine.
Various conventional optimization process routines can be used to calculate a payoff for a limited number of range term values. An optimal range term value may be estimated using linear or non-linear mathematical techniques. The payoff value for the optimal value of the range term can then be computed and that payoff value may be used for the marked range-term node. Thus, the assignment of a payoff value can occur in the evaluation process even though that portion of the tree has not been fully expanded in the initial creation of the game tree 300. Rather, the evaluation involves a logical extension of the game tree under each range-term node for selected changes in the range-term, e.g., a finite extension, as part of the optimization process of the example embodiment.
One simple example optimization process routine involves each of the parties having specified minimum and maximum concessions for each continuous variable contributing to the range-term, in the example embodiment only one, the price. The optimization involves computing the payoff of a range-term node for each of the specified minimum and maximum concessions, and their mean value, then fitting a parabola through these three points. The fourth try is the concession corresponding to the maximum of the parabola if it lies on this interval.
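The three-point parabola step can be sketched as follows. The function names `parabola_vertex` and `fourth_try` are hypothetical, and `payoff` stands in for whatever routine computes the payoff of a range-term node for a given concession; the quadratic payoff used in the example is invented purely for illustration.

```python
from typing import Callable, Optional, Tuple

Point = Tuple[float, float]


def parabola_vertex(p0: Point, p1: Point, p2: Point) -> Optional[float]:
    """Return the x of the maximum of the parabola through three points, if it has one."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    # Lagrange form of the interpolating quadratic y = a*x**2 + b*x + c.
    d0 = (x0 - x1) * (x0 - x2)
    d1 = (x1 - x0) * (x1 - x2)
    d2 = (x2 - x0) * (x2 - x1)
    a = y0 / d0 + y1 / d1 + y2 / d2
    b = -(y0 * (x1 + x2) / d0 + y1 * (x0 + x2) / d1 + y2 * (x0 + x1) / d2)
    if a >= 0:
        return None                       # no interior maximum (line or upward parabola)
    return -b / (2.0 * a)


def fourth_try(payoff: Callable[[float], float],
               min_concession: float,
               max_concession: float) -> Optional[float]:
    """Concession at the parabola's maximum, or None if it falls outside the interval."""
    mean = (min_concession + max_concession) / 2.0
    points = [(c, payoff(c)) for c in (min_concession, mean, max_concession)]
    vertex = parabola_vertex(*points)
    if vertex is None or not (min_concession <= vertex <= max_concession):
        return None
    return vertex


# Invented payoff curve with its maximum at a concession of 3.0.
best = fourth_try(lambda c: -(c - 3.0) ** 2 + 10.0, 0.0, 10.0)
```

For this invented curve the three sampled points are (0, 1), (5, 6), and (10, -39), and the fitted parabola peaks at a concession of 3. When the fitted parabola has no maximum on the interval, the sketch returns `None` and the caller would fall back on the best of the three sampled concessions.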
Generally, ROM transfers data and instructions unidirectionally to CPU 402, while RAM typically transfers data and instructions in a bi-directional manner. Both storage devices 404, 406 may comprise any suitable computer-readable media. A secondary storage medium 408, which is typically a mass memory device, is also coupled bi-directionally to CPU 402 and provides additional data storage capacity. The mass memory device 408 is a computer-readable medium that may be used to store programs including computer code, data, and the like. Mass memory device 408 is typically a storage medium utilizing a non-volatile memory such as a hard disk or a tape that is generally slower than primary storage devices 404, 406. Mass memory storage device 408 may take the form of a magnetic or paper tape reader or other known devices. The information retained within the mass memory device 408 may, in appropriate cases, be incorporated as part of RAM 406 as virtual memory.
CPU 402 also couples to input/output devices 410 that may include devices such as video monitors, track balls, mice, keyboards, microphones, touch-sensitive displays, transducer card readers, magnetic or paper tape readers, tablets, styluses, voice or handwriting recognizers, or other known input/output devices including other computers. Finally, CPU 402 optionally may be coupled to a computer or telecommunications network, e.g., an Internet network, or an intranet network, using a network connection as shown generally at 412. With such a network connection, CPU 402 may receive information from the network, or may output information to the network in the course of performing the processes and methods in accordance with the disclosure herein. Such information is often represented as a sequence of instructions to be executed using CPU 402. The information may be received from and sent to the network, for example, in the form of a computer data signal embodied in a carrier wave.
In one embodiment, sequences of instructions may be executed substantially simultaneously on multiple CPUs, as for example a CPU in communication across network connections. Specifically, the above-described method may be performed across a computer network. Additionally, one of skill in the art will recognize that the method may be recognized as sets of computer codes and that such computer codes can be stored in computer readable media such as RAM, ROM, hard discs, floppy discs, carrier waves or other storage devices or media.
In accordance with an embodiment, there is provided a system for automated negotiation with continuous game moves. The system includes at least one processor and at least one associated memory device storing instructions for causing the at least one processor to construct a game tree with a root node that represents a current state of a negotiation; evaluate the game tree; select a next offer based on an output of the evaluating of the game tree; define a range-term based on the continuous game moves and treat the range-term as a single continuous variable in the game tree; and halt expansion of a branch of the game tree at a range-term node for which only the range-term was changed in the move leading to said range-term node. The instructions include an optimization process routine for causing the processor to determine an optimum payoff value for each range-term node. The optimization process routine may further cause the at least one processor to select a probe discretization for a range-term; expand a game tree underneath a range-term node based on said probe discretization; compute a payoff for a range-term node based on an expanded game tree; and select a new probe discretization.
Selecting the probe discretization may depend on the selection of constraints in the continuous game moves.
In another embodiment, there is provided a computer program comprising program code instructing a computer to execute a procedure to perform a method for automated negotiation with continuous game moves. The method includes the steps of: a) constructing a game tree, wherein a root node of the game tree represents a current state of the negotiation; b) evaluating the game tree; and c) selecting a next offer based on step b). Step a) includes defining a range-term based on the continuous game moves and treating the range-term as a single continuous variable in the game tree; and halting expansion of a branch of the game tree at a range-term node for which only the range-term was changed in the move leading to said range-term node. Step b) includes applying an optimization process routine to determine an optimum payoff value for each range-term node.
It will be appreciated by the person skilled in the art that numerous modifications and/or variations may be made to the present invention as shown in the specific embodiments without departing from the spirit or scope of the present invention as broadly described. The present embodiments are, therefore, to be considered in all respects to be illustrative and not restrictive.