There is a desire to provide a way to determine the relative skills of players of games such as computer games, chess, tennis, and any other type of game. This needs to be achieved in a manner whereby the indication of relative skill is as accurate as possible and is also understood and accepted by end users (i.e. game players). In addition, the relative skills need to be determined quickly, even for games involving many players and for games between many teams of players, each team having many members. This is particularly problematic because in these situations computational complexity typically increases significantly. That is, there is a need to determine skills of players after very few game outcomes and/or with computational efficiency. Players can be human players or computer programs.
Bayesian statistical techniques have been used to determine indications of player relative skill. However, it is desired to improve the accuracy of these previous approaches whilst at the same time addressing the issues of computational complexity.
The embodiments of the invention described below are not limited to implementations which solve any or all of the disadvantages mentioned above.
The following presents a simplified summary of the disclosure in order to provide a basic understanding to the reader. This summary is not an extensive overview of the disclosure and it does not identify key/critical elements of the invention or delineate the scope of the invention. Its sole purpose is to present some concepts disclosed herein in a simplified form as a prelude to the more detailed description that is presented later.
A process for determining relative player skills and draw margins is described. Information about an outcome of a game between at least a first player opposing a second player is received. Also, for each player, skill statistics are received associated with a distribution representing belief about skill of that player. Draw margin statistics are received associated with a distribution representing belief about ability of that player to force a draw. An update process is performed to update the statistics on the basis of the received information about the game outcome. In an embodiment a Bayesian inference process is used during the update process which may take past and future player achievements into account.
Many of the attendant features will be more readily appreciated as the same becomes better understood by reference to the following detailed description considered in connection with the accompanying drawings.
The present description will be better understood from the following detailed description read in light of the accompanying drawings, wherein:
Like reference numerals are used to designate like parts in the accompanying drawings.
The detailed description provided below in connection with the appended drawings is intended as a description of the present examples and is not intended to represent the only forms in which the present example may be constructed or utilized. The description sets forth the functions of the example and the sequence of steps for constructing and operating the example. However, the same or equivalent functions and sequences may be accomplished by different examples.
As mentioned above, Bayesian statistical techniques have previously been used to determine indications of player relative skill, for example as described in U.S. Pat. No. 7,050,868 entitled Bayesian Scoring and European patent application number EP06270014, which are both incorporated herein by reference in their entirety.
An example method for determining indications of player relative skill as described in U.S. Pat. No. 7,050,868 and EP06270014 is now summarized to aid in understanding embodiments of the present invention. At a high level, this method involves Bayesian inference techniques. Belief about a skill of each player of a game is modeled using a probability distribution of any suitable type which is described by statistics. For example, a Gaussian distribution is used which is uniquely described by its mean and standard deviation. The mean is used to model the average player skill belief and the standard deviation models the uncertainty associated with assessment of the player's skill.
As shown in
The system assumes that the accessed statistics may have changed slightly between the current and the last game played by each player. This is achieved by slightly increasing the skill uncertainty statistic by an amount which is a configurable parameter of the system. An update process is applied to update the statistics in the light of the game outcome information (block 102). This update process is described in more detail below. The accessed statistics are then replaced by the updated statistics (block 103) and the resulting ranked skills of the players may then be stored (block 104). That is, the updated skill belief statistics are stored and the previous values of those statistics may be discarded.
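The access, widen, update and store steps described above can be sketched as follows. This is a minimal illustration only: the function names, the dictionary used as the store, the value of tau and the choice to widen the uncertainty by adding variances are all assumptions made for the example, and the update process itself is passed in as a placeholder.

```python
import math

def inflate(sigma, tau):
    # Assumption: uncertainty grows by adding variances, sigma'^2 = sigma^2 + tau^2.
    return math.sqrt(sigma ** 2 + tau ** 2)

def process_game(store, players, outcome, update, tau=0.1, default=(3.0, 1.0)):
    """Access stored (mu, sigma) pairs, widen them, apply `update`, overwrite the store."""
    prior = {}
    for p in players:
        mu, sigma = store.get(p, default)      # default belief for a new player
        prior[p] = (mu, inflate(sigma, tau))   # skill may have drifted since the last game
    posterior = update(prior, outcome)         # update process (block 102)
    store.update(posterior)                    # replace and store (blocks 103 and 104)
    return store

# Usage with a stand-in update that leaves the widened beliefs unchanged.
store = process_game({"alice": (3.0, 1.0)}, ["alice", "bob"], "alice",
                     lambda prior, outcome: prior)
```

The only effect visible with the stand-in update is the widened uncertainty; a real update process would also move the means.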
The game outcome information may comprise player performances. A configurable parameter, called a draw margin, is used to assess whether the player performances are close enough together for the game outcome to be considered a draw.
More detail about the update process is now given for the case of a two-player match.
The update process comprises determining the probability of the observed game outcome for given skills of the participating players and weighting it by the probability of the corresponding skill beliefs. This is done by averaging over all possible performances (weighted by their probability density values) and deriving the game outcome from the performances: The player with the highest performance is the winner; the player with the second highest performance is the first runner up, and so on. If two players' performances are very close together, then the system considers the outcome between these two players a draw. The larger the margin which defines a draw in a given league, the more likely a draw is to occur, according to the ranking system. The size of this margin is a configurable parameter of the ranking system and is adjusted based on the game mode. For example, a street race in a car racing game involving two players may almost never end in a draw if crossing the finishing line is measured with high accuracy (thus the parameter is set to almost zero). In contrast, a game between two opponents for points over a fixed period of time (like football) can easily end in a draw.
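The effect of the draw margin on the likelihood of a draw can be illustrated numerically. The sketch below assumes Gaussian skill beliefs (μ, σ) for two players and Gaussian performance noise of variance β² around each skill, so that the performance difference is Gaussian with mean μA−μB and variance 2β²+σA²+σB²; a draw is then the event that the difference lies within ±ε. The function names and the default β² of 0.5 are assumptions for illustration.

```python
import math

def phi_cdf(x):
    # Cumulative distribution function of the standard Gaussian.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def draw_probability(mu_a, sigma_a, mu_b, sigma_b, eps, beta2=0.5):
    """Probability that |performance_A - performance_B| <= eps under the assumed model."""
    c = math.sqrt(2.0 * beta2 + sigma_a ** 2 + sigma_b ** 2)
    delta = mu_a - mu_b
    return phi_cdf((eps - delta) / c) - phi_cdf((-eps - delta) / c)

# A near-zero margin (a precisely timed race) versus a wider margin.
p_race = draw_probability(3.0, 1.0, 3.0, 1.0, eps=0.01)
p_wide = draw_probability(3.0, 1.0, 3.0, 1.0, eps=1.0)
```

With the margin close to zero the draw probability is close to zero, and it grows monotonically as the margin widens.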
By virtue of the above weighting technique (which is based on Bayes' Law), the system arrives at a new skill belief for every player participating in the game. These skill beliefs are no longer Gaussian. Hence, the ranking system determines the best Gaussian approximation. As a result, a given player's μ value increases for each opponent they out-performed and decreases for each opponent they lost against.
The simplest case for the ranking system update is a two-person match. Suppose we have players A(lice) and B(ob), with μ and σ values (μA,σA) and (μB,σB), respectively. Once the game has finished, the update algorithm determines the winner (Alice or Bob) and loser (Bob or Alice) and applies the following update equations:
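$$\mu_{\text{winner}} \leftarrow \mu_{\text{winner}} + \frac{\sigma^2_{\text{winner}}}{c}\, v\!\left(\frac{\mu_{\text{winner}} - \mu_{\text{loser}}}{c}, \frac{\varepsilon}{c}\right)$$
$$\mu_{\text{loser}} \leftarrow \mu_{\text{loser}} - \frac{\sigma^2_{\text{loser}}}{c}\, v\!\left(\frac{\mu_{\text{winner}} - \mu_{\text{loser}}}{c}, \frac{\varepsilon}{c}\right)$$
$$\sigma^2_{\text{winner}} \leftarrow \sigma^2_{\text{winner}} \left[1 - \frac{\sigma^2_{\text{winner}}}{c^2}\, w\!\left(\frac{\mu_{\text{winner}} - \mu_{\text{loser}}}{c}, \frac{\varepsilon}{c}\right)\right]$$
$$\sigma^2_{\text{loser}} \leftarrow \sigma^2_{\text{loser}} \left[1 - \frac{\sigma^2_{\text{loser}}}{c^2}\, w\!\left(\frac{\mu_{\text{winner}} - \mu_{\text{loser}}}{c}, \frac{\varepsilon}{c}\right)\right]$$
$$c^2 = 2\beta^2 + \sigma^2_{\text{winner}} + \sigma^2_{\text{loser}}$$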
In these equations, one unknown is β², which is the variance of the performance around the skill of each player and is typically configured as 0.5. Moreover, ε is the aforementioned draw margin, which is a configurable parameter. The functions v(·,·) and w(·,·) are given by
$$v(t,\alpha) = \frac{N(t-\alpha)}{\Phi(t-\alpha)}, \qquad w(t,\alpha) = v(t,\alpha)\,\bigl(v(t,\alpha) + t - \alpha\bigr)$$
if the game ends in win and loss or
$$v(t,\alpha) = \frac{N(-\alpha-t) - N(\alpha-t)}{\Phi(\alpha-t) - \Phi(-\alpha-t)}, \qquad w(t,\alpha) = v^2(t,\alpha) + \frac{(\alpha-t)\,N(\alpha-t) + (\alpha+t)\,N(\alpha+t)}{\Phi(\alpha-t) - \Phi(-\alpha-t)}$$
if the game ends in a draw. Here the symbols N and Φ represent the density of the Gaussian distribution function and the cumulative distribution function of the Gaussian, respectively. The symbols t and α are simply arguments to the functions. Any suitable numerical or analytic methods can be used to evaluate these functions, such as those described in Press et al., Numerical Recipes in C: the Art of Scientific Computing (2d. ed.), Cambridge, Cambridge University Press, ISBN-0-521-43108-5.
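The two-player update can be sketched in code as follows. This is an illustration under stated assumptions rather than a definitive implementation: the closed forms used for v and w are the commonly published ones for this kind of Gaussian update (ratios of the standard Gaussian density and cumulative distribution function), the function names are hypothetical, each belief is passed as a (μ, σ²) pair holding the variance rather than the standard deviation, and no care is taken over numerical underflow for extreme arguments.

```python
import math

def pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def v_win(t, a):
    return pdf(t - a) / cdf(t - a)

def w_win(t, a):
    v = v_win(t, a)
    return v * (v + t - a)

def v_draw(t, a):
    return (pdf(-a - t) - pdf(a - t)) / (cdf(a - t) - cdf(-a - t))

def w_draw(t, a):
    v = v_draw(t, a)
    denom = cdf(a - t) - cdf(-a - t)
    return v * v + ((a - t) * pdf(a - t) + (a + t) * pdf(a + t)) / denom

def update_two_player(winner, loser, eps, beta2=0.5, draw=False):
    """winner, loser: (mu, sigma2) pairs. For a draw, pass the players in either order with eps > 0."""
    (mu_w, var_w), (mu_l, var_l) = winner, loser
    c = math.sqrt(2.0 * beta2 + var_w + var_l)
    t, a = (mu_w - mu_l) / c, eps / c
    v, w = (v_draw(t, a), w_draw(t, a)) if draw else (v_win(t, a), w_win(t, a))
    new_winner = (mu_w + var_w / c * v, var_w * (1.0 - var_w / c ** 2 * w))
    new_loser = (mu_l - var_l / c * v, var_l * (1.0 - var_l / c ** 2 * w))
    return new_winner, new_loser

# Two players with identical beliefs; the first one wins, draw margin zero.
alice, bob = update_two_player((3.0, 1.0), (3.0, 1.0), eps=0.0)
```

For two identical beliefs the winner's mean moves up and the loser's mean moves down by the same amount, and both variances shrink. A draw between identical beliefs leaves the means unchanged while still reducing the variances.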
There are a few observations about these update equations:
In the case of a team match the team's skill is assumed to be a function of the skills of the players. In a preferred embodiment, this function is the sum. The algorithm determines the sum of the skills of the two teams and uses the above two equations where (μwinner,σ2winner) and (μloser,σ2loser) are the mean skills and skill variances of the winning and losing team, respectively.
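As a brief illustration of the sum model, the team statistics can be formed from the member statistics before the two-team update is applied. The sketch assumes independent Gaussian beliefs for the members, so that means and variances both add; the function name is hypothetical.

```python
def team_belief(players):
    """players: list of (mu, sigma2) pairs; returns the team (mu, sigma2) under the sum model."""
    mu = sum(m for m, _ in players)
    var = sum(s2 for _, s2 in players)   # assumption: member skills are independent
    return mu, var

winning_team = team_belief([(3.0, 1.0), (2.0, 0.5)])
losing_team = team_belief([(4.0, 1.0)])
```

The resulting pairs take the place of (μwinner, σ²winner) and (μloser, σ²loser) in the update equations.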
The update equations for more than two teams require numerical integration. In this case the ranking system iterates two-team update equations between all teams on neighbouring ranks, that is, the 1st versus the 2nd team, the 2nd team versus the 3rd team and so on. The computational complexity increases cubically for more than two teams as a result of the numerical integration required for the v and w functions. This is addressed by using factor graphs with message passing techniques to reduce the computation required in multi-team situations as now described with reference to
With reference to
If a player has played before and skill information has been stored for that player, that information is accessed. In the case of a new player a default belief distribution with associated default statistics is used, for example an initial μ of 3 and σ of 1. Any suitable default belief distribution is used.
Information about the game outcome is obtained (see block 205 of
More detail about the process of forming the factor graph is now given with reference to
Each player is represented by a variable node for their skill connected to a set of nodes which relate to their skill and their performance in the particular game. In
As illustrated in
The factor nodes at the top of the diagram (row 306) are functions which access a database or other store to obtain belief distributions for each player (or use a default belief distribution in the case of a new player). These computational units feed the parameters describing the player skill belief distributions into the corresponding variable nodes. For example, in the case of Gaussian distributions there would be two parameters (two floating-point numbers) stored in each variable node. The next row of variable nodes, that is, the circular nodes 307 connected in series to the top leaf nodes, represent the player skills. These nodes each store the statistics describing the belief distribution for the associated player. The next row of factor nodes are computation units 308 which compute player performance on the basis of, in this example, player skill plus noise. That is, the skill belief distributions of units 307 are modified by increasing their variance parameters and the results are stored in the row of variable nodes 309 representing player performance. This is a deterministic computation, though it can be thought of as adding noise to the underlying random variables.
In order to obtain a representation of team performance as opposed to individual player performance the columns are combined as indicated in
In a preferred embodiment as illustrated in
Team performance differences are represented by nodes in row 312 and each is calculated as a difference between certain nodes in the team performance layer 311 as indicated. When the game outcome provides a total ordering of the teams, then differences are calculated between consecutive teams in the ordering. In the case of a draw between teams, the teams which drew are placed in an arbitrary order amongst themselves and differences are calculated between consecutive teams in the ordering. For example, in
The bottom nodes in the graph are factor nodes which represent a calculation process encouraging the team performance difference to be greater than the draw margin ε (if no draw) or less than the draw margin in absolute value (in case of a draw).
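The downward flow through the layers described above (skill, performance, team performance, team performance difference) can be sketched for Gaussian beliefs as follows. This shows only the first pass down the graph, before any order factor sends information back up; the function names, the (μ, σ²) pair representation and the noise variance β² of 0.5 are assumptions for illustration.

```python
def add_noise(skill, beta2):
    # Performance = skill plus noise: the mean is kept, the variance grows by beta2.
    mu, var = skill
    return (mu, var + beta2)

def team_sum(perfs):
    # Team performance as the sum of member performances (weightings of 1).
    return (sum(m for m, _ in perfs), sum(v for _, v in perfs))

def difference(a, b):
    # Difference of two independent Gaussians: means subtract, variances add.
    return (a[0] - b[0], a[1] + b[1])

def forward_pass(teams, beta2=0.5):
    """teams: list of teams in finishing order, each a list of (mu, sigma2) skills."""
    team_perfs = [team_sum([add_noise(s, beta2) for s in team]) for team in teams]
    return [difference(team_perfs[i], team_perfs[i + 1])
            for i in range(len(team_perfs) - 1)]

# Three teams in finishing order; the second team has two members.
diffs = forward_pass([[(3.0, 1.0)], [(3.0, 1.0), (2.0, 1.0)], [(4.0, 0.5)]])
```

Each entry of the result is the belief about one difference between consecutive teams, which is what the bottom factor nodes then constrain according to the observed ordering.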
The process of message passing comprises carrying out a calculation associated with a computation node (square node in
The processing schedule is preferably divided into three phases: pre-processing, chain processing, and post-processing. An example pre-processing schedule is illustrated in
After one step of pre-processing, a chain processing schedule is iterated until the belief distributions stop changing, i.e., a suitable convergence criterion has been satisfied. An example chain schedule is indicated in
The general update equations for use in carrying out the computations along the arrows in the message passing process are now given. These general update equations are tailored for use with Gaussian distributions as shown.
Factor Node Update with Gaussian Messages
Consider the factor graph of
Suppose we would like to update the message mf→x and the marginal probability density px. Then, the general update equations are as follows:
where MM[·] returns the distribution in the Gaussian family with the same moments as the argument and all quantities on the right are normalized to be distributions. In the following we use the exponential representation of the Gaussian, that is,
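$$G(x;\tau,\pi) \propto \exp\!\left(-\tfrac{1}{2}\left(\pi x^2 - 2\tau x\right)\right)$$
This density has the following relation to the standard density
$$G(x;\tau,\pi) = N\!\left(x;\ \frac{\tau}{\pi},\ \frac{1}{\pi}\right), \qquad N(x;\mu,\sigma^2) = G\!\left(x;\ \frac{\mu}{\sigma^2},\ \frac{1}{\sigma^2}\right)$$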
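In code, the exponential representation is convenient because multiplying or dividing Gaussian messages reduces to adding or subtracting parameters. The sketch below assumes the usual convention π = 1/σ² (precision) and τ = μ/σ² (precision-adjusted mean); the function names are hypothetical.

```python
def to_natural(mu, sigma2):
    # (mean, variance) -> (tau, pi)
    return (mu / sigma2, 1.0 / sigma2)

def from_natural(tau, pi):
    # (tau, pi) -> (mean, variance); requires pi > 0
    return (tau / pi, 1.0 / pi)

def multiply(a, b):
    # Product of two Gaussian densities (up to normalization): parameters add.
    return (a[0] + b[0], a[1] + b[1])

def divide(a, b):
    # Quotient of two Gaussian densities (up to normalization): parameters subtract.
    return (a[0] - b[0], a[1] - b[1])

product = multiply(to_natural(0.0, 1.0), to_natural(2.0, 1.0))
mean, variance = from_natural(*product)   # mean 1.0, variance 0.5
```

Dividing the product by one of its factors recovers the other factor, which is the operation used when a message is removed from a marginal.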
In the case of the exact factor nodes the update equations are given in the following table.
In the case of the order factor nodes, the update equations are given in the following table.
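In the update equations set out in the tables above, a represents weightings which in a preferred example are set to 1. Also, in the update equations v and w correspond to the functions v(·,·) and w(·,·) given by
$$v(t,\alpha) = \frac{N(t-\alpha)}{\Phi(t-\alpha)}, \qquad w(t,\alpha) = v(t,\alpha)\,\bigl(v(t,\alpha) + t - \alpha\bigr)$$
if the game ends in win and loss or
$$v(t,\alpha) = \frac{N(-\alpha-t) - N(\alpha-t)}{\Phi(\alpha-t) - \Phi(-\alpha-t)}, \qquad w(t,\alpha) = v^2(t,\alpha) + \frac{(\alpha-t)\,N(\alpha-t) + (\alpha+t)\,N(\alpha+t)}{\Phi(\alpha-t) - \Phi(-\alpha-t)}$$
if the game ends in a draw. They may be determined from the numerical approximation of a Gaussian and Gaussian cumulative distribution without using message passing.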
In the example shown in
In the case of exact factor nodes, for message passing from a computation node (square node) to a single variable node (circular node) the update equations of the first row of the exact factor node update equation table are used. In the case of message passing from a computation node to two variable nodes the update equations of the second or third row of the table are used as appropriate. In the case of message passing from a computation node to three variable nodes the update equations of the fourth and fifth rows of that table are used as appropriate.
Taking into Account Past and Future Player Achievements
Most previous skill estimation systems operate in a filtering mode whereby they take into account only past game outcomes to estimate skill. This means that if player A beats an unknown player B and it later turns out that player B was in fact strong (e.g. by player B later repeatedly beating known-to-be-strong player C), these filtering-based methods are not able to retroactively correct A's skill estimate upwards.
In many situations, information about both past and future player achievements is available. Embodiments of the present invention which use such information to improve accuracy of player skill estimates are now described. These embodiments may be thought of as comprising a smoothing of a time series of player skills which takes into account past as well as future player achievements.
For each time period of the time series, information is accessed about outcomes of games played in that time period (block 1401). For example, for year 1, outcomes of all games in which player A took part are accessed.
Within each time period a first update process is applied. This updates the skill belief statistics on the basis of the game outcome information (block 1402). For example, for year 1, the game outcomes for player A are used to update skill belief statistics for player A for that year. This is repeated for all players and for all years (or other time periods).
The process also comprises applying a second update process to update the skill belief statistics both forwards and backwards in the time series (block 1403). The first and second update processes may be integrated rather than carried out in series. The block diagram of
The resulting skill belief statistics are then stored (block 1404) and may be used as input to a matchmaking system, a system for displaying player rankings to end users or any other system which uses skill belief statistics.
It is also possible for the time series to comprise other statistics instead of or in addition to the mean and standard deviation values representing skill belief. Any suitable time series may be used having an ordered sequence of sets of statistic values taken at regular time intervals.
In some embodiments the update processes of
An example method of using a factor graph with message passing techniques to carry out the update processes, both within a time series interval or unit and forwards and backwards through the time series, is now described with reference to
Exemplary Method using Message Passing
For each player, a time series of skill belief statistics is accessed (block 1600) as mentioned above with reference to
More detail about the process of forming the factor graph is now given with reference to
The example factor graph of
For each player, a time series of skill belief statistics is represented by a row of variable nodes. For example, player 1 has a time series shown in row 1702, player 2 has a time series shown in row 1703 and player 3 has a time series shown in row 1704. In this example, a time series unit or interval is a year so that player skill statistics for year 1 are represented in the column labeled “year 1” and so on for years 2, 3 and 4.
For player 1, the time series of skill statistics from year 1 to year 4 is shown in row 1702. It comprises skill belief statistics stored in variable nodes S11, S12, S13, S14. Those variable nodes are linked together with a factor node 1711 between each variable node. The factor nodes 1711 act to add noise to the skill belief statistics reflecting the belief that skill in year t+1 is the same as skill in year t but corrupted by noise. The skill belief statistics for this player, for year 1, are initialized either to default values or using values read from a database as indicated by factor node 1712. For player 2, the time series of skill statistics is only available for years 1 through 3 as shown. For player 3, the time series of skill statistics is available only for years 2 through 4 as shown.
Within each time series unit (in this case, each year), observed game outcomes between players are represented in the factor graph. For example, in year 1 a two player match between player 1 and player 2 has been observed where player 1 was the winner. This is represented in the factor graph using the nodes indicated by dotted line 1713. Player 1's skill belief statistics for year 1 have noise added at factor node in row 1705 and the resulting performance statistics are stored at variable node P11 in row 1706. This is also done for player 2 to give variable node P12 at row 1706. The difference between these performance values is calculated at the factor node in row 1707 and the difference value stored at variable node d in row 1708. The bottom nodes (row 1709) in the factor graph are factor nodes which represent a calculation process encouraging the performance difference to be greater than 0 or less than 0 depending on which player was observed to be the winner.
In the example of
It is also possible to represent games between teams of players. For example, in year 3 a game between a team comprising players 2 and 3 against player 1 is shown. In this case, the team performance is calculated as the sum of the player performances for the team and stored at variable node 1710.
The process of message passing comprises carrying out a calculation associated with a computation node (square node in
For each player, the processing schedule begins starting at the factor nodes 1712 from which skill distributions are obtained either from a database or are set to default values. This is represented in
A chain processing schedule is then iterated until the belief distributions stop changing, i.e. a suitable convergence criterion has been satisfied. An example chain schedule is indicated in
In some embodiments, the upward messages used in the post-processing phase are stored. Repeated updates are then made on the same game outcomes for the particular year. However, the saved upward messages are used to calculate new downward messages in order to effectively divide out the earlier upward message to avoid double counting. The process is iterated up and down the columns stemming from the particular year variable nodes until the skill statistic values remain substantially the same.
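The dividing-out step can be sketched with Gaussian messages held in exponential form as (τ, π) pairs, with π = 1/σ² and τ = μ/σ², where division subtracts parameters and multiplication adds them. The function names and the numbers are hypothetical; the recomputation of the upward message from the game outcome is not shown and is represented by a given value.

```python
def multiply(a, b):
    return (a[0] + b[0], a[1] + b[1])

def divide(a, b):
    return (a[0] - b[0], a[1] - b[1])

def revisit_game(marginal, stored_upward, recompute_upward):
    """Re-process one game outcome without counting its evidence twice."""
    downward = divide(marginal, stored_upward)   # belief with this game's earlier message removed
    new_upward = recompute_upward(downward)      # would rerun the game's factors; supplied by caller
    return multiply(downward, new_upward), new_upward

# Prior (3, 1) combined with an earlier upward message (1, 0.5) gives the marginal (4, 1.5).
marginal = multiply((3.0, 1.0), (1.0, 0.5))
new_marginal, saved = revisit_game(marginal, (1.0, 0.5), lambda down: (2.0, 1.0))
```

The new marginal contains the prior and the refreshed message only; the earlier message from the same game has been removed rather than counted a second time.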
The new player skills are stored whilst retaining a record of their previous values.
Processing then proceeds one more stage along the time series for each of players 1 and 2 by adding noise to reach variable nodes S12 and S22. The messages used to add the noise are also stored. Variable node S23 is also initialized as described above by reading statistics from a database or setting default values.
Computation follows down the columns stemming from the year 2 variable nodes and continues with pre-processing, chain processing and post processing as described above. Iteration up and down the columns may be carried out as described above until the player skill statistics for that year reach convergence. This gives new player skills S12, S22 and S32.
Again processing proceeds one more step along the time series for each of the players and then again within the columns stemming from the year 3 variable nodes. This process repeats until the end of each time series is reached.
Processing now makes its way backwards along each time series, again with processing within the columns stemming from each year at each time step in the series. However, it is necessary to avoid duplicating the effect of the previous steps in the processing chain. In order to do this, when the updates are made in the backwards direction along each time series, the effect of the previous forwards updates are removed. This is achieved using the records that were stored of the messages used and the variable values.
For example, in the backwards pass along each time series, the stored messages previously used in the forwards pass are used to calculate new messages for use in the backwards pass.
The procedure is iterated forward and backward along the time series of skill statistics for each player until the skill statistics do not change significantly. The backward passes make it possible to propagate information from the future into the past.
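A simplified sketch of the forward and backward passes for a single player is given below. It makes several assumptions that are not part of the description: the evidence from each year's games is summarized as a fixed Gaussian message in exponential form (τ, π), with (0, 0) meaning no games that year; skill drifts between years by Gaussian noise of variance tau2; and all names are hypothetical. Because the evidence is held fixed here, one forward and one backward pass suffice, whereas the described method recomputes the within-year messages and iterates until convergence.

```python
def multiply(a, b):
    return (a[0] + b[0], a[1] + b[1])

def drift(msg, tau2):
    # Pass a message through the noise factor linking consecutive years.
    tau, pi = msg
    if pi == 0.0:
        return (0.0, 0.0)             # an uninformative message stays uninformative
    mu, var = tau / pi, 1.0 / pi + tau2
    return (mu / var, 1.0 / var)

def smooth(prior, evidence, tau2):
    """prior and evidence[t] are (tau, pi) pairs; returns one (mean, variance) per year."""
    n = len(evidence)
    fwd = [prior] + [None] * (n - 1)
    for t in range(1, n):             # forward pass: information from the past
        fwd[t] = drift(multiply(fwd[t - 1], evidence[t - 1]), tau2)
    bwd = [None] * (n - 1) + [(0.0, 0.0)]
    for t in range(n - 2, -1, -1):    # backward pass: information from the future
        bwd[t] = drift(multiply(bwd[t + 1], evidence[t + 1]), tau2)
    result = []
    for t in range(n):
        tau, pi = multiply(multiply(fwd[t], evidence[t]), bwd[t])
        result.append((tau / pi, 1.0 / pi))
    return result

none = (0.0, 0.0)
# Prior N(3, 1); strong evidence for skill near 6 arrives only in year 3.
years = smooth((3.0, 1.0), [none, none, (24.0, 4.0)], tau2=0.1)
```

In this example the year 1 estimate is pulled above the prior mean of 3 and its variance falls below 1, even though no games were observed in year 1. A filtering-only method would leave the year 1 estimate at the prior.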
More detail about the processing stages is now given with reference to an example involving two player games. However, the methods described can equally be used for games with any number of teams having one or multiple players per team. A series of game outcomes between two players i and j in year t is denoted by $y_{ij}^t(k) \in \{+1,-1,0\}$, where $k \in \{1,\ldots,K_{ij}^t\}$ indexes the $K_{ij}^t$ game outcomes available for that pair of players in that year. Here y=+1 if player i wins, y=−1 if player j wins and y=0 in case of a draw.
The update process for game outcomes within a time series step (in this case a year) involves going through the game outcomes $y_{ij}^t$ within a year t several times until convergence. The update for a game outcome $y_{ij}^t(k)$ is performed as described above with reference to
thus effectively dividing out the earlier upward message to avoid double counting. The integral above may be evaluated since the messages as well as the marginals $p(s_i^t)$ have been assumed Gaussian. The new downward message serves as the effective prior belief on the performance $p_i^t(k)$. At convergence, the dependency of the inferred skills on the order of game outcomes vanishes.
The method also involves repeatedly smoothing forward and backward in time. During the first forward pass along the time series of each player the forward messages mf(s
This procedure is repeated forward and backward along the time series of skills until convergence. The backward passes make it possible to propagate information from the future into the past.
By making repeated updates on game outcomes within each time series step, the dependency of the inferred skills on the order of game outcomes is removed. This improves the accuracy of the skill estimates.
Also, by propagating information both forwards and backwards along the time series, the accuracy of skill estimates is further improved. Previous systems have not been able to propagate information backwards in time.
The general update equations for use in carrying out the computations in the message passing process are as described above with reference to
In some embodiments, average skill belief statistics over all players are determined for each time series interval. These average values may then be used to initialize skill belief statistics for previously unobserved players; that is, in place of the default statistic values mentioned above.
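One simple way to form such an initial belief is sketched below. Taking the arithmetic mean of the μ values and of the σ values is an assumption made for illustration, as are the function names and the fallback default of μ = 3 and σ = 1.

```python
from statistics import fmean

def average_belief(beliefs):
    """beliefs: list of (mu, sigma) pairs for all players observed in one interval."""
    return (fmean([m for m, _ in beliefs]), fmean([s for _, s in beliefs]))

def initial_belief(interval_beliefs, default=(3.0, 1.0)):
    # Fall back to the default statistics when nobody has been observed in the interval.
    return average_belief(interval_beliefs) if interval_beliefs else default

new_player = initial_belief([(2.0, 1.0), (4.0, 0.5)])
```

A player first seen in that interval would then start from the averaged belief rather than from the global default.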
Previous player skill estimation systems have assumed a fixed draw probability per game mode or type and have required an operator or developer to pre-configure this parameter, also referred to as a draw margin. However, it is recognized herein that the probability of draw may be positively correlated with playing skill and that it may vary considerably across individual players.
A player skill estimation process is now described which determines a draw margin parameter value for each individual player in addition to player skill estimates.
As shown in
Game outcome information is accessed (block 1802) and used to update the statistics using a Bayesian inference process (block 1803). The updated statistics are stored (blocks 1804 and 1805) and may be used in any suitable application such as a matchmaking process, a player ranking process or any other suitable application.
The Bayesian inference process is carried out using similar methods to those described above and may either involve using factor graphs and message passing or may use any suitable numerical or analytic methods.
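For example, suppose each player i at every time-step t is characterized by an unknown skill $s_i^t \in \mathbb{R}$ and a player-specific draw margin $\varepsilon_i^t > 0$. Performances $p_{ij}^t(k)$ and $p_{ji}^t(k)$ are drawn according to $p(p_{ij}^t(k) \mid s_i^t) = N(p_{ij}^t(k); s_i^t, \beta^2)$. In this model a game outcome $y_{ij}^t(k)$ between players i and j at time t is generated as follows:
$$y_{ij}^t(k) = \begin{cases} +1 & \text{if } p_{ij}^t(k) > p_{ji}^t(k) + \varepsilon_j^t \\ -1 & \text{if } p_{ji}^t(k) > p_{ij}^t(k) + \varepsilon_i^t \\ 0 & \text{if } -\varepsilon_i^t \le p_{ij}^t(k) - p_{ji}^t(k) \le \varepsilon_j^t \end{cases}$$
where y=+1 if player i wins, y=−1 if player j wins and y=0 in the case of a draw. A factorizing Gaussian distribution is assumed for the player-specific draw margins, $p(\varepsilon_i^0) = N(\varepsilon_i^0; \nu_0, \zeta_0^2)$, with a Gaussian drift of draw margins between time steps given by $p(\varepsilon_i^t \mid \varepsilon_i^{t-1}) = N(\varepsilon_i^t; \varepsilon_i^{t-1}, \zeta^2)$. A factorizing Gaussian prior $p(s_i^0) = N(s_i^0; \mu_0, \sigma_0^2)$ is assumed over skills, with a Gaussian drift of skills between time steps given by $p(s_i^t \mid s_i^{t-1}) = N(s_i^t; s_i^{t-1}, \tau^2)$.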
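The generative model can be simulated directly. The sketch below assumes the outcome rule in which a player wins only when their performance exceeds the opponent's performance by more than the opponent's draw margin, with any other result counted as a draw; the function names, the fixed random seed and the numerical values are hypothetical.

```python
import random

def outcome(p_i, p_j, eps_i, eps_j):
    """Return +1 if player i wins, -1 if player j wins, 0 for a draw."""
    if p_i > p_j + eps_j:      # i must overcome j's ability to force a draw
        return +1
    if p_j > p_i + eps_i:      # j must overcome i's ability to force a draw
        return -1
    return 0

def sample_game(s_i, s_j, eps_i, eps_j, beta, rng):
    # Performances are skills corrupted by Gaussian noise of standard deviation beta.
    p_i = rng.gauss(s_i, beta)
    p_j = rng.gauss(s_j, beta)
    return outcome(p_i, p_j, eps_i, eps_j)

rng = random.Random(0)
results = [sample_game(3.0, 3.0, eps_i=1.0, eps_j=0.3, beta=0.7, rng=rng)
           for _ in range(1000)]
```

Because the margins are player-specific, the rule is asymmetric: with equal skills, the player with the larger draw margin is harder to beat, so the opponent wins less often.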
In an example, the update process is carried out using factor graphs and message passing.
The factor graph of
Factor nodes 1901 are used to add noise to the skill statistics of the winner and loser to obtain performance statistics pW and pL as indicated. The difference between the performance of the winner and loser is forced to be greater than the draw margin of the loser plus a winning threshold uL of the loser. Message passing occurs in a similar manner to that described above with reference to
Factor nodes 2001 are used to add noise to the skill statistics of the players to obtain performance statistics pi and pj as indicated. The factor graph then encodes a situation where player i beats player j only if i's performance is bigger than j's performance by more than j's own ability at forcing a draw. Also, player j beats player i only if j's performance is bigger than i's performance by more than i's ability to force a draw. Message passing occurs in a similar manner to that described above with reference to
In some embodiments, average draw margin belief statistics over all players are determined for each time series interval. These average values may then be used to initialize draw margin belief statistics for previously unobserved players; that is, in place of the default statistic values mentioned above.
It is also possible to combine the processes described above whereby individual player draw margin statistics are determined with the processes described above whereby the update process enables both past and future player achievements to be taken into account. For example, inference may be carried out both within a given year t (or other time series interval) as well as across years (or other time series interval) in a forwards and backwards manner.
The computing-based device 2100 comprises one or more inputs 2102 which are of any suitable type for receiving media content, Internet Protocol (IP) input, information about outcomes of games between players, skill statistics and draw margin statistics. The device also comprises communication interface 2109 for interfacing with a game matchmaking system, game leader board system or other system requiring player skill estimates and/or player draw margin estimates.
Computing-based device 2100 also comprises one or more processors 2101 which may be microprocessors, controllers or any other suitable type of processors for processing computing executable instructions to control the operation of the device in order to estimate skill statistics of players and/or draw margin statistics of players. Platform software comprising an operating system 2106 or any other suitable platform software may be provided at the computing-based device to enable application software 2107 to be executed on the device.
The computer executable instructions may be provided using any computer-readable media, such as memory 2105. The memory is of any suitable type such as random access memory (RAM), a disk storage device of any type such as a magnetic or optical storage device, a hard disk drive, or a CD, DVD or other disc drive. Flash memory, EPROM or EEPROM may also be used. A data store 2108 may also be provided for storing skill statistics and/or draw margin statistics.
An output 2103 is also provided such as an audio and/or video output to a display system integral with or in communication with the computing-based device. The output 2103 may also provide skill statistics and/or draw margin statistics in any suitable form such as by writing to a file, disk or other removable storage medium. A display interface 2104 may be provided to enable a graphical user interface, or other display interface although this is not essential.
The term ‘computer’ is used herein to refer to any device with processing capability such that it can execute instructions. Those skilled in the art will realize that such processing capabilities are incorporated into many different devices and therefore the term ‘computer’ includes PCs, servers, mobile telephones, personal digital assistants and many other devices.
The methods described herein may be performed by software in machine readable form on a storage medium. The software can be suitable for execution on a parallel processor or a serial processor such that the method steps may be carried out in any suitable order, or simultaneously.
This acknowledges that software can be a valuable, separately tradable commodity. It is intended to encompass software, which runs on or controls “dumb” or standard hardware, to carry out the desired functions. It is also intended to encompass software which “describes” or defines the configuration of hardware, such as HDL (hardware description language) software, as is used for designing silicon chips, or for configuring universal programmable chips, to carry out desired functions.
Those skilled in the art will realize that storage devices utilized to store program instructions can be distributed across a network. For example, a remote computer may store an example of the process described as software. A local or terminal computer may access the remote computer and download a part or all of the software to run the program. Alternatively, the local computer may download pieces of the software as needed, or execute some software instructions at the local terminal and some at the remote computer (or computer network). Those skilled in the art will also realize that by utilizing conventional techniques known to those skilled in the art that all, or a portion of the software instructions may be carried out by a dedicated circuit, such as a DSP, programmable logic array, or the like.
Any range or device value given herein may be extended or altered without losing the effect sought, as will be apparent to the skilled person.
It will be understood that the benefits and advantages described above may relate to one embodiment or may relate to several embodiments. It will further be understood that reference to ‘an’ item refers to one or more of those items.
The steps of the methods described herein may be carried out in any suitable order, or simultaneously where appropriate. Additionally, individual blocks may be deleted from any of the methods without departing from the spirit and scope of the subject matter described herein. Aspects of any of the examples described above may be combined with aspects of any of the other examples described to form further examples without losing the effect sought.
It will be understood that the above description of a preferred embodiment is given by way of example only and that various modifications may be made by those skilled in the art. The above specification, examples and data provide a complete description of the structure and use of exemplary embodiments of the invention. Although various embodiments of the invention have been described above with a certain degree of particularity, or with reference to one or more individual embodiments, those skilled in the art could make numerous alterations to the disclosed embodiments without departing from the spirit or scope of this invention.