The present disclosure relates to positioning systems and in particular to improving accuracy of absolute and relative positioning between objects in a mobile environment.
Regrettably, more than 29,000 fatalities, 2.2 million injuries, and $100 billion in financial losses occur annually on United States roads alone. There is broad consensus among researchers and governments that those figures can be brought down by applying modern safety applications. Evolving technologies promise to make transportation safer than ever by embedding electronic safety features, powered by wireless sensing, in vehicles and roads. Location-based features are at the heart of this evolution. In particular, identifying the exact position of a moving vehicle (object) is key to developing most vehicular safety features and commercial Location-Based Applications (LBA).
Global Navigation Satellite Systems (GNSS) such as the Global Positioning System (GPS) have been used for positioning objects with reasonable accuracy. GPS typically provides less than 70 cm accuracy, and several studies have shown that GPS cannot be used efficiently in urban environments due to dilution of precision and the urban canyon phenomenon. Most safety applications require sub-centimeter accuracy with high reliability in urban and suburban environments.
Undoubtedly, GPS is the most popular GNSS technology, used for positioning objects up to a normal accuracy of a few meters by timing the transmitted signal along a line-of-sight (LoS) between the satellite and the mobile earth object. If no clear LoS is available between the satellite and the mobile object, ranging to that satellite becomes impossible. The popularity of GPS has led to increased interest in Location-Based Systems (LBS), where applications behave differently based on user position. Serious and high-end LBS such as safety and mission-critical applications cannot tolerate limited positioning accuracy, limited signal availability in urban environments, cloudy or bad weather, or the lack of integrity indicators.
Accurate GNSS positioning (error <30 cm) requires the availability of multiple satellite signals (5+), which is often impossible in urban and metropolitan areas. The transportation industry has an inevitable and imperative need to resolve the problem of inaccurate positioning in order to unlock the essential development of safety and automation applications. Unfortunately, GNSS and GPS systems exhibit the following inherent limitations:
a) Limited signal availability in urban environments, cloudy skies, or tunnels.
b) Insufficient accuracy to serve serious and high-end LBS systems like safety and mission-critical applications.
c) Loss of precision until a Time-To-First-Fix (TTFF) is available. The TTFF is the time required for the receiver to acquire the ephemeris as well as an almanac for all satellites, which contains coarse orbit and status information for each satellite in the constellation. Fix times are unacceptably long, and fixes may never be reached when attenuation levels exceed 30 dB, which is likely in an urban environment and in bad weather.
d) Limited redundancy since GPS has no alternative system to be used in its absence.
e) Satellite-based systems are too centralized and lack the desired localized control.
Accordingly, there is a need for an accurate positioning system and method that provides improved positioning accuracy in a mobile environment.
Further features and advantages of the present disclosure will become apparent from the following detailed description, taken in combination with the appended drawings, in which:
It will be noted that throughout the appended drawings, like features are identified by like reference numerals.
Embodiments are described below, by way of example only, with reference to
In accordance with an aspect of the present disclosure there is provided a method for cooperative stochastic positioning in a mobile environment executed by a processor in an object of interest, the method comprising: receiving position announcements transmitted wirelessly from a plurality of objects, the position announcements providing the current position data of the respective object in relation to a common coordinate system; discretizing the received position announcements to obtain a plurality of data groupings based upon a clustering criterion applied to the received position data; performing clustering of the data groupings to determine which cluster datasets from the data groupings provide sufficient and consistent position accuracy to determine a relative position of the object of interest; applying a stochastic automata model to a selected cluster dataset to evaluate relative cluster weights in order to determine the relative position of the object of interest; and updating the accuracy of a current position of the object of interest based upon the determined relative position.
In accordance with another aspect of the present disclosure there is provided a system for cooperative stochastic positioning in a mobile environment, the system comprising: a first receiver for receiving position announcements from a plurality of objects, the position announcements providing the current position data of the respective object in relation to a common coordinate system; a processor coupled to the first receiver; and a memory comprising instructions for execution by the processor, the instructions comprising: discretizing the received position announcements to obtain a plurality of data groupings based upon a clustering criterion applied to the received position data; performing clustering of the data groupings to determine which cluster datasets from the data groupings provide sufficient and consistent position accuracy to determine a relative position of an object of interest; applying a stochastic automata model to a selected cluster dataset to evaluate relative cluster weights in order to determine the relative position of the object of interest; and updating the accuracy of a current position of the object of interest based upon the determined relative position.
In accordance with yet another aspect of the present disclosure there is provided a computer readable memory providing instructions for performing cooperative stochastic positioning in a mobile environment, the instructions when executed by a processor in an object of interest performing the method comprising: receiving wirelessly position announcements transmitted from a plurality of objects, the position announcements providing the current position data of the respective object in relation to a common coordinate system; discretizing the received position announcements to obtain a plurality of data groupings based upon a clustering criterion applied to the received position data; performing clustering of the data groupings to determine which cluster datasets from the data groupings provide sufficient and consistent position accuracy to determine a relative position of the object of interest; applying a stochastic automata model to a selected cluster dataset to evaluate relative cluster weights in order to determine the relative position of the object of interest; and updating the accuracy of a current position of the object of interest based upon the determined relative position.
The present disclosure provides a method and system for accurate positioning in a mobile environment using positioning information obtained cooperatively from objects in the surrounding area. The disclosure is applicable to implementations of advanced vehicular safety applications, but is equally applicable to any environment where objects are mobile and position data is required. In response to the imminent need for higher accuracy and integrity, a simple stochastic approach is provided that can improve and enhance vehicular or object positioning.
Suppose in an environment, objects are moving freely in any direction in space. At any point the position or location of all objects, as described in
Assume that many objects are announcing, via a wireless technology, their positions to the best of their knowledge to be Λ={Λ1, Λ2, . . . , Λn}. Then assume that the Object-of-Interest (OOI), as shown in
This would be quite easy if the OOI could calculate the distance (distances between objects are typically called ranges) between itself and each of the objects announcing their positions. In other words, the set of ranges ρ={ρ1, ρ2, . . . , ρn} defines the range from the OOI to each of the announcing objects OX1 to OX7. Therefore, if Λ and ρ are both available at time (t) (timing accuracy ζ), a simple multi-lateration would be possible and simple Euclidean math would lead to the solution. Even if the distances between objects are long, corrections to the Euclidean math are known to fix those errors. For clarity, ρi is the projection of the range between the OOI and object OXi on the set of chosen coordinates and can be written as ρi=(ρxi, ρyi, ρzi). Now a typical mathematical approach can be simplified as follows:
Knowing the sets:
Λ={Λ1, Λ2, . . . , Λn} & ρ={ρ1, ρ2, . . . , ρn}; (1)
The exact coordinates of OOI (ΛOOI=(xOOI, yOOI, zOOI)) can be obtained as
xOOI=f(Λ,ρ) yOOI=f(Λ,ρ) zOOI=f(Λ,ρ) (2)
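Equation 2 leaves the function f unspecified. As a minimal sketch only, assuming a 2-D coordinate system and hypothetical announced positions, f can be realized by linearizing the range equations against the first announcing object and solving the resulting normal equations in a least-squares sense:

```python
import math

def multilaterate(anchors, ranges):
    """Estimate (x, y) by subtracting the range equation of the first
    anchor from each of the others, which yields a linear system, then
    solving the 2x2 normal equations (least squares)."""
    (x1, y1), r1 = anchors[0], ranges[0]
    A, b = [], []
    for (xi, yi), ri in zip(anchors[1:], ranges[1:]):
        A.append((2 * (x1 - xi), 2 * (y1 - yi)))
        b.append(ri**2 - r1**2 - xi**2 + x1**2 - yi**2 + y1**2)
    s11 = sum(a * a for a, _ in A)
    s12 = sum(a * c for a, c in A)
    s22 = sum(c * c for _, c in A)
    t1 = sum(row[0] * bi for row, bi in zip(A, b))
    t2 = sum(row[1] * bi for row, bi in zip(A, b))
    det = s11 * s22 - s12 * s12
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det

# hypothetical announcing objects and exact ranges to an OOI at (3, 4)
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
true_ooi = (3.0, 4.0)
ranges = [math.dist(a, true_ooi) for a in anchors]
x, y = multilaterate(anchors, ranges)
```

With noisy ranges the same normal equations give the least-squares fit rather than the exact point; this is a sketch of one possible f, not the disclosed stochastic method.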
Unfortunately, a set of inherent practical limitations lead to the following:
The stochastic solution for positioning problem has the following characteristics:
3) The method design strongly guards against attacks; at least (½n) objects must provide (consistent) malicious information before the OOI would show signs of lower accuracy. Even in that case, it is fairly simple to identify malicious objects and isolate them. In other words, it is extremely difficult to mislead the method by faking invalid position messages.
As illustrated, the system and method provide higher accuracy by merely utilizing available information. They may require fairly large memory storage, but have been shown to exhibit strong resilience to malicious attacks.
It would be ideal if the OOI received position announcements from all (n) objects at discrete time points. Unfortunately, this is impossible: objects are expected to announce their positions sporadically and randomly. Therefore, the method discretizes the available information at a time-varying frequency. Since discretizing available information involves extrapolation, it can affect accuracy. The effect on position accuracy can be accommodated by updating the δ values. The method must also decide on the best time to recalculate position.
Assuming the Last-Known-Accurate-Position (LKAP) is Λ0=(x0, y0, z0), and given the position expectation at the next point Λ1=(x1, y1, z1), the OOI would discretize available information if the time since Λ0 exceeds mζ, or if the estimated errors in Λ1 contradicting available information exceed the acceptable tolerance. This system and method provides ways to mitigate the timing granularity effect mζ and to decide when to re-evaluate the new position Λ.
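The two re-evaluation triggers just described can be sketched as a single decision function (Python; the names, the Euclidean error measure, and the example thresholds are illustrative assumptions, not part of the disclosure):

```python
def should_rediscretize(t_now, t_lkap, m, zeta, expected, observed, tol):
    """Discretize available information when either trigger fires:
    (a) the time since the Last-Known-Accurate-Position exceeds m*zeta, or
    (b) the expected position contradicts observations beyond tolerance."""
    elapsed = t_now - t_lkap
    error = sum((e - o) ** 2 for e, o in zip(expected, observed)) ** 0.5
    return elapsed > m * zeta or error > tol
```

For example, with m=5 and ζ=0.1 s, an elapsed time of 1.0 s triggers re-evaluation even when the position error is within tolerance.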
Channel Propagation Model
Assume a wireless signal has N resolvable propagation paths between the transmitter and receiver. Since each reflector propagates multiple signals, and there are potentially multiple reflected signals, the strongest reflected signal is selected. The multipath signal parameters used are: Angle of Departure (AoD, φi), Angle of Arrival (AoA, θi), and Delay of Arrival (DoA, τi). All of (φi, θi, τi) can be measured using any available technique with respect to a common bearing direction. Then, as illustrated in
From
Equation 3 applies for i=1, . . . , N. Then, assuming c is the corrected propagation speed, the Time Difference of Arrival (TDoA) can be determined by:
where ρi=ρ′i+ρ″i, and:
ρ′i=√((xi−xo)2+(yi−yo)2)
ρ″i=√((xi−xs)2+(yi−ys)2) (5)
Since the unknown position (xo, yo) needs to be obtained from the known position (xΛ, yΛ), given the uncertainty in (φ̂i, θ̂i, τ̂i), the expected statistical error in measuring (φi, θi, τi) is applied such that:

τ̂i=τi(xo,yo,xi,yi)+nτi

θ̂i=θi(xo,yo,xi,yi)+nθi

φ̂i=φi(xo,yo,xi,yi)+nφi

where (nφi, nθi, nτi) are the respective measurement noise terms.
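Under the model of equations 3 to 5, the delay of each resolvable path can be simulated directly. The sketch below (Python; the geometry, the noise variance, and the uncorrected value of c are hypothetical assumptions) forms the noisy measurements τ̂i and the resulting TDoA relative to the first path:

```python
import math
import random

random.seed(1)
c = 299_792_458.0  # propagation speed in m/s (assumed; the text uses a corrected c)

def path_delay(src, scatterer, rcv):
    """Delay tau_i of one resolvable path: source -> scatterer -> receiver,
    with path length rho_i = rho'_i + rho''_i as in equation 5."""
    rho_pp = math.dist(scatterer, src)   # rho''_i: scatterer to source
    rho_p = math.dist(scatterer, rcv)    # rho'_i:  scatterer to receiver
    return (rho_p + rho_pp) / c

# hypothetical geometry: transmitter, receiver, and two reflectors
src, rcv = (0.0, 0.0), (100.0, 0.0)
scatterers = [(50.0, 30.0), (20.0, -40.0)]
taus = [path_delay(src, s, rcv) for s in scatterers]
# measured delays tau-hat_i include the statistical noise term n_tau
noisy = [t + random.gauss(0.0, 1e-9) for t in taus]
# TDoA taken relative to the first resolvable path
tdoa = [t - noisy[0] for t in noisy]
```

Every reflected path is necessarily longer than the 100 m direct line between source and receiver, so each τi exceeds the direct-path delay.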
Therefore, when the number of paths N≧3, there are (3N−1) measurements and (2N+2) unknown parameters. This yields a non-linear estimation problem that can be solved using machine learning or stochastic learning automata. Using stochastic learning automata, all nodes cooperate to arrive at better relative ranging. Therefore, the collective behavior of the selected set of nodes, or cluster, is guaranteed to converge, since the accumulation of relative distances will naturally tend to marginalize low-accuracy ranging in favor of the multiplicity of better-accuracy ranging. Further, since the boundaries of the selected set of nodes (a cluster of nodes) are finite, the problem lends itself to deterministic finite automata. In the following section the formulation of the automata model is disclosed.
The approach used in this section defines a way to handle the wireless channel propagation model that is efficient for many wireless technologies, but remains isolated from the stochastic method. Manufacturers might use alternate approaches to calculate the range between the OOI and OX. The method illustrated here shows the use of non-Cartesian coordinates with the stochastic approach. Alternative approaches can be used; for instance, an infrared ranging method may be combined with the disclosed stochastic approach, which utilizes the calculated ranges but is not tied to the ranging method itself.
Clustering
A soft classifier is utilized to split the set of input values Λ into competing sets or clusters. Each such set would exhibit closer consistency in its parameters. In other words, when the elements of a set Λa are combined together, they yield a crisp value for ΛOOI that has the following features:
The method commences with the selection of a clustering criterion (502) such as least-mean-square (LMS), arrival delay, data confidence, or any other grouping criterion inferable from the received position data. The clustering criterion may also be associated with data previously received from an object to determine the accuracy of the data. In such a case, a confidence interval may be assigned to position information or other characteristics to define the accuracy of the data. Depending on the configuration of the system, the criterion may be fixed, for example during startup of the system, or multiple criterion options may be provided that are selected when a particular clustering criterion does not produce accurate results and the method is reinitiated. The cluster datasets are selected (504) from the data groupings based upon the chosen criterion. The selected criterion is then used to minimize error in cluster accuracy (506) by switching one data position within the cluster at a time. This can be performed, for example, by letting the given data set consist of N data points, denoted by the vectors:
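Steps (504) and (506) can be sketched as a greedy refinement in which one data position is switched at a time until the squared error no longer improves (Python; the data and the hard-assignment squared-error criterion are illustrative stand-ins, and the AFCA's soft membership is not reproduced):

```python
def refine_clusters(points, labels, R):
    """Greedy refinement of an initial grouping (504): repeatedly switch
    one data position at a time to the cluster with the nearest centroid,
    until no switch reduces the squared error (506, LMS-style criterion)."""
    def centroids():
        cs = []
        for r in range(R):
            members = [p for p, l in zip(points, labels) if l == r]
            cs.append(tuple(sum(v) / len(members) for v in zip(*members)))
        return cs
    changed = True
    while changed:
        changed = False
        cs = centroids()
        for i, p in enumerate(points):
            d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in cs]
            best = d2.index(min(d2))
            if best != labels[i]:
                labels[i] = best
                changed = True
    return labels

# hypothetical position data with a deliberately poor initial grouping
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels = refine_clusters(points, [0, 1, 0, 1], R=2)
```

The sketch assumes no cluster empties during refinement; criterion assessment and tightening (508) to (512) are not reproduced here.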
where i=1, 2, . . . , N. The AFCA classifies the data into R disjoint groups θ1, θ2, . . . , θR, according to a certain selected criterion. To elaborate this point, assuming the center of gravity of the total vectors is at the origin, equation 3 can be converted to:
Otherwise, a transformation from Xi to
where the superscript t indicates a transposed matrix, and the normalized intra-group scatter for class θj is:
Where Nj is the number of vectors in group θj, and:
Therefore, the average normalized intra-group scatter (508) for the total data set θ is:
Where pj=Nj/N and the normalized intra-group scatter (510) is given by:
Therefore, the following matrix identity is defined:
T=Γ+B (14)
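The identity of equation 14 can be verified numerically. The following sketch (Python; hypothetical 2-D data split into R=2 groups, with pure-Python lists standing in for matrices and `G` standing in for Γ) builds the total, intra-group, and between-group scatter matrices as defined above and confirms T=Γ+B:

```python
def outer(u, v):
    return [[a * b for b in v] for a in u]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mean(vs):
    return [sum(col) / len(vs) for col in zip(*vs)]

def scatter(vs, m):
    """Normalized scatter matrix (1/|vs|) * sum (x - m)(x - m)^t."""
    d = len(m)
    S = [[0.0] * d for _ in range(d)]
    for x in vs:
        diff = [a - b for a, b in zip(x, m)]
        S = mat_add(S, outer(diff, diff))
    return [[e / len(vs) for e in row] for row in S]

# hypothetical 2-D data split into R=2 groups
groups = [[(0.0, 0.0), (2.0, 0.0)], [(8.0, 4.0), (10.0, 4.0), (9.0, 7.0)]]
data = [x for g in groups for x in g]
gm = mean(data)                            # grand mean
T = scatter(data, gm)                      # total scatter
G = [[0.0, 0.0], [0.0, 0.0]]               # intra-group scatter (Gamma)
B = [[0.0, 0.0], [0.0, 0.0]]               # between-group scatter
for g in groups:
    p, m = len(g) / len(data), mean(g)
    Sg = scatter(g, m)
    G = mat_add(G, [[p * e for e in r] for r in Sg])
    diff = [a - b for a, b in zip(m, gm)]
    B = mat_add(B, [[p * e for e in r] for r in outer(diff, diff)])
```

With pj=Nj/N as the weights, the intra- and between-group contributions add up exactly to the total scatter, as equation 14 states.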
Since different data criteria can be expressed in terms of the matrices T, Γ, and B, a chosen criterion to minimize is:
where ∥ . . . ∥ is the Euclidean norm and tr Γ is the trace of Γ. It is noted that S0 is only invariant under an orthogonal transformation of the data space, and the AFCA method based on S0 is most appropriate for fairly concentrated clusters. Two clustering criteria which are invariant under any linear non-singular transformation are to maximize:
where λk, k=1, 2, . . . , p are the eigenvalues of the matrix Γ−1B. However, an AFCA clustering algorithm based on these three criteria cannot have reasonable performance and computational complexity simultaneously. It is easy to show that under a normalizing transformation of the data space, any alternative criterion is suboptimal, i.e., when
X→Z=AX (18)
Where A is a linear non-singular transformation matrix such that
ATAt=I (19)
The elaborated criterion demonstrates that S′0, which is the criterion S0 after the normalizing transformation of equation 18 is applied, has the advantages of both optimum performance and low computational complexity compared to the criteria S0, S1, and S2.
For presentational purposes only, the least-mean-square (LMS) criterion is used. The main advantages of this criterion are that it is invariant under any linear non-singular transformation and that classification can be done by either a linear or a generalized linear machine. Yet it is important to indicate that the AFCA may use the alternative S′0 criterion, especially when clusters are not close to equiprobable, as is the case with the short-range signal. Now let:
Φ(X)=[Φ1(X),Φ2 (X), . . . ,Φq(X)]t (20)
where Φi(X), i=1, 2, . . . , q, are linearly independent, real, single-valued and continuous functions of the components of X. The corresponding Euclidean space of Φ(X) is Eq. The LMS criterion is to minimize the quantity:
Where M is an (R−1)×q weight matrix, αj, j=1, 2, . . . R, are the reference points with the properties that:
and Ej is the expectation taken with respect to the probability distribution of the X's in group θj. The selected criterion can then be assessed (512) to determine whether it should be tightened or loosened, or whether an additional clustering criterion should be selected, with the method restarting at (502) to select new clustering criteria (204) in
The following sections describe the core stochastic method for calculating Λ.
Formulation of Automata Model
A classical deterministic finite automata problem is summarized first, and then the stochastic automaton considered in mobile environment situations is formulated. Alternative approaches can be used to calculate the range or to classify the calculated ranging information; yet the stochastic approach defined in this section and the following sections applies to how the high-accuracy position of the OOI is calculated. A zero sum is desired, where all cooperative nodes (or participants) continuously reinforce the lower square law, with the next round of readings restarting the competition. Therefore, there must be a way to stop the competition, or to indicate the point of departure.
The finite deterministic automaton is defined by a quintuple {Λ, ρ, φ, g, h}, where Λ is the set of position inputs, ρ is the set of estimated ranges, and φ is the set of confidence values, where the higher the confidence the better. The set φ is treated as a discrete set of states by enforcing disconnected steps of 0.10, and therefore maintains a finite state. Corresponding lowercase letters denote members of the defined sets (e.g., λi, ρi, and φα). Now g is the output function, and therefore ρ(t)=g[φ(t)]. Similarly, the next-state function is φ(t+1)=h[φ(t), λ(t+1)], which is the canonical equation for a finite automaton. The input set is restricted to two values, 0 and 1, respectively called non-penalty and penalty, where the penalty applies if the position exceeds a certain threshold from the expected position. Therefore, λ1=0 and λ2=1. Finally, the random medium can be described by C=C(p1, p2, . . . , pk), where:
pα=Pr[λ(t+1)=λ2|ρ(t)=ρα] (25)
since the input to the automaton at instant (t+1) is the output of the medium at instant t. When the pα are constant, a stationary random medium is defined. The state function h determines the learning behavior of the automaton. With a suitable choice of h, the average penalty of the automaton decreases with time to an asymptotic value.
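A toy sketch of the discretized state set φ and the next-state function h described above (Python; the output mapping g from confidence to ranges is omitted, and the one-step-per-input behavior is an illustrative assumption):

```python
class ConfidenceAutomaton:
    """Toy deterministic automaton over the discretized confidence set
    phi = {0.0, 0.1, ..., 1.0}; steps of 0.10 keep the state set finite.
    The next-state function h moves one step down on penalty (lambda=1)
    and one step up on non-penalty (lambda=0)."""
    def __init__(self, phi=0.5):
        self.phi = round(phi, 1)

    def step(self, lam):
        delta = -0.1 if lam == 1 else 0.1
        self.phi = min(1.0, max(0.0, round(self.phi + delta, 1)))
        return self.phi

a = ConfidenceAutomaton(0.5)
states = [a.step(lam) for lam in (1, 1, 0, 1)]   # penalty, penalty, win, penalty
```

Rounding after each step keeps the state on the 0.10 grid, so the state set stays finite as the text requires.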
The stochastic automaton is defined by the sextuple {Λ, ρ, φ, g, Π, T}. As before, Λ is the set of two inputs (1 for penalty and 0 for non-penalty), ρ is the set of estimated ranges, φ is the set of states, and g is the output function ρ(t)=g[φ(t)], a one-to-one deterministic mapping from the state set to the output set. Since the number of outputs equals the number of states, the vector Π(t)=(Π1(t), . . . , Πk(t)) is the state probability vector controlling the choice of the state, and hence the output, at instant t. Therefore, the state φi would be chosen at instant t with probability:
Finally, T defines the reinforcement scheme which drives Π(t+1) from Π(t). This can be simplified as:
Π(t+1)=T[λ(t),φ(t)]Π(t) (27)
where T is not an explicit function of t, but t is the discrete-time parameter. Depending on the previous action and the environmental response to it, the reinforcement scheme changes the probabilities controlling the choice of the state at the next round. With a suitably designed reinforcement scheme, the average penalty tends to decrease with time. This guarantees convergence with relative ranging in a stationary state. However, in a dynamic environment, the time available may not be enough to reach a sufficiently low average penalty. This can be mitigated by decreasing the time granularity to increase the number of rounds; the computational cost is minimal, and this is considered a good extension of this research.
In order to perform the linear and nonlinear reinforcement, the effect of environmental mobility on the performance of the proposed stochastic automaton must be considered. This closely resembles the operation of an adaptive controller in a mobile environment. The stochastic state probabilities are continuously altered according to a reinforcement scheme in response to penalties received from the environment. The automaton adapts by reducing the average penalty. Periodic perturbations of penalty strengths are used as test signals to derive analytic expressions describing the tracking behavior of the automaton operating under a linear reinforcement scheme. It can be proven that the parameters that determine the ability of a stochastic automaton to track the perceived mobility of the environment are the Eigen-values of the transition matrix R(α, β), as illustrated later.
Therefore, changes in the environment are reflected as perturbations in the asymptotic state probability values of the automaton. In this manner, the automaton is said to track the environment. However, when the perturbation frequency (ε) increases to such an extent that the automaton starts responding to the average value of the perturbed parameter, the automaton loses the ability to track. Hence, it is important to express the upper limit of the perturbation frequency (εu) which keeps the automaton tracking the environment mobility. These limits are expressed as functions of the Eigen-values of the transition matrix R(α, β). The Eigen-values also determine the correlation between any two state probability values separated by a number of transition intervals. The smaller the correlation, the higher the upper limit (εu). For this purpose, only the square law reinforcement scheme is discussed, which can be defined as follows. If φ(t)=φi, in other words if ρ(t)=ρi, then:
Using this reinforcement, the steady-state performance of the automaton can be summarized as follows:
Also, the vector Π(i), defined by Πi(i)=1 and Πj(i)=0 for j≠i, would control the choice of states at the steady state. Therefore, the automaton actually chooses the state corresponding to the lowest penalty probability. This is promoted as the best possible performance; in this case, the automaton is said to perform optimally, given that the medium is assumed stationary at each value of t. If, however, all pi>½, a stable state probability vector defines the choice of states asymptotically. Then, by reapplying Eigen-values on the transition matrix as before, and following the same approach, the distribution is given to a great degree of accuracy by Π*, where:
Equation 31 is valid when the number of states is two. From the same equation it can be derived that the state corresponding to a lower penalty probability has a higher probability of being chosen. This performance is referred to as expedient. The case when more than one pi<½ leads to a situation where suboptimal performance is possible and might converge; it can only happen if a major malicious attack has been attempted successfully on too many objects. However, it is highly unlikely that so many objects can be led to invalid pi unless the entire system stops for a period much longer than (εu). Since this situation is practically unachievable, Π* will always converge.
Exploration of a Scenario
To run through a case showing how the competition process works, some definitions are introduced. The penalty of the preceding round is called a unit-loss, while the non-penalty is called a unit-win. The ith output is identified and associated with the ith estimated range; therefore, pi is the probability of a unit-loss based on range ρi, and qi=1−pi is the probability of a unit-win for range ρi. In that sense, the winning node defines the correct absolute coordinates. Further, winning ranges define the second-best set of relatively well-positioned nodes. Therefore, relatively well-positioned nodes can adjust their position estimates based on the winning node's absolute coordinates and winning ranges by using multi-lateration. Finally, badly positioned nodes are those nodes with subsequent unit-losses. They can fix their coordinates based on winning ranges from either the winning node or the well-positioned nodes.
Let us consider two automata, Σ1 and Σ2, taking part in the competition process. The input λ(j)(t) of the automaton Σj has two values: λ1(j)=0, i.e., a win, and λ2(j)=1, i.e., a loss. If rj is the number of states for Σj then, for the stochastic automata defined in the previous section, the number of ranges is also rj (i.e., the ranges are ρ1(j), ρ2(j), . . . , ρrj(j)). Hence, the set ρ(t) at time t is {ρ(1)(t), ρ(2)(t)}, and the outcome of the round at time t is the set Λ(t+1)={λ(1)(t+1), λ(2)(t+1)}. A competition Γ with automata Σ1 and Σ2 is defined if, for all sets ρ(t), the probabilities P[ρ(t), Λ(t+1)] of its outcomes Λ(t+1) are given. In the usual meaning of competition theory, the payoff functions Mj(F), j=1, 2, defining a competition Γ*, denote the expectation of winnings for the jth node, the set of ranges being denoted by F, and are related to P(F, Λ) in the following way:
In a zero sum situation:
If at time t both automata Σ1 and Σ2 are in states φα(1) and φβ(2), the system is in state (α, β). If R(α, β) represents the final probability of the system being in state (α, β), then Wj, the expected value of the winnings of Σj, is given by:
This can be called the value of the two-node competition; a zero sum. The zero sums, for any round where F=(ρ(1), ρ(2)), are:
P(f(1),f(2);1,1)=P(f(1),f(2);0,0)=0 (35)
But if pij defines the probability that Σ1 loses when the range set (ρi(1), ρj(2)) is employed, and qij=1−pij the probability that Σ1 wins for the same range set, then, for this set of ranges, the expectation of winning gij for Σ1 is:
gij=qij−pij (36)
And hence, the rectangular matrix [gij], i=1, . . . , r1; j=1, . . . , r2, is the competition matrix.
Consider the competition between Σ1 and another cluster of nodes that has a fixed range mix (note that the second cluster resembles a set of stationary roadside units). If the second cluster employs another learning automaton, its fixed range mix would vary slightly over time as its reinforcement scheme alters the state probabilities. Also note that the alteration would reverse and then keep oscillating; the reason is that the technique enforces discrete disconnected steps on φ to maintain the finite state of the system. Let it use the jth range with probability:
The mj do not depend upon the range chosen by Σ1. If at instant t the automaton Σ1 chooses ρi(1), then the probability qi of a win by Σ1 and pi of a loss by Σ1 are given by:
Hence, the expected winning for the ith range can be written as:
gi=qi−pi=1−2pi (39)
which is identical to Σ1 being placed in the random medium C=C(p1, . . . , pr1).
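Equations 37 to 40 can be sketched directly (Python; the loss probabilities pij and the fixed mix mj below are hypothetical values chosen for illustration):

```python
def expected_winnings(p, m):
    """Against a fixed range mix m_j (eq. 37), the unit-loss probability
    of the ith range is p_i = sum_j m_j * p_ij (eq. 38), and the expected
    winning is g_i = q_i - p_i = 1 - 2*p_i (eq. 39)."""
    return [1.0 - 2.0 * sum(mj * pij for mj, pij in zip(m, row)) for row in p]

# hypothetical unit-loss probabilities p_ij and an equal fixed mix
p = [[0.1, 0.3],
     [0.6, 0.4]]
m = [0.5, 0.5]
g = expected_winnings(p, m)
W = max(g)   # asymptotic expected winning, as in equation 40
```

Here the first range wins in expectation (g1=0.6) while the second breaks even, so the asymptotic expected winning is W=0.6.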
Since Σ1 is a finite deterministic automaton, linear resolutions may apply and can be easily derived, and the following result can be shown to be valid. This automaton has r1 valid ranges corresponding to r1 possible branchings of the state transition diagram, with n states in each branch. If one or more gi≧0, W represents the asymptotic expected winning of Σ1 when n→∞, as illustrated here:
W=max(g1, . . . , gr1) (40)
If all other nodes obtained optimal ranging estimates, then the value W would be
In such a case, the competition process is deterministic and falls into the same situation as fixed nodes. The cluster of nodes would keep oscillating due to the technique that enforces discrete disconnected steps, which are imposed to maintain the finite state of the system. But that does not hurt optimality, as the competition oscillates around the optimum within the small margin selected to convert φ to a discrete set. The smaller the steps chosen for φ, the closer (but slower) the convergence to optimality; similarly, the larger the steps, the wider (but faster) the oscillation around optimality. One can envision a mixed approach with varying steps, but communicating and synchronizing the change in φ steps would burden the system in a practical sense, and would slow the optimization anyway.
Alternatively, if gi is negative, then:
In other words, W would equal the harmonic mean of the gi, and the performance would not be optimal in the sense of competition theory. But equation 41 is valid consistently only when just one gi≧0. When more than one gi≧0 exists, the probability of suboptimal performance increases consistently; hence, equation 41 will never be reached, which counters the non-optimal argument.
Let us now compare this deterministic automaton's performance with that of the stochastic automaton described earlier. Only the square law reinforcement following the reinforcement scheme described is considered. Let Σ1 have r1 states and r1 ranges. In state φi, Σ1 uses the pure range ρi. On the occurrence of this range, the competition environment (i.e., the other nodes) uses the jth range with probability mj. By following the same arguments given previously, one obtains from the competition matrix G the unit-loss probability pi and the unit-win probability qi. The square law automaton performs optimally if any pi<½, in other words, if Π*i=1 and Π*j=0 for j≠i, the asterisk characterizing the final values. The condition pi<½ means that gi>0. Under such a condition:
W=max(g1, . . . , gr1)
just as described before in equation 40. Therefore, this stochastic finite state automaton has a performance equivalent to the deterministic automaton only when the latter has an infinitely large number of states. Again, if the mj are such that they constitute an optimal ranging (using the same multi-lateration approach), W becomes the winning value of the competition as in equation 41.
When the condition gi>0 is not satisfied, both the deterministic and stochastic automata play for the harmonic mean of the gi. This follows from equation 41 for the case of the stochastic automaton. In this case, W ends up between the upper and lower values of the competition.
Next, two-automata zero sum competitions can be considered, which can prove the result for deterministic automata of somewhat different construction from the linear one and with an infinitely large number of states. The statement here is modified to exclude the possibility of sub-optimality, which may or may not be taken into account. Further proofs of sub-optimality in this type of deterministic automata have been demonstrated by computer experiments and simulations. Sub-optimality can be measured by checking if the following conditions apply:
In exploring competitions between two stochastic automata, Σ1 and Σ2, each with square law reinforcement, let the state probabilities be denoted by Πi(1), i=1, . . . , r1, for Σ1 and Πj(2), j=1, . . . , r2, for Σ2. Final values are indicated by an asterisk. If the probability of a unit-loss for Σα with range ρi(α) is denoted by pi(α), then:
Where pij=Pr (unit win for Σ1 with F={ρi(1),ρj(2))}). And similarly:
Like before, the competition matrix G=[gij] can be used to obtain:
From equations 45 to 47
Assume that there is a final stochastically stable condition and that the automata have settled in that condition. To determine under what condition pi(1)*<½, which is the condition for optimality in the square law nonlinear reinforcement, an equivalent condition is obtained from equation 48:
Furthermore, it is required that Σ1 be optimal for all values of Π(2)*, i.e., whatever ranges are chosen by Σ2. Then equation 49 becomes:
gij>0, j=1, 2, . . . , r2 (50)
Once equation 50 is satisfied, it automatically results in Π(1)* having the ith component unity and the others zero. That is to say, Σ1 is optimal irrespective of the ranges selected by Σ2. Equation 50 is equivalent to the existence of an all-positive row in the matrix [gij]. Since [gij] corresponds to the winnings of Σ1, modifying equation 50 in order to make Σ2 optimal results in the condition:
gij<0, i=1, 2, . . . , r1 (51)
This is equivalent to a column with all negative elements. Further, suppose that equation 51 is satisfied for some value of j. Then equation 50 cannot be satisfied, i.e., both Σ1 and Σ2 cannot be simultaneously asymptotically optimal. Now if Σ1 is optimal, then Σ2 has a final state probability distribution given to a great degree of accuracy by:
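The all-positive-row condition of equation 50 and the all-negative-column condition of equation 51 can be checked directly on a competition matrix (Python; the pij values are hypothetical):

```python
def competition_matrix(p):
    """g_ij = q_ij - p_ij = 1 - 2*p_ij from the unit-loss probabilities."""
    return [[1.0 - 2.0 * pij for pij in row] for row in p]

def sigma1_optimal(g):
    """Equation 50: Sigma_1 can be asymptotically optimal iff some row
    of [g_ij] is all positive (one range beats every reply)."""
    return any(all(e > 0 for e in row) for row in g)

def sigma2_optimal(g):
    """Equation 51: Sigma_2 requires an all-negative column."""
    return any(all(row[j] < 0 for row in g) for j in range(len(g[0])))

# hypothetical unit-loss probabilities for Sigma_1's ranges
p = [[0.2, 0.4, 0.3],
     [0.6, 0.7, 0.9],
     [0.5, 0.8, 0.4]]
g = competition_matrix(p)
```

In this example the first row of [gij] is all positive, so Σ1 can be optimal, while no all-negative column exists for Σ2, consistent with the statement that both automata cannot be simultaneously asymptotically optimal.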
Then, by reapplying the optimality condition of Σ1 from equation 46:
Therefore, equation 52 can be rewritten as:
Equation 54 can be used to derive the average winnings of Σ1 as follows:
Now interchanging the two automata Σ1 and Σ2, one can easily establish that:
if gij<0, i=1, 2, . . . , r1 (56)
A view on how the entire process is performed is elaborated in
Specific embodiments have been shown and described herein. However, modifications and variations may occur to those skilled in the art. All such modifications and variations are believed to be within the scope and sphere of the present disclosure.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/CA10/01165 | 7/27/2010 | WO | 00 | 3/27/2013 |