The present invention relates to a method and a program for detecting a change-point of time-series data, and a method and a program for predicting a probability density distribution of future time-series data values.
Various forms of time-series data are used today in a variety of fields. For example, fluctuations in foreign exchange are used in international finance and stock prices and the like are used in securities trading as time-series data. In addition, the management or the like of yield ratio trends is performed in, for example, semiconductor manufacturing and other manufacturing industries.
Such time-series data are generally used to detect a change-point in a trend and to take action and counter-measures in accordance with the change. For example, in transactions in yen/dollar currency exchange, there is a need to detect trends in price movements, in particular the point at which a trend has a major change, to appropriately correct trade policies. With such time-series data, the estimation of long-term trends is done using the moving average of past time-series data. A familiar example is yen/dollar trading, where exchange charts display moving averages over a 25-day period, a 13-week period, or the like. A technique for applying a random walk focusing on the moving average is broadly used in analyzing long-term trends of such time-series data.
However, in currency trading and the like, there are cases where sharp price movements occur over the short term due to an event such as a concentration of buy orders or sell orders linked to price movements. In such a case, it is difficult to detect such steep changes in trend with only a random walk that focuses merely on the moving average of past data. In other words, because a moving average is an average over plural past data, it is impossible to perceive steep changes in trends when the most recent data are few compared to the parent set, even when applying a random walk and even though the most recent data show a major fluctuation.
To solve the problem described above, the present inventors have proposed the Potentials of Unbalanced Complex Kinetics (PUCK) model for detecting steep changes in time-series data (Non-patent Document 1). In the PUCK model, changes in time-series data in finance or other fields can be represented by the following Equation (1) with three elements: the true market value P(t) at a time (t) (the “noise-reduced price”), the observation error of the market price fY(t) at the time (t) (the “observation noise”), and the observed market price Y(t) at time (t) (“best-bid,” “best-ask,” “mid-quote data,” and the like).
[Equation 1]
Y(t)=P(t)+ƒY(t) (1)
A “tick” refers to an event where a trading market experiences a fluctuation in the price of a good to be traded due to a contract being approved. Accordingly, a single-tick period signifies the time interval between the timing of a contract at a certain point in time and the timing of the preceding contract. The number of ticks signifies the number of contracts approved within a certain time period, i.e., the number of events that occur resulting in a fluctuation in the price of the good to be traded. Below, “(t)” represents a point in time counted by the number of ticks, or represents real time. The PUCK model deals in fluctuations of the true market price P(t) after the reverse correlation of the displacement of the scale of the number of ticks has been removed. A fluctuation in the true market price during a single tick can be represented by the following Equation (2).
where the first term on the right-hand side of the Equation (2) represents the degree of contribution of the potential acting on a fluctuation in the true market price during a single tick, and fP(t) in the second term on the right-hand side of the Equation (2) represents a fluctuation error causing the true market price at time (t) to fluctuate. Further, b(t) is a potential coefficient at time (t), M is the number of most recent data needed to estimate the core of the price fluctuation, and PM(t) is a core price of the price fluctuation estimated using the number M of most recent true market prices. The potential coefficient b(t) at time (t) in the Equation (2) is empirically known to have a dependency relative to the number M of most recent data, and decreases in proportion to {1/(M−1)}. Accordingly, in the Equation (2), the potential coefficient b(t) at time (t) is multiplied by {1/(M−1)}, to avoid the dependency of the potential coefficient b(t) at time (t) on the number M of most recent data.
The core price PM(t) is represented by the following Equation (3).
When the Equation (2) is considered to be a random walk in a field where a linear central force that changes at each moment is working, the potential coefficient b(t) at time (t) is perceived as a secondary and higher-order potential coefficient, and a potential W(q, t) on the degree of contribution of the potential represented by the first term on the right-hand side of the Equation (2) can be expressed by the following Equation (4).
where i is the order of the potential.
The absolute value of the average rate of the price fluctuation is proportional to the magnitude of the slope of the tangent of the potential function. The expected direction of the price fluctuation is the direction down the slope. Thus, to understand where on the potential function the current market price lies, it is necessary to calculate the divergence between the core price and the true market price.
Further, when the Equation (4) is used, the Equation (2) can be rewritten with the following Equation (5).
As described above, the degree of contribution of the potential W(q, t) in the Equations (4) and (5) is determined by the value of the potential coefficient bi(t) at time (t). For example, when the potential coefficient bi(t)=0 in each order at time (t), then P(t+1)−P(t)=fP(t), and only the fluctuation error component contributes to the price difference. On the other hand, as the absolute value of the potential coefficient bi(t) in each order at time (t) increases, the contribution of the potential W(q, t) increases. As a result, it is known that a diffusion that is faster or slower than the diffusion coefficient calculated by the random walk model can be quantitatively estimated as a function of the potential coefficient bi(t) in each order at time (t) (below, this is also referred to as an anomalous diffusion). In particular, when the secondary potential coefficient b2(t)≦−2, the fluctuation of the market price is known to have already deviated and spread from the moving average of the past data.
The PUCK model is in this manner capable of expressing a short-term anomalous diffusion as described above by considering the secondary and higher-order potential term. It is thereby possible to detect an anomalous diffusion, such detection being unattainable by past techniques using only a random walk.
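The effect of the potential coefficient on diffusion can be illustrated numerically. The following is a minimal sketch and not the inventors' implementation. It assumes a quadratic potential only, a single-tick update of the form P(t+1)−P(t) = −{b/(M−1)}·{P(t)−PM(t)} + fP(t), a core price PM(t) equal to the simple moving average of the M most recent true prices, and Gaussian fluctuation errors. The sign convention and the names core_price, simulate_puck and mean_squared_displacement are assumptions made for illustration.

```python
import random


def core_price(prices, M):
    """Core price P_M(t): simple moving average of the M most recent prices (assumed form)."""
    window = prices[-M:]
    return sum(window) / len(window)


def simulate_puck(b, M, n_ticks, sigma_p=1.0, seed=0):
    """Simulate true prices under an assumed quadratic-potential PUCK update."""
    rng = random.Random(seed)
    prices = [0.0] * M  # flat history so that the first core price is defined
    for _ in range(n_ticks):
        q = prices[-1] - core_price(prices, M)      # divergence from the core price
        potential_term = -(b / (M - 1)) * q         # assumed sign convention
        prices.append(prices[-1] + potential_term + rng.gauss(0.0, sigma_p))
    return prices


def mean_squared_displacement(prices, lag):
    """Average squared price change over non-overlapping windows of `lag` ticks."""
    diffs = [prices[k + lag] - prices[k] for k in range(0, len(prices) - lag, lag)]
    return sum(d * d for d in diffs) / len(diffs)


# b = 0 reduces to a plain random walk; under the assumed sign, b > 0 pulls the
# price toward the core price and b < 0 pushes it away.
msd = {
    b: mean_squared_displacement(simulate_puck(b, M=10, n_ticks=20000), lag=200)
    for b in (-1.0, 0.0, 1.5)
}
```

In this sketch the mean squared displacement for b = −1.0 exceeds that of the random walk (b = 0), which in turn exceeds that for b = 1.5, corresponding to the faster and slower diffusion described above.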
However, the present inventors have found that the above-described technique for using the PUCK model has the following problems. In today's financial markets and the like, with the introduction of high-speed automated trading using computer systems and the like, there is a need to quickly and accurately detect fluctuations in time-series data indicating pricing. However, in an ordinary PUCK model, estimating the potential coefficient bi(t) in each order at time (t) requires that data for about 1,000 ticks be used to plot the values {P(t+1)−P(t)} and {P(t)−PM(t)} of the Equation (2) and to derive the slope of the resulting scatter plot using the least-squares method or the like to calculate the potential coefficient b(t) at time (t) (hereinafter, a set of the potential coefficients bi(t) (2≦i≦K) in the respective orders is simply referred to as the potential coefficient bi(t) unless otherwise noted). For this reason, until time-series data for about 1,000 ticks have been collected after data sampling has begun, there are not yet enough data, in practical use, to track steep changes in trends without substituting past information.
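The conventional scatter-plot estimation described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure of Non-patent Document 1: it assumes a quadratic potential with the update P(t+1)−P(t) = −{b/(M−1)}·{P(t)−PM(t)} + fP(t), a simple moving average as the core price, and synthetic data. The names simulate_puck and estimate_b are hypothetical. Far more than 1,000 ticks are used here only so that the numerical check is stable.

```python
import random


def simulate_puck(b, M, n_ticks, sigma_p=1.0, seed=1):
    """Generate synthetic true prices under the assumed quadratic-potential update."""
    rng = random.Random(seed)
    prices = [0.0] * M
    for _ in range(n_ticks):
        core = sum(prices[-M:]) / M
        q = prices[-1] - core
        prices.append(prices[-1] - (b / (M - 1)) * q + rng.gauss(0.0, sigma_p))
    return prices


def estimate_b(prices, M):
    """Least-squares slope of {P(t+1)-P(t)} against {P(t)-P_M(t)}, rescaled to b."""
    xs, ys = [], []
    for t in range(M - 1, len(prices) - 1):
        core = sum(prices[t - M + 1:t + 1]) / M
        xs.append(prices[t] - core)            # horizontal axis of the scatter plot
        ys.append(prices[t + 1] - prices[t])   # vertical axis of the scatter plot
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -(M - 1) * slope                    # undo the assumed -1/(M-1) scaling


prices = simulate_puck(b=1.0, M=10, n_ticks=20000)
b_hat = estimate_b(prices, M=10)
```

Because the estimate is a batch regression over a long window, it reacts slowly when the underlying coefficient changes, which is the limitation the paragraph above points out.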
Also, in the model, although the value of the potential coefficient b(t) at time (t) should not depend on the number M of most recent data, it is not possible to ignore the effect by which the appropriate range of the number M of most recent data for satisfying the Equation (2) varies from one observation period to another, and the estimation of the potential coefficient b(t) at time (t) therefore has a large error, which is problematic. In light of this, it is a critical task to sequentially search for a number M of most recent data that is optimal for fitting the PUCK model.
Further, in order to estimate the optimal true price P(t) at time (t), it has been necessary to use all the data from approximately each day to separately update the calculation of the weights of an approximately optimal moving average over a small number of ticks.
The present invention has been made in light of the aforementioned situation, and it is an object of the present invention to provide a method and a program for detecting a change-point of time-series data, and a method and a program for predicting a probability density distribution of future time-series data values.
A first aspect of the present invention is a method for detecting a change-point in time-series data, applying a particle filter method to a PUCK model for calculating a true market price P(t) at a time (t) by the sum of a potential term and fluctuation error defined by the true market price at a time (t−1) and a core price, which is the moving average of a number M (where M is a positive integer) of true market prices until the time (t−1), comprising: a first step for obtaining a probability density function for parameters of a later time than a group of particles having parameters representing the state of the PUCK model each having different values; a second step for evaluating the degree of conformity of the true market price at the time (t) relative to the market price observed at the time (t) for each of plural particles; and a third step for resampling particles from the plural particles in accordance with the degree of conformity, wherein, in the third step, a random number is generated, and the random number is compared with a first predetermined value, wherein a probability density function comprising a probability density distribution where the parameters representing the state of the PUCK model at the time (t) are average values is generated as particles when the random number is greater than the first predetermined value, and a probability density function comprising a uniform distribution is generated as particles when the random number is less than the first predetermined value. Thereby, it is possible to detect a change-point by calculating the potential coefficient, which gives favorable trackability, and detecting changes in the potential coefficient, even when the market price, which is the time-series data, experiences a steep change in the trend.
A second aspect of the present invention is the method for detecting a change-point in time-series data, wherein the probability density distribution where the parameters representing the state of the PUCK model at the time (t) are average values is a normal distribution where the parameters representing the state of the PUCK model at the time (t) are average values. Thereby, it is possible to generate the probability density distribution precisely using the normal distribution.
A third aspect of the present invention is the method for detecting a change-point in time-series data, wherein the true market price at the time (t+1) is calculated by adding the potential term based on the PUCK model and a fluctuation error of the true market price at the time (t). Thereby, it is possible to appropriately resample using the particle filter method.
A fourth aspect of the present invention is the method for detecting a change-point in time-series data, wherein, in the third step, a random number is generated for each of the particles, and the random number is compared with the first predetermined value, where a conditional probability density function for assuming M at the time (t) is generated as particles when the random number is greater than the first predetermined value, and a probability density function comprising a uniform distribution is generated as particles when the random number is less than the first predetermined value. Thereby, it is possible to simultaneously apply the advancement of time not only to the potential coefficient but also to M, which indicates the number of most recent time-series data, seen from the time (t).
A fifth aspect of the present invention is the method for detecting a change-point in time-series data, wherein the true market price at the time (t+1) is calculated by adding the potential term and a fluctuation error of the true market price at the time (t). Thereby, it is possible to precisely estimate the true market price at the time (t).
A sixth aspect of the present invention is the method for detecting a change-point in time-series data, wherein the market price observed at the time (t) is the sum of the true market price at the time (t) and the observation error of the market price at the time (t). Thereby, it is possible to precisely estimate the degree of conformity of the true market price at the time (t) relative to the observed market price at the time (t).
A seventh aspect of the present invention is the method for detecting a change-point in time-series data, wherein the fluctuation error of the true market price at the time (t) and the observation error of the market price at the time (t) are given by a probability density function in accordance with a normal distribution or the like. Thereby, it is possible to precisely estimate the true market price at the time (t) and the degree of conformity of the true market price at the time (t) relative to the observed market price at the time (t).
An eighth aspect of the present invention is a method for predicting a probability density distribution of future time-series data values that calculates a true market price P(t+N) at a time (t+N) by time advancement of the true market price P(t) at the time (t) based on a potential coefficient of the potential term at the time (t) calculated by the method for detecting a change-point in time-series data according to one of Claims 1 to 7, the value of M and a fluctuation error of the true market price at the time (t). Thereby, it is possible to predict the probability density distribution of future time-series data values using the PUCK model.
A ninth aspect of the present invention is the method for predicting a probability density distribution of future time-series data values, wherein when a total number of particles is Np, a number specifying the particle is j (where j is an integer satisfying 1≦j≦Np), a potential coefficient of the particle with the number j at the time (t) is b(j)i(t) (where i is an integer of 2 or more indicating an order of a potential), a fluctuation error of the true market price of the particle with the number j at the time (t) is f(j)P(t), the true market price of the particle with the number j at the time (t) is P(j)(t), the true market price of the particle with the number j at the time (t+N) is P(j)(t+N), a core price of the market price of the particle with the number j at the time (t) is P(j)M(t), and the value of M at the time (t) is M(t), the true market price P(j)(t+N) of the particle with the number j at the time (t+N) is represented by Expression:
where K is an integer of 2 or more. Thereby, it is possible to precisely predict the probability density distribution of future time-series data values using the PUCK model.
A tenth aspect of the present invention is the method for predicting a probability density distribution of future time-series data values, wherein a potential coefficient b(j)i(t+1) of the particle with the number j at a time (t+1) is calculated by advancement of time of the potential coefficient b(j)i(t) of the particle with the number j at the time (t). Thereby, it is possible to precisely predict the probability density distribution of future time-series data values using the PUCK model.
An eleventh aspect of the present invention is a program for detecting a change-point in time-series data, in which a computer is used to execute the detection of a change-point in time-series data in which a particle filter method is applied to a PUCK model for calculating a true market price P(t) at a time (t) by the sum of a potential term and fluctuation error defined by the true market price at a time (t−1) and a core price, which is the moving average of a number M (where M is a positive integer) of true market prices until the time (t−1), wherein a computer is used to execute a first processing for obtaining a probability density function of a parameter by generating a group of particles having parameters representing the state of the PUCK model each having different values, a second processing for evaluating the degree of conformity of the true market price at the time (t) relative to the market price observed at time (t) for each of plural particles in a degree of conformity evaluation unit, and a third processing for resampling particles from the plural particles in accordance with the degree of conformity in a sampling unit, wherein processing for regenerating a probability density function comprising a probability density distribution where the parameter representing the state of the PUCK model at the time (t) is a mean value as particles is executed when the random number is greater than the first predetermined value, and processing for generating particles in accordance with a uniform distribution is executed when the random number is less than the first predetermined value. Thereby, it is possible to detect a change-point by calculating the potential coefficient, which gives favorable trackability, and detecting changes in the potential coefficient, even when the market price, which is the time-series data, experiences a steep change in the trend.
A twelfth aspect of the present invention is the program for detecting a change-point in time series data, wherein the probability density distribution where the parameters representing the state of the PUCK model at the time (t) are average values is a normal distribution where the parameters representing the state of the PUCK model at the time (t) are average values. Thereby, it is possible to generate the probability density distribution precisely using the normal distribution.
A thirteenth aspect of the present invention is the program for detecting a change-point in time-series data, wherein in the third step, particles having a greater degree of conformity are sampled more often, and feasible particles having a low degree of conformity are also generated at a constant proportion. Thereby, it is possible to appropriately resample using the particle filter method.
A fourteenth aspect of the present invention is the program for detecting a change-point in time-series data, wherein in the first step, processing for generating a random number, processing for generating a conditional probability density function assuming a parameter of the PUCK model at the time (t) as particles, and processing for generating a probability density function comprising a uniform distribution as particles when the random number is less than the third predetermined value are executed. Thereby, it is possible to simultaneously apply the advancement of time not only to the potential coefficient but also to M, which indicates the amount of most recent time-series data, seen from the time (t).
A fifteenth aspect of the present invention is the program for detecting a change-point in time-series data, wherein in the program for detecting the change-point in the time-series data, the true market price at the time (t) is calculated by adding the potential term and the fluctuation error of the true market price at the time (t). Thereby, it is possible to precisely estimate the degree of conformity of the true market price at the time (t) relative to the observed market price at the time (t).
A sixteenth aspect of the present invention is the program for detecting a change-point in time-series data, wherein in the program for detecting the change-point in the time-series data, the market price observed at the time (t) is the sum of the true market price at the time (t) and the observation error of the market price at the time (t). Thereby, it is possible to precisely estimate the degree of conformity of the true market price at the time (t) relative to the observed market price at the time (t).
A seventeenth aspect of the present invention is the program for detecting a change-point in time-series data, wherein the fluctuation error of the true market price at the time (t) and the observation error of the market price at the time (t) are given by a probability density function in accordance with a normal distribution or the like. Thereby, it is possible to precisely estimate the true market price at the time (t) and the degree of conformity of the true market price at the time (t) relative to the observed market price at the time (t).
An eighteenth aspect of the present invention is a program for predicting a probability density distribution of future time-series data values, wherein a true market price P(t+N) at a time (t+N) is calculated by time advancement of the true market price P(t) at the time (t) based on a potential coefficient of the potential term at the time (t) calculated by the program for detecting a change-point in time-series data according to one of Claims 11 to 17, the value of M and a fluctuation error of the true market price at the time (t). Thereby, it is possible to predict the probability density distribution of future time-series data values using the PUCK model.
A nineteenth aspect of the present invention is the program for predicting a probability density distribution of future time-series data values, wherein when a total number of particles is Np, a number specifying the particle is j (where j is an integer satisfying 1≦j≦Np), a potential coefficient of the particle with the number j at the time (t) is b(j)i(t) (where i is an integer of 2 or more indicating an order of a potential), a fluctuation error of the true market price of the particle with the number j at the time (t) is f(j)P(t), the true market price of the particle with the number j at the time (t) is P(j)(t), the true market price of the particle with the number j at the time (t+N) is P(j)(t+N), a core price of the market price of the particle with the number j at the time (t) is P(j)M(t), and the value of M at the time (t) is M(t), the true market price P(j)(t+N) of the particle with the number j at the time (t+N) is represented by Expression:
where K is an integer of 2 or more. Thereby, it is possible to precisely predict the probability density distribution of future time-series data values using the PUCK model.
A twentieth aspect of the present invention is the program for predicting a probability density distribution of future time-series data values, wherein a potential coefficient b(j)i(t+1) of the particle with the number j at a time (t+1) is calculated by advancement of time of the potential coefficient b(j)i(t) of the particle with the number j at the time (t).
According to the present invention, it is possible to provide a method and a program for detecting a change-point of time-series data, and a method and a program for predicting a probability density distribution of future time-series data values that are capable of precisely detecting steep changes in the time-series data.
The following is a description of the embodiments of the present invention, with reference to the accompanying drawings. In each of the drawings, the same elements are denoted by the same reference numerals, and a repeated description has been omitted when needed.
A description of the method for detecting a change-point of time-series data according to a first embodiment of the present invention will now be provided. The method for detecting a change-point of time-series data in this embodiment applies a so-called particle filter method (Non-patent Document 3) to the above-described PUCK model. The application of the particle filter method makes it possible to sequentially predict the values at time (t) of the potential coefficient b(t) and the number M(t) of most recent data.
An overview of the particle filter method will be described first.
[Equation 8]
$$x_t \sim Q_t(\cdot \mid x_{t-1}, \theta) \tag{6}$$
Herein, “˜” is an operation for generating a random variable xt in accordance with a probability density function Qt(·) that is dependent on the time (t), where xt-1 and θ are the parameters.
An observation vector yt is also given by the following Equation (7). Herein, Rt(·) is a probability density function that is dependent on the time (t), where xt and θ are the parameters.
[Equation 9]
$$y_t \sim R_t(\cdot \mid x_t, \theta) \tag{7}$$
In the particle filter method the probability density function of the state vector xt is represented by the following Equation (8), as a conditional probability density function CPDF determined by the observation vectors {y1, y2, . . . , yt}.
[Equation 10]
CPDF=p(xt|y1, y2, . . . , yt) (8)
Herein, when there is an assumed number N (N is a positive integer) of particles (state vectors) in the particle filter method, then the conditional probability density function (PDF) of the N state vectors is represented by the following Equation (9).
Herein, xl|k(j) is the state vector at a time l estimated from the observation vector at a time k for the j-th (1≦j≦N) particle.
In the particle filter method, first an initial value at t=0 is given for the conditional probability density function (PDF) of the N particles (step S1 of
Then, at the start of the calculation using the particle filter method, the time (t) is set to t=1.
Next, the Equation (6) is used to generate, for the N particles, a probability density function (PDF) of the predicted particles at time (t) on the basis of the probability density function at time (t−1) (step S2 of
[Equation 13]
$$x_{t|t-1}^{(j)} \sim q_t(\cdot \mid x_{t-1|t-1}^{(j)}, \theta) \quad (j = 1, \ldots, N) \tag{11}$$
Herein, qt(·) is the probability density function defined by the Equation (9).
A weight coefficient Wj (the following Equation (12)) of each particle is then calculated by a likelihood Lj (the following Equation (13)). The likelihood Lj is obtained from the Equation (7). In other words, herein, the degree of conformance between the observation vector yt obtained from the actual time-series data and the state vector xt obtained by calculation is evaluated (step S3 of
Thereafter, the particles are resampled in accordance with the weight coefficient of each particle (step S4 of
Next, the sequence above is performed repeatedly, with the time (t) advancing by one at each step, until the final step (t=tend) (steps S5 and S6 of
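The sequence of steps S1 to S6 can be sketched as a basic particle filter. The following is a minimal illustration under stated assumptions and is not the embodiment itself: the state is a scalar that follows a Gaussian random walk, the observation is the state plus Gaussian noise, the weights are taken directly from the Gaussian likelihood, and resampling is multinomial. The function names, the noise levels and the toy data are all assumptions.

```python
import math
import random


def gaussian_pdf(x, mean, sigma):
    """Density of a normal distribution, used here as the likelihood of an observation."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def particle_filter(observations, n_particles=500, sigma_x=0.1, sigma_y=0.5, seed=0):
    rng = random.Random(seed)
    # Step S1: initial particle set at t = 0.
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:  # steps S5 and S6: advance t by one until the final step
        # Step S2: predict each particle at time t from its value at time t-1.
        particles = [x + rng.gauss(0.0, sigma_x) for x in particles]
        # Step S3: weight each particle by the likelihood of the observation.
        weights = [gaussian_pdf(y, x, sigma_y) for x in particles]
        total = sum(weights)
        if total == 0.0:  # guard against numerical underflow of every weight
            weights = [1.0] * n_particles
            total = float(n_particles)
        estimates.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # Step S4: resample particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


def rmse(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))


# Toy data: a hidden random walk observed with noise.
data_rng = random.Random(42)
truth, observations = [], []
x = 0.0
for _ in range(300):
    x += data_rng.gauss(0.0, 0.1)
    truth.append(x)
    observations.append(x + data_rng.gauss(0.0, 0.5))

estimates = particle_filter(observations)
rmse_filter = rmse(estimates, truth)
rmse_obs = rmse(observations, truth)
```

In this toy setting the filtered estimate is closer to the hidden state than the raw observations are, which is the purpose of evaluating the degree of conformance and resampling at each time step.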
The following is a description of the sequence in which the particle filter method is used for the PUCK model in this embodiment. This embodiment is characterized in dealing with the PUCK model when generating the probability density function in step S2 of
[Equation 16]
$$b_i(t-1) \sim Q_b(\cdot \mid b_i(t-2), \theta_b) \tag{14}$$
[Equation 17]
$$M(t-1) \sim Q_M(\cdot \mid M(t-2), \theta_M) \tag{15}$$
By applying the Equation (15) to the previous Equation (3), the following Equation (16) can be obtained.
By applying the Equations (14) and (15) to the previous Equation (2), the following Equation (17) can be obtained.
In such a case as well, similar to the previous Equation (1), the Equation (18) holds true.
[Equation 20]
Y(t)=P(t)+ƒY(t) (18)
The fluctuation error fP(t) of the true market price at the time (t) and the observation error fY(t) of the market price at time (t) follow a normal distribution or the like, as in the following Equation (19).
[Equation 21]
$$f_P(t) \sim N(0, \sigma_P^2), \quad f_Y(t) \sim N(0, \sigma_Y^2) \tag{19}$$
Below, in general, the portion N(·) may be not only a normal distribution but also a probability distribution that is further characterized by a higher-order moment or the like. Further, fP(t) and fY(t) may follow different probability distributions.
A standard deviation σP of the fluctuation error fP(t) of the true market price at time (t) can be represented by the following Equation (20), using a state vector.
[Equation 22]
σP,t˜QP(·|σP,t-1,θP) (20)
A standard deviation σY of the observation error fY(t) of the market price at time (t) can be represented by the following Equation (21), using a state vector.
[Equation 23]
σY,t˜QY(·|σY,t-1,θY) (21)
Using the Equations (20) and (21), the Equation (19) can be substituted with the following Equation (22).
[Equation 24]
$$f_P(t) \sim N(0, \sigma_{P,t}^2), \quad f_Y(t) \sim N(0, \sigma_{Y,t}^2) \tag{22}$$
Then, consideration is given to the advancement of time of the model described above. Below, a variable A(t) is given by a normal distribution or the like with the mean value A(t−1) and the variance σA². Specifically, the variable A(t) can be expressed by the following Equation (23).
[Equation 25]
$$A(t) \sim N(A(t-1), \sigma_A^2) \tag{23}$$
The variable A(t) is determined by the observation values {y(1), . . . , y(t−1)} prior to time (t). When the A(t) at this time is denoted by At|t-1, the Equation (23) can be substituted with the Equation (24).
[Equation 26]
$$A_{t|t-1} \sim N(A_{t-1|t-1}, \sigma_A^2) \tag{24}$$
However, the At|t-1 illustrated in Equation (24) and
[Equation 27]
$$A(t) \sim (1 - A_{mut})\,\mathrm{truncN}(A(t-1), \sigma_A^2; A_{min}, A_{max}) + A_{mut}\,U(A_{min}, A_{max}) \tag{25}$$
The term truncN(A(t−1), σA²; Amin, Amax) of the Equation (25) is a normalized truncated normal distribution or the like with the mean value A(t−1), the variance σA², and the interval (Amin, Amax). truncN can be extended to one in which a general distribution characterized by the average and/or standard deviation or the like is truncated at an upper limit and a lower limit. Amut is a coefficient that is newly introduced by the Equation (25) in order to support steep changes in market prices or other sudden events. Specifically, A(t) is represented by a form in which the two components, the normal distribution or the like and the uniform distribution U(Amin, Amax), are mixed. The mixture rate of the uniform distribution U(Amin, Amax) is determined by the value of Amut.
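Drawing a value from the mixture of the Equation (25) can be sketched as follows. This is a minimal sketch under stated assumptions: the truncated normal distribution is sampled by simple rejection, which presumes that the previous value already lies inside the interval, and the names sample_trunc_normal and advance are hypothetical.

```python
import random


def sample_trunc_normal(mean, sigma, lo, hi, rng):
    """Draw from a normal distribution truncated to [lo, hi] by rejection sampling.

    Assumes `mean` lies inside [lo, hi], so that acceptance is reasonably likely.
    """
    while True:
        x = rng.gauss(mean, sigma)
        if lo <= x <= hi:
            return x


def advance(a_prev, sigma_a, a_min, a_max, a_mut, rng):
    """One draw of A(t): uniform with probability a_mut, truncated normal otherwise."""
    u = rng.random()
    if u < a_mut:
        return rng.uniform(a_min, a_max)   # sudden-event component
    return sample_trunc_normal(a_prev, sigma_a, a_min, a_max, rng)


rng = random.Random(0)
draws = [advance(0.0, 0.05, -2.0, 2.0, 0.2, rng) for _ in range(5000)]
# Fraction of draws more than six standard deviations away from the previous value;
# in practice these can only come from the uniform component.
far_fraction = sum(abs(x) > 0.3 for x in draws) / len(draws)
```

With Amut = 0.2 in this example, a noticeable share of the draws land far from the previous value, whereas with Amut = 0 essentially all draws stay within a few standard deviations of it.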
Subsequently, a description will be provided for a specific example for applying the Equation (25) to each parameter of the PUCK model. The processing for applying the Equation (25) to each parameter of the PUCK model is performed in the step S1 described above.
[Equation 28]
$$b_i(t) \sim (1 - b_{i,mut})\,N(b_i(t-1), \sigma_b^2) + b_{i,mut}\,U(b_{i,min}, b_{i,max}) \tag{26}$$
Using the Equation (26) and applying the particle filter method makes it possible to estimate the potential coefficient bi(t) in each order at time (t). Herein, it is necessary to determine the mixture ratio of the normal distribution or the like and the uniform distribution in the potential coefficient bi(t) in each order at time (t). Herein, a random number u is generated for such a purpose (step S11 of
That is, when u≧bi,mut, the potential coefficient b(t) at time (t) is established from only the normal distribution or the like (step S13 of
[Equation 29]
$$b_{i,t|t-1}^{(j)} \sim \mathrm{truncN}(b_{i,t-1|t-1}^{(j)}, \sigma_b^2, b_{i,min}, b_{i,max}) \tag{27}$$
On the other hand, when u<bi,mut, the potential coefficient bi(t) in each order at time (t) is established from only the uniform distribution (step S14 of
[Equation 30]
$$b_{i,t|t-1}^{(j)} \sim U(b_{i,min}, b_{i,max}) \tag{28}$$
In this embodiment, as described above, a uniform distribution is introduced in order to give consideration to a sudden event in which there is a steep change in the time-series data. In other words, without a uniform distribution, particles with a lower degree of conformity are culled and eliminated each time the particles advance a generation. When a sudden event occurs in such a state, particles that conform to the sudden event will have already been eliminated, and the sudden event cannot be tracked. However, in this embodiment, the introduction of a uniform distribution taking into account a sudden event makes it possible to prevent the elimination of particles with a low degree of conformity. It is thereby possible to retain particles with a high degree of conformity to the sudden event, and to increase the ability to track the sudden event compared to an ordinary particle filter method. In this manner, the introduction of the uniform distribution makes it possible to achieve a specific effect that cannot be achieved when only an ordinary normal distribution or Lorentz distribution is used.
Next, a description will be provided for the number M(t) of most recent data. Similar to the potential coefficient b(t) at time (t), a uniform distribution is also introduced to the number M(t) of most recent data, in order to give consideration to a sudden event. In this embodiment, the number M(t) of most recent data is represented as a mixed distribution of a uniform distribution and a probability density function at time (t−1). The number M(t) of most recent data is processed in the step S1 described above.
[Equation 31]
M(t)˜(1−Mmut)G(M(t)|M(t−1),σM,Mmax,Mmin)+MmutU(Mmin,Mmax) (29)
Herein, similar to the case of the potential coefficient b(t) at time (t), a random number u is generated (step S21 of
That is, when u≧Mmut, the number M(t) of most recent data is established from only the probability density function at time (t−1) (step S23 of
[Equation 32]
$$M_{t|t-1}^{(j)}(t) \sim G\left(\cdot \mid M_{t-1|t-1}^{(j)},\ r,\ M_{\max},\ M_{\min}\right)$$
$$M_{t|t-1}^{(j)}(t) \sim U\left(M_{\min},\ M_{\max}\right) \qquad (30)$$
where U(Mmin, Mmax) is a uniform probability density function such that the lower limit is an integer value Mmin, and the upper limit is an integer value Mmax.
On the other hand, when u<Mmut, the number M(t) of most recent data is established only from the uniform distribution (step S24 of
The following illustrates a specific example of the expected probability density function G(·). When Mmin<M(t−1)<Mmax, the probability density function G(·) is represented by the following Equation (31).
[Equation 33]
G(M(t−1)+1|M(t−1),r,Mmin,Mmax)=r
G(M(t−1)|M(t−1),r,Mmin,Mmax)=1−2r
G(M(t−1)−1|M(t−1),r,Mmin,Mmax)=r (31)
where 0≦r≦1.
In the boundary conditions, the probability density function G(·) is represented by the following Equation (32).
[Equation 34]
G(Mmax−1|Mmax,r,Mmin,Mmax)=r
G(Mmax|Mmax,r,Mmin,Mmax)=1−r
G(Mmin+1|Mmin,r,Mmin,Mmax)=r
G(Mmin|Mmin,r,Mmin,Mmax)=1−r (32)
For M(t−1) outside the range given above, the probability density function G(·) is represented by the following Equation (33).
[Equation 35]
G(·|M(t−1),r,Mmin,Mmax)=0 (33)
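The transition rule given by Equations (31) to (33), combined with the uniform mutation of Equation (29), can be sketched as below. The function names and the example values are hypothetical. The sketch additionally assumes Mmin < Mmax and restricts r to 0 ≤ r ≤ 0.5 so that the interior probability 1−2r cannot become negative; that restriction is an assumption of this example, not a statement from the text.

```python
import random


def transition_probabilities(m_prev, r, m_min, m_max):
    """Return {m_next: probability} for the kernel G of Equations (31)-(33)."""
    if m_min >= m_max:
        raise ValueError("this sketch assumes m_min < m_max")
    if not 0.0 <= r <= 0.5:
        raise ValueError("this sketch assumes 0 <= r <= 0.5")
    if not m_min <= m_prev <= m_max:
        return {}                                     # Equation (33): zero everywhere
    if m_prev == m_max:                               # upper boundary, Equation (32)
        return {m_max - 1: r, m_max: 1 - r}
    if m_prev == m_min:                               # lower boundary, Equation (32)
        return {m_min + 1: r, m_min: 1 - r}
    return {m_prev - 1: r, m_prev: 1 - 2 * r, m_prev + 1: r}   # Equation (31)


def propose_m(m_prev, r, m_min, m_max, m_mut, rng):
    """One-step proposal for the number of most recent data of one particle."""
    u = rng.random()
    if u < m_mut:
        return rng.randint(m_min, m_max)              # uniform over the integers
    probs = transition_probabilities(m_prev, r, m_min, m_max)
    if not probs:
        raise ValueError("m_prev lies outside [m_min, m_max]")
    values = list(probs)
    return rng.choices(values, weights=[probs[v] for v in values])[0]


# Hypothetical usage: M allowed between 1 and 10, starting from M = 5.
rng = random.Random(1)
samples = [propose_m(5, 0.25, 1, 10, 0.1, rng) for _ in range(1000)]
print(sorted(set(samples)))
```

Without mutation, M can move by at most one per step; the uniform branch is the only way for a particle to jump directly to a very different window length.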
The standard deviation σP(t) of the fluctuation error fP(t) of the Equation (22) (the standard deviation σY(t) of the observation error fY(t) also has a similar sequence to the below) is given by the following equation.
[Equation 36]
σP(t)˜(1−σPmut)truncN(σP(t−1),γP2;σPmin,σPmax)+σPmutU(σPmin,σPmax) (34)
Herein, σPmut is a numerical value between 0 and 1, and σPmin and σPmax represent the lower and upper limits, respectively, of the range that σP(t) can take. γP2 is a numerical value representing diffusion.
[Equation 37]
σY(t)˜(1−σYmut)truncN(σY(t−1),γY2;σYmin,σYmax)+σYmutU(σYmin,σYmax) (35)
Herein, σYmut is a numerical value between 0 and 1, and σYmin and σYmax represent the lower and upper limits, respectively, of the range that σY(t) can take. γY2 is a numerical value representing diffusion.
As has been described above, according to the method for detecting a change-point in time-series data according to this embodiment, when the time-series data are in a stable trend, the trend can be appropriately tracked by increasing the number of particles indicating parameters with a high degree of conformity to the stable trend (the potential coefficient b, the number M of most recent data, and the standard deviations σP and σY).
Furthermore, according to the method for detecting a change-point in time-series data according to this embodiment, a trend can still be tracked even when there is a steep change, i.e., a sudden event deviating from the stable trend. That is, in the case of an ordinary particle filter method, when a stable trend persists, there is an increase only in the particles conforming to the stable trend, and particles that do not conform to the stable trend but rather to a sudden event are culled and eliminated.
However, in the method for detecting a change-point in time-series data according to this embodiment, at the calculation of the probability density function for the parameters at the most recent time, a uniform distribution is assigned at a constant proportion to each of the parameters, as illustrated in the Equations (28) and (30). A uniform distribution is thereby assigned at a constant proportion even to particles that are eliminated due to having a poor degree of conformity to the stable trend, when the probability density function is generated in step S2 of
Then, the sequential calculation of the potential coefficient bi(t) and the number M(t) of most recent data makes it possible to detect the state of changes in the time-series data. The changes in the time-series data are stable when the secondary potential coefficient b2(t) has a positive value and the other coefficients bi(t) are in the vicinity of 0. However, when the sequential analysis shows an increase in the degree of conformity of particles where the secondary potential coefficient b2(t) takes a negative value, the changes in the time-series data are in an unstable state. In other words, monitoring the changes in the value of the secondary potential coefficient b2(t) makes it possible to detect the boundary points at which the changes in the time-series data switch between stable and unstable. Further, a sharp downward trend is predicted to be likely when the third-order potential coefficient b3(t) has a positive value, and a sharp upward trend is predicted to be likely when b3(t) has a negative value.
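The sign-based reading of b2(t) and b3(t) described above can be illustrated with a short sketch. Summarizing the particles by a weighted mean, the function names, and the example series are all assumptions made for this illustration; the embodiment itself speaks of the degree of conformity of the particles, not of a specific summary statistic.

```python
def weighted_mean(values, weights):
    """Weighted average of particle values; weights need not be normalized."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(v * w for v, w in zip(values, weights)) / total


def describe_state(b2, b3):
    """Map the signs of b2 and b3 to the qualitative states named in the text.

    b2 == 0 is treated as unstable here, which is an arbitrary choice.
    """
    stability = "stable" if b2 > 0 else "unstable"
    if b3 > 0:
        tendency = "sharp downward trend likely"
    elif b3 < 0:
        tendency = "sharp upward trend likely"
    else:
        tendency = "no directional signal"
    return stability, tendency


def boundary_points(b2_series):
    """Return the indices t at which the sign of b2 differs from that at t-1."""
    points = []
    for t in range(1, len(b2_series)):
        if (b2_series[t - 1] > 0) != (b2_series[t] > 0):
            points.append(t)
    return points


# Hypothetical series of summarized b2 estimates over six time steps.
b2_series = [0.8, 0.6, 0.1, -0.3, -0.5, 0.2]
boundaries = boundary_points(b2_series)
print(boundaries)                      # -> [3, 5]
print(describe_state(0.4, -0.1))       # -> ('stable', 'sharp upward trend likely')
```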
In addition, although the particle filter method expresses the potential coefficient bi(t) and the number M(t) of most recent data as plural particles, the particles having a high degree of conformity vary in accordance with the advancement of time, and therefore the values of the potential coefficient bi(t) and the number M(t) of most recent data having a high degree of conformity also vary. Because the number of most recent data having a high degree of conformity is variable, available data can be used to calculate the number M of most recent data having a high degree of conformity even when, for example, only a small number of time-series data have been accumulated. In other words, although the ordinary PUCK model requires a certain number of data (on the order of 1,000 points), this method can be used with a smaller number of data. Furthermore, there is no need to perform data noise-removal processing for the optimal moving average or the like, as was indispensable in the existing technique. Therefore, compared to the existing technique of estimating the parameters of the PUCK model, there is an advantage in that the PUCK model can be applied even at a stage where less data has been accumulated.
Moreover, referencing the value of the potential coefficient bi(t) and the degree of conformity, i.e., the number of particles makes it possible to analyze the fluctuating environment of the time-series data. Specifically, taking the example of yen/dollar trading, when there is a tendency for particles where the secondary potential coefficient b2(t) takes a positive value to have a high degree of conformity, the change in pricing is stable, and compared to the random walk, the pricing is more prone to be pulled back in the reverse direction in a short time scale. In such a case, the market is more likely to experience an inversion in the upward or downward pricing trend, and, as a result, less likely to experience a major pricing fluctuation. On the other hand, in a case in which there is a tendency for the particles where the secondary potential coefficient b2(t) takes a negative value to have a high degree of conformity, the change in pricing is unstable, and compared to the random walk, the pricing change is more likely to persistently occur in the same direction in a short time scale, and there is more likely to be an amplification of the upward or downward trend. In such a case, a trader having a strategy for tracking the trend can be expected to be numerically dominant in the market (Non-patent Document 2). Thus, applying this method to time-series data such as prices in a financial market like currency exchange makes it possible to quantitatively evaluate the characteristics of the collective behavior of traders in the market.
The above-described effect of being able to quantitatively evaluate the characteristics of the collective behavior of traders in the market is an effect specific to the method for detecting a change-point in time-series data according to this embodiment. In the method for detecting a change-point in time-series data according to this embodiment, the potential coefficient bi(t) at time (t) and the number M of most recent data, which have plural different values, can each be expressed as particles having different degrees of conformity. It is thereby possible to perform plural simultaneous evaluations for the potential coefficient bi(t) at time (t) and the number M(t) of most recent data, which have different values, presuming that, there being a high degree of conformity at time (t), the changes in time-series data are subject to a dominant effect. It is thereby possible to simultaneously evaluate the behavioral aspects (the potential coefficient bi(t) at time (t)) and the number of data (number M(t) of most recent data) considered for reference in order to determine the behavior of traders having different behavioral characteristics. By contrast, in the ordinary PUCK model to which the above-described particle filter method has not been applied, it is only possible to hypothesize the number M of most recent data and to calculate the potential coefficient bi(t) at time (t) for the hypothesized number M of most recent data. Therefore, in principle, it is not possible to simultaneously evaluate the potential coefficient bi(t) and the number M(t) of most recent data needed in order to estimate the core. In other words, it is not possible to estimate the probability density function and the like of the potential coefficient bi(t) at time (t), the number M(t) of most recent data, and other parameters of the PUCK model. 
Specifically, plural simultaneous evaluations of the potential coefficient bi(t) at time (t) and the number M of most recent data using the PUCK model can be achieved for the first time by the use of the method for detecting a change-point in time-series data according to this embodiment.
The method for detecting a change-point in time-series data according to this embodiment can also be provided as a program in which the algorithms for expressing the steps S1 to S6 described above are recited. Executing such a program in a computer or other form of hardware allows for a similar effect as is obtained by the method for detecting a change-point in time-series data according to this embodiment, such as the detection of a change-point in time-series data as described above. For example, it is possible to execute the program, sequentially display the potential coefficients bi(t) at times (t) or the like on a display device, and observe in real-time the changes over time in the potential coefficient.
The following is a description of the program for detecting a change-point in time-series data according to the second embodiment.
The memory unit 1 is constituted of a hard disk, DRAM, SRAM, flash memory, or other memory device, and stores past time-series data, information on the initial values set in step S1 of
A computation unit 2 reads in, from the memory unit 1 via the bus 4, time-series data, the information on the initial values set in step S1 of
Specifically, the initial setting unit 21 performs processing corresponding to step S1 of
The PDF generation unit 22 performs processing corresponding to the step S2 of
The degree of conformity evaluation unit 23 performs processing corresponding to step S3 of
The resampling unit 24 performs processing corresponding to step S4 of
The computation unit 2 outputs, via the bus 4 to the display unit 3, information on the weighting coefficient obtained by the resampling unit 24 (that is, the number of particles) and the potential coefficient bi(t) and the number M(t) of most recent data at time (t) represented by each particle.
The count unit 25 performs processing corresponding to steps S5 and S6 of
The display unit 3 displays the weighting coefficient outputted from the computation unit 2 (that is, the number of particles), information on the potential coefficient b(t) and the number M(t) of most recent data at time (t) represented by each particle, and the processing completion report, on a liquid crystal display screen for example, in a visible state. At such a time, the potential coefficients b(t) at time (t) and the like are sequentially displayed on the display device, and it is possible to observe in real-time the changes over time in the potential coefficient.
A description of a method for predicting a probability density distribution of values of future time-series data according to a third embodiment of the present invention will now be provided. As described in the method for detecting a change-point of time-series data according to the first embodiment, with use of the PUCK model and the particle filter method, the potential coefficient at time (t) can be obtained. Further, with the advancement of time of the PUCK model, values of the future market price can be obtained as a probability density distribution.
The market price P(j)(t+1) of the j-th particle at time (t+1) can be represented by the following Equation (36) using the Equation (1).
Next, the true market price P(t) at time (t) calculated by the method for detecting a change-point of time-series data according to the first embodiment is prepared (step S32).
Using the prepared parameters and the true market price P(j)(t) at time (t), the Equation (36) is calculated.
Then, it is determined whether the time (t) of the most recent processing has reached t=t+N (step S34). When the time of the most recent processing is not the final step (t≠t+N), the time (t) advances by one step (step S35) and the calculation of the Equation (36) is performed again. On the other hand, when the time of the most recent processing is the final step (t=t+N), the processing is terminated.
It is thereby possible to advance time to time (t+N) in the Equation (36) using the parameters and the true market price at time (t). With a parameter set (the potential coefficient b(j)i(t), the number of data M(j)(t) needed to estimate the core of the price fluctuation, and a fluctuation error f(j)p(t)) of the PUCK model of the j-th particle at time (t) estimated using the particle filter method, the predicted market price at time (t+N) when time advances to time (t+N) in the Equation (36) is represented by the following Equation (37).
As described above, by applying the Equation (37) to each particle, it is possible to obtain the probability density distribution of the market price P(j)(t+N) at time (t+N).
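The per-particle time advancement described above can be sketched as a simple loop. Because the update of Equation (36) is not reproduced in this passage, the step function below is only a stand-in: it uses a second-order form that is often associated with the PUCK model (a force proportional to the deviation of the price from the mean of the last M prices, plus Gaussian noise), and should be read as an assumption. The function names, the parameter tuples, and the numerical values are likewise hypothetical.

```python
import random


def puck_step_second_order(history, b2, m, sigma_p, rng):
    """Hypothetical one-tick update (stand-in for Equation (36)).

    The price is pulled toward (b2 > 0) or pushed away from (b2 < 0)
    the mean of the last m prices. Assumes m >= 2.
    """
    window = history[-m:]
    centre = sum(window) / len(window)
    drift = -b2 / (m - 1) * (history[-1] - centre)
    return history[-1] + drift + rng.gauss(0.0, sigma_p)


def predict_distribution(history, particles, n_steps, rng,
                         step=puck_step_second_order):
    """Advance every particle n_steps ticks and return the prices at t+N."""
    finals = []
    for b2, m, sigma_p in particles:
        path = list(history)                   # each particle gets its own copy
        for _ in range(n_steps):               # repeat until time t+N is reached
            path.append(step(path, b2, m, sigma_p, rng))
        finals.append(path[-1])
    return finals


# Hypothetical usage: 500 particles with scattered parameters, N = 20 ticks.
rng = random.Random(2)
history = [100.0, 100.2, 99.9, 100.1, 100.4, 100.3, 100.6, 100.5, 100.8, 101.0]
particles = [(rng.uniform(-1.0, 1.0), rng.randint(2, 10), 0.1) for _ in range(500)]
finals = predict_distribution(history, particles, 20, rng)
print(f"min {min(finals):.2f}, max {max(finals):.2f}")
```

The list of final prices is an empirical sample of the distribution at time (t+N); a histogram or kernel density estimate of it gives the predicted probability density.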
As described above, in the method for predicting the probability density distribution of values of future time-series data according to this embodiment, it is possible to reflect the influence of a nonlinear behavior of a price fluctuation by introduction of the third- and higher-order potential. It is thus possible to generate the left-right asymmetrical distorted probability density distribution as illustrated in
As described above, in the method for predicting the probability density distribution of values of future time-series data according to this embodiment, it is possible to obtain the probability density distribution of values of the future market price on the basis of the parameter set of the PUCK model at time (t) estimated using the particle filter method. Further, because the third- and higher-order potential can be introduced when k≧3 in the Equation (37), prediction including a nonlinear behavior of a market price fluctuation can be made. It is thus possible to predict the market price that more accurately reflects the real market conditions.
By applying the method for predicting the probability density distribution of values of future time-series data according to this embodiment to price prediction for currency and stock exchange and the like, it is possible to estimate the risk of price fluctuations on currency and stock exchange with higher accuracy than before. The method can be thereby used in design and development of financial products such as options where the risk is estimated more appropriately compared to the prior art methods.
A description of the program for predicting a probability density distribution of values of future time-series data according to a fourth embodiment of the present invention will now be provided.
The computation unit 2a reads in, from the memory unit 1 via the bus 4, the parameters and time-series data calculated by the program 20, and the program 40. The computation unit 2 executes the read-in program 40 and performs prediction of the probability density distribution of values of future time-series data. The program 40 includes a parameter reading unit 41, a data reading unit 42, a calculating unit 43 and a count unit 44.
Specifically, the parameter reading unit 41 reads in the parameters (b(j)i(t), M(j)(t), f(j)P(t)) of the j-th particle at time (t) calculated by the program 20. In other words, the parameter reading unit 41 performs processing corresponding to the step S31 of
The data reading unit 42 reads in the true market price P(j)(t) at time (t) calculated by the program 20. In other words, the data reading unit 42 performs processing corresponding to the step S32 of
The calculating unit 43 performs processing corresponding to the step S33 of
The count unit 44 performs processing corresponding to the steps S34 and S35 of
The display unit 3 displays the probability density distribution of the market price at time (t+N) output from the computation unit 2, on a liquid crystal display screen for example, in a visible state. At such a time, the probability density function of the market price from time (t) to time (t+N) may be sequentially displayed on the display device.
As described above, according to this embodiment, the prediction program and the prediction device for predicting the probability density distribution of values of future time-series data according to the fourth embodiment can be implemented in a specific manner.
A description of another method for predicting a probability density distribution of values of future time-series data according to a fifth embodiment of the present invention will now be provided. In the third embodiment, it is described that values of the future market price can be obtained as the probability density distribution by the advancement of time of the PUCK model. Specifically, the advancement of time is performed using the potential coefficient b(t) at time (t), which is the starting point of the time advancement. Note that, in this embodiment, the order i of the potential and the number j of the particle are not illustrated.
In the third embodiment, the future is predicted on the assumption that the value of the potential coefficient b(t) at time (t) remains the same during the period from the future time (t+1) to (t+N). Accordingly, in the third embodiment, it is only possible to predict the changes over time of short-term price fluctuations. On the other hand, if an equation for the time advancement of the potential coefficient b(t) can be obtained as described in this embodiment, the changes of the potential coefficient b(t) from the future time (t+1) to (t+N) can be predicted. This makes it possible to obtain a more accurate price distribution at the future time (t+N). It is therefore important to estimate an equation for the time advancement of the potential coefficient b(t) when predicting the changes over time of long-term price fluctuations. For example, the relationship of the potential coefficient b(t+1) at time (t+1) with the potential coefficient b(t) at time (t) is defined by the following Equation (38).
where the function G(b) in the first term (partial differential term) on the right hand side is a function that describes the time advancement of the potential coefficient to time (t), and fb(t) is a noise term.
When the first term (partial differential term) on the right hand side excluding the symbol “-” is a potential λ acting on the changes over time of the potential coefficient, the Equation (38) can be transformed into the Equation (39).
[Equation 41]
b(t+1)−b(t)=−λb(t)+ƒb(t) (39)
From the Equation (39), the potential coefficient b(t+1) at time (t+1) is represented by the following Equation (40).
[Equation 42]
b(t+1)=(1−λ)b(t)+ƒb(t) (40)
As represented in the Equation (40), the potential coefficient b(t+1) at time (t+1) is obtained by multiplying the potential coefficient b(t) at time (t) by the coefficient (1−λ) and adding the noise term fb(t).
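Iterating the Equation (40) can be sketched as follows. The Gaussian form of the noise term fb(t), the function name, and the example values are assumptions made for this illustration. With the noise set to zero, repeated application gives b(t+N) = (1−λ)^N b(t), so for 0 < λ < 1 the coefficient decays toward zero.

```python
import random


def advance_coefficient(b_t, lam, noise_sd, n_steps, rng):
    """Iterate b(t+1) = (1 - lam) * b(t) + f_b(t) for n_steps ticks.

    Returns the whole path [b(t), b(t+1), ..., b(t+n_steps)].
    """
    path = [b_t]
    for _ in range(n_steps):
        f_b = rng.gauss(0.0, noise_sd)       # assumption: Gaussian noise term
        path.append((1.0 - lam) * path[-1] + f_b)
    return path


# Hypothetical usage: noiseless decay, then a noisy path with the same lambda.
rng = random.Random(4)
noiseless = advance_coefficient(1.0, 0.5, 0.0, 3, rng)
noisy = advance_coefficient(1.0, 0.5, 0.05, 50, rng)
print(noiseless)                              # -> [1.0, 0.5, 0.25, 0.125]
print(f"b after 50 noisy steps: {noisy[-1]:.3f}")
```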
Prediction of the market price distribution in the case where the secondary potential is acting, for example, is described using the potential coefficient represented by the Equation (40). Fluctuations of the true market price during one tick when the secondary potential is acting can be represented by the following Equation (41) based on the Equations (2), (4) and (5).
In the step S51, the potential coefficient b(t+1) is calculated using the Equation (40). Next, in the step S52, the Equation (41) is calculated using the prepared parameters and the true market price P(j)(t) at time (t). The other steps are the same as those of
Because prediction of the long-term trend is given as the market price distribution, it is possible to estimate the risk of price fluctuations at a certain point in the future. Therefore, according to this embodiment, it is possible to estimate the risk of price fluctuations on currency and stock exchange with higher accuracy than in the third embodiment. The method can thereby be used in design and development of financial products such as options where the risk is estimated more appropriately. Further, it is possible to improve an index of risk management such as Value at Risk used by many financial institutions to measure the risk of their assets.
Note that the advancement of time of the potential coefficient described in this embodiment is given by way of illustration only. Accordingly, the advancement of time of the potential coefficient is not limited to the example represented by the Equation (40).
A description of the program for predicting a probability density distribution of values of future time-series data according to a sixth embodiment of the present invention will now be provided.
The program 60 has a configuration in which the calculating unit 43 of the program 40 is replaced by a calculating unit 62, and further a potential calculating unit 61 is added. The potential calculating unit 61 performs processing corresponding to the step S51 in
As described above, according to this embodiment, the prediction program and the prediction device for predicting the probability density distribution of values of future time-series data according to the fifth embodiment can be implemented in a specific manner.
The present invention is not to be limited to the above embodiments, and can be variously modified within a scope that does not depart from the gist thereof. For example, the present invention can be applied to time-series data involving the fluctuations arising from devices in the process of manufacturing semiconductors, whereby it is possible to measure in real-time whether the treatment process is proceeding stably, and to rapidly detect abnormalities when for any reason instability occurs. In such a case, for example, the market prices can be substituted with the management and measurement data outputted by a device, and a similar analysis can be performed.
In the embodiment described above, as illustrated in the Equation (4), the potential coefficient b(t) at time (t) is described as a coefficient of a secondary and higher-order potential W(q, t). For example, actual financial markets or the like are known to have even more sudden changes in trends that cannot be expressed by a secondary potential. Therefore, the potential U(q, t) at time (t) is not limited to being secondary, but rather can be introduced in the form of a function including a higher-order potential term, and the coefficient of the higher-order potential term can be estimated to follow the sudden changes in trends. In particular, the introduction of a tertiary potential makes it possible to track a boom or collapse in a financial market more rapidly than with a secondary potential. It is also possible to further analyze the directionality of such a boom (that is, upward trend) or collapse (that is, downward trend).
Also, estimating a simultaneous potential distribution or the like of b(t) and M(t) at time (t) using the secondary potential model makes it possible to estimate a different dealer strategy for plural time scales (a multi-scale PUCK model). Further, estimating a third- and higher-order potential coefficient allows for the visualization of strategies for dealers following trends and of collective behavior causing one-sided fluctuations in pricing (the prospect of a multi-scale PUCK model including a higher-order potential).
This application is based upon and claims the benefit of priority from Japanese patent application No. 2011-196512 filed on Sep. 8, 2011 and Japanese patent application No. 2012-50819 filed on Mar. 7, 2012, the disclosure of which is incorporated herein in its entirety by reference.
The present invention is applicable to analysis or prediction of fluctuations over time in the market price in exchange markets and stock markets, or analysis or prediction of fluctuations over time in other time-series data such as management or measurement data.
Number | Date | Country | Kind |
---|---|---|---|
2011-196512 | Sep 2011 | JP | national |
2012-050819 | Mar 2012 | JP | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2012/005697 | 9/7/2012 | WO | 00 | 3/7/2014 |