This application claims priority to Japanese Patent Application No. 2011-244834, filed Nov. 8, 2011, and all the benefits accruing therefrom under 35 U.S.C. §119, the contents of which are herein incorporated by reference in their entirety.
The present invention relates to analytical technology for time series data, and more particularly to selecting an optimal time lag and time window for each variable in a time series prediction problem.
Generally, a multidimensional time series prediction problem (including regression problems and class identification problems) is a problem of predicting the next value of a target variable time series from D types of explanatory variable time series. Specific examples include predicting a stock price from various economic indices, predicting climate change and weather from various meteorological data, and predicting the failure of mechanical systems from various sensor data. When solving such a multidimensional time series prediction problem, it is necessary to set an optimal time lag and time window for each explanatory variable time series. Here, the time lag L refers to the time delay until a given explanatory variable has an impact on the target variable, and the time window W refers to the length of the period during which that explanatory variable has an impact on the target variable. In an actual target system there exists a complex causality between the explanatory variables and the target variable; specifically, the impact size, time delay (time lag), and impact width (time window) differ according to the explanatory variable. For example, for the Japan Nikkei Average, the New York Dow has an immediate (short time lag) and sharp (short time window) impact, whereas a drop in domestic consumer sentiment has a delayed (long time lag) and protracted (long time window) impact.
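To make the roles of the time lag L and time window W concrete, the following is a minimal sketch, not part of the embodiments: a feature that averages an explanatory variable over a window of width W ending L steps in the past. The function name and array layout are illustrative assumptions.

```python
import numpy as np

def lagged_window_mean(x, lag, window):
    """At each time t, average x[t-lag-window+1 .. t-lag]:
    the explanatory variable's impact is delayed by `lag` steps
    and spread over `window` steps."""
    x = np.asarray(x, dtype=float)
    out = np.full(len(x), np.nan)        # NaN where the window is incomplete
    for t in range(lag + window - 1, len(x)):
        out[t] = x[t - lag - window + 1 : t - lag + 1].mean()
    return out

x = np.arange(10.0)                       # 0, 1, ..., 9
f = lagged_window_mean(x, lag=2, window=3)
# at t = 9 this averages x[5], x[6], x[7]
```

Choosing a different (lag, window) pair per explanatory variable is exactly the combinatorial selection problem the embodiments address.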
Statistical approaches have conventionally been applied to such time series prediction problems. In the field of statistics, there is a long history of research on AR (autoregressive) models in the one-dimensional case and on VAR (vector autoregressive) models in the multidimensional case. In the multidimensional case, however, the central technique is examination of the lag order of the model, and beyond a few dimensions the reliability of the method declines greatly. Machine learning approaches have also been tried. In the field of machine learning, the mainstream is the sliding window method for handling the time lag and time window. In many cases, all of the explanatory variables are handled with an identical time lag and time window, which gives unsuitable results when the explanatory variables exert a diversity of impacts (that is, when the time lag and time window differ for each explanatory variable). In addition, only one of either the lag or the window is typically adjusted in order to reduce calculation volume, and this complicates discovery of an optimal combination. The following patent literature can be offered as literature on the subject.
In one embodiment, a method includes selecting, with a computer, a time lag that is the time delay until an explanatory variable time series applies an impact on a target variable time series, and a time window that is the time period during which the explanatory variable time series applies the impact on the target variable time series; converting the explanatory variable time series to a cumulative value time series composed of the cumulative values of each variable from each time point up to a certain finite time; and solving the cumulative value time series as an optimization problem that introduces a regularization term, to obtain the value of the time lag and the value of the time window from the solved weights.
In another embodiment, a computer program product includes a computer readable storage medium having computer readable code stored thereon that, when executed by a computer, implements a method. The method includes selecting, with the computer, a time lag that is the time delay until an explanatory variable time series applies an impact on a target variable time series, and a time window that is the time period during which the explanatory variable time series applies the impact on the target variable time series; converting the explanatory variable time series to a cumulative value time series composed of the cumulative values of each variable from each time point up to a certain finite time; and solving the cumulative value time series as an optimization problem that introduces a regularization term, to obtain the value of the time lag and the value of the time window from the solved weights.
In another embodiment, a system includes a computer configured to select a time lag that is the time delay until an explanatory variable time series applies an impact on a target variable time series, and a time window that is the time period during which the explanatory variable time series applies the impact on the target variable time series. The computer is configured to convert the explanatory variable time series to a cumulative value time series composed of the cumulative values of each variable from each time point up to a certain finite time, and to solve the cumulative value time series as an optimization problem that introduces a regularization term, to obtain the value of the time lag and the value of the time window from the solved weights.
The statistical approach and the machine learning approach have been problematic for reliable and efficient handling of multidimensional time series prediction problems. Accordingly, the present invention embodiments provide a time series data analysis method, system and computer program capable of structuring a more accurate prediction model by reliably and efficiently finding a time lag and time window that differ for each explanatory variable within a multidimensional time series prediction problem. Specifically, the invention embodiments include a method that selects a time lag that is a time delay until an explanatory variable time series applies an impact on a target variable time series, and a time window that is the time period during which the explanatory variable time series applies the impact on the target variable time series; converts the explanatory variable time series to a cumulative value time series composed of the cumulative values of the variables from each time point up to a certain finite time; and solves the cumulative value time series as an optimization problem that introduces a regularization term, to obtain the value of the time lag and the value of the time window from the obtained weights.
Advantageously, the embodiments provide the ability to reliably and efficiently find a time lag and time window that differ for each explanatory variable in a multidimensional time series prediction problem.
The software configuration within computer 1 provides an operating system (OS) that supplies basic functions, application software that utilizes the functions of the OS, and driver software for the input-output devices. Each of these software components is loaded into RAM 12 along with various data and is executed by CPU 11, whereby computer 1 executes the processing shown in
An optimal lag and window are simultaneously selected by the introduction of regularization (S2). First, the prediction problem composed of D(N+M) cumulative value series explanatory variables and a single target variable is reduced to an optimization problem for an objective function, and a regularization term is introduced into the objective function (S21). The effect of the regularization term is to drive the weights of explanatory variables toward zero (sparseness) and to stabilize model construction. In this implementation mode, an L1 regularization term is introduced, which has a strong effect of making the weights of unneeded variables exactly zero. Specifically, when x_i is the explanatory variable vector, y_i is the value of the target variable, and beta is the model, the output of the model is f(x_i, beta), and seeking the beta that minimizes the following objective function reduces the problem to an optimization problem. This corresponds to seeking the model that minimizes the prediction error.
Sigma_i (y_i − f(x_i, beta))^2
Then, by introducing a regularization term (for example, the L1 regularization term) in order to prevent the model from becoming overly complex (in this case, an increase in the number of nonzero components), the following objective function is obtained. Here, |beta| is the sum of the absolute values of each element of beta.
Sigma_i (y_i − f(x_i, beta))^2 + lambda*|beta|
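As an illustrative sketch (not the embodiment's code), the objective function above can be written directly, taking f to be linear, f(x_i, beta) = x_i · beta; the variable names are assumptions:

```python
import numpy as np

def l1_objective(beta, X, y, lam):
    """Sum of squared prediction errors plus lambda times
    the sum of absolute values of beta (the L1 term)."""
    resid = y - X @ beta
    return np.sum(resid ** 2) + lam * np.sum(np.abs(beta))

# tiny illustration: the L1 term penalizes nonzero weights
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
exact = np.array([1.0, 2.0])                # zero residual on this data
obj0 = l1_objective(exact, X, y, lam=0.0)   # prediction error only
obj1 = l1_objective(exact, X, y, lam=1.0)   # adds |1.0| + |2.0| = 3.0
```

With lambda = 0 the objective is pure prediction error; a positive lambda trades error against the size of the weights.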
Subsequently, the complexity of the model to be obtained is controlled by adjusting the regularization parameter (S22). At this point, it is expected that only the weights of the few cumulative value series explanatory variables derived from the original explanatory variables required for prediction will become nonzero, while all weights derived from original explanatory variables not needed for prediction will become zero.
Specifically, in the above equation, lambda is the regularization parameter, and by adjusting its value (lambda >= 0), it is possible to minimize the total of the prediction error plus lambda*|beta|. It is generally known that, as lambda becomes greater, the prediction error rises while |beta| becomes smaller (reducing both the number and the size of the nonzero elements).
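This behavior of the regularization parameter can be sketched with scikit-learn's Lasso, as a stand-in illustration rather than the embodiment's solver. Note that scikit-learn scales the squared-error term by 1/(2n), so its alpha corresponds to lambda only up to that scaling; the synthetic data below is an assumption for demonstration.

```python
import numpy as np
from sklearn.linear_model import Lasso

# synthetic data: only the first two of ten variables matter
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
beta_true = np.zeros(10)
beta_true[0], beta_true[1] = 2.0, -1.0
y = X @ beta_true + rng.normal(scale=0.1, size=200)

# larger regularization -> fewer nonzero weights (sparser model)
nonzeros = {}
for alpha in (0.001, 0.1, 10.0):
    model = Lasso(alpha=alpha).fit(X, y)
    nonzeros[alpha] = int(np.sum(model.coef_ != 0))
```

At a small alpha nearly all weights can be nonzero; at a moderate alpha only the informative variables survive; at a sufficiently large alpha every weight is driven to zero.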
Then, the complexity of the model is adjusted until the number of cumulative value series explanatory variables with nonzero weight becomes two (S23), and by making the number of cumulative value series explanatory variables with nonzero weight two, this can be interpreted as simultaneously selecting an optimal L and W (S24). Furthermore, for convenience at this point, it is assumed that an optimal time window and time lag exist for all the explanatory variables and can be expressed by the weights of two or more nonzero cumulative series explanatory variables. On the other hand, it is also assumed that in an actual model there exist noise variables whose time window and time lag hold no significance for prediction, and the weights of these are all made 0. In this case, it is evident that at S23 of
Specifically, when cumulative series explanatory variables c_t^g1 and c_t^g2 (g1 < g2) with nonzero weight have been obtained, the optimal values are L = g1 and W = g2 − g1 (refer to
c′_t = {x_(t−5)+x_(t−6)+ . . . +x_(t−20)} − {x_(t−15)+x_(t−16)+ . . . +x_(t−20)}
= {x_(t−5)+x_(t−6)+ . . . +x_(t−14)}
This is equivalent to a lag L = 5 and a window width W = 10, and this combination can be interpreted as being selected as the optimal set of values.
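The identity above can be checked numerically. The helper below is an illustrative sketch, not the embodiment's implementation, assuming cumulative values taken from each gap g back to the horizon N+M = 20:

```python
import numpy as np

def cumulative_series(x, g, horizon=20):
    """c_t^g = x_(t-g) + x_(t-g-1) + ... + x_(t-horizon)."""
    out = np.full(len(x), np.nan)
    for t in range(horizon, len(x)):
        out[t] = x[t - horizon : t - g + 1].sum()
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=100)

c5 = cumulative_series(x, 5)     # weight +1
c15 = cumulative_series(x, 15)   # weight -1
diff = c5 - c15                  # should equal x_(t-5) + ... + x_(t-14)

t = 50
direct = x[t - 14 : t - 4].sum() # lag L = 5, window W = 10
```

The overlapping tail terms x_(t−15) through x_(t−20) cancel, leaving exactly the lag-5, width-10 window sum.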
Solving a multidimensional time series problem in this way offers the following advantages. Specifically, compared to simply combining every pairing of differing time lags and differing time windows and preparing N*M types of conversion series for each explanatory variable (D*M*N variables in total), completing the conversion series with only D(N+M) types makes the calculation efficient and the resulting model stable. In addition, compared to fixing all explanatory variables at the same time lag and same time window, the expressive power is greater, and a model with higher precision, closer to the true model, can be expected without the variables becoming too many or the calculation becoming unstable. Furthermore, the instability of the model calculation caused by multicollinearity, which remains when only the cumulative series conversion is applied, is further mitigated by the regularization. Moreover, by adjusting the strength of the regularization with the regularization parameter, the weights of variables unneeded for prediction are suppressed and the proportion of nonzero elements among the weights of the cumulative series variables is adjusted, which makes it possible to change the complexity of the expressible model.
To this point, the case of selecting a single lag and window width, as in the sliding window method, has been considered; however, more complex fluctuations of temporal impact can also be expressed by adjusting the regularization parameter (S22) so that the number of nonzero weights of the cumulative series variables becomes three or more at (S23). For example, suppose the weight of c_t^5 (gap g=5) becomes 2.0, the weight of c_t^10 (gap g=10) becomes −1.0, and the weight of c_t^15 (gap g=15) becomes −1.0, with N+M=20. By weighting and summing these cumulative series, the following value c′_t is obtained.
c′_t = 2*{x_(t−5)+ . . . +x_(t−20)} − {x_(t−10)+ . . . +x_(t−20)} − {x_(t−15)+ . . . +x_(t−20)}
= 2*{x_(t−5)+ . . . +x_(t−9)} + {x_(t−10)+ . . . +x_(t−14)}
This is equivalent to a lag L = 5 and a window width W = 10, and can be interpreted as attaching double the weight to the front half of the window in comparison to the latter half.
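The same numerical check applies to the three-weight combination above. Again the helper is an illustrative sketch, assuming cumulative values to a horizon of N+M = 20:

```python
import numpy as np

def cumulative_series(x, g, horizon=20):
    """c_t^g = x_(t-g) + ... + x_(t-horizon)."""
    out = np.full(len(x), np.nan)
    for t in range(horizon, len(x)):
        out[t] = x[t - horizon : t - g + 1].sum()
    return out

rng = np.random.default_rng(1)
x = rng.normal(size=100)

# weights 2.0, -1.0, -1.0 on gaps 5, 10, 15 as in the example
combo = (2.0 * cumulative_series(x, 5)
         - 1.0 * cumulative_series(x, 10)
         - 1.0 * cumulative_series(x, 15))

t = 60
front = x[t - 9 : t - 4].sum()    # x_(t-5) .. x_(t-9): net weight 2
latter = x[t - 14 : t - 9].sum()  # x_(t-10) .. x_(t-14): net weight 1
```

The tail terms x_(t−15) through x_(t−20) receive net weight 2 − 1 − 1 = 0 and cancel, leaving the unevenly weighted window.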
The following section describes an example of an experiment that verifies the effect of this implementation mode, as illustrated in
Experiment Settings: The settings for the experiment were as follows.
x_a = sin(2x) + e
x_b = cos(x) + e
wherein e ~ N(0, 0.5^2)
True Regression Model: y = 1.3*sw(x_a, 5, 2) − 0.7*sw(x_b, 2, 8) + e
Function sw(x, l, w): moving average over a sliding window with lag l and window width w
Lag l = {0, 1, 2, 3, 4, 5}
Window Width w = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
Existing Method:
Calculates conversion series for all combinations of candidate lag and window width
Applies LARS (least angle regression) for L1 regularized linear regression
Proposed Method (Implementation Mode):
Calculates cumulative conversion series up to the maximum candidate lag plus the maximum window width
Applies LARS (least angle regression) for L1 regularized linear regression
Model Selection: The regularization parameter is selected to minimize the Cp statistic
Training Data: 50,000 samples
Compared coefficient weights of the true model and the estimated model
Compared the prediction accuracy on the test data and the reduction effect on calculation time
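A rough end-to-end sketch of the proposed method's pipeline under these settings is shown below. This is not the embodiment's code: the sampling grid for x, the horizon G, and the use of scikit-learn's LassoLarsIC (AIC-based selection, as a stand-in for the Cp-minimum criterion) are all illustrative assumptions, and the sample size is reduced from 50,000 for brevity.

```python
import numpy as np
from sklearn.linear_model import LassoLarsIC

rng = np.random.default_rng(42)
n = 4000
grid = np.arange(n) * 0.01                    # assumed sampling grid for x

x_a = np.sin(2 * grid) + rng.normal(scale=0.5, size=n)
x_b = np.cos(grid) + rng.normal(scale=0.5, size=n)

def sw(x, lag, window):
    """Moving average over a sliding window with the given lag and width."""
    out = np.zeros(len(x))
    for i in range(lag + window, len(x)):
        out[i] = x[i - lag - window + 1 : i - lag + 1].mean()
    return out

# true regression model from the experiment settings
y = 1.3 * sw(x_a, 5, 2) - 0.7 * sw(x_b, 2, 8) + rng.normal(scale=0.5, size=n)

def cumulative_features(x, G):
    """One column per gap g: c_t^g = x_(t-g) + ... + x_(t-G)."""
    cols = np.zeros((len(x), G + 1))
    for g in range(G + 1):
        for i in range(G, len(x)):
            cols[i, g] = x[i - G : i - g + 1].sum()
    return cols

G = 15                                        # max candidate lag (5) + max width (10)
X = np.hstack([cumulative_features(x_a, G), cumulative_features(x_b, G)])

keep = np.arange(n) >= G + 10                 # drop warm-up rows
model = LassoLarsIC(criterion='aic').fit(X[keep], y[keep])
r2 = model.score(X[keep], y[keep])
```

Because sw(x_a, 5, 2) equals (c^5 − c^7)/2 and sw(x_b, 2, 8) equals (c^2 − c^10)/8 in this cumulative basis, the true model is exactly representable with 2(G+1) = 32 features rather than D*M*N combined series.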
While the disclosure has been described with reference to an exemplary embodiment or embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the scope of the disclosure. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the disclosure without departing from the essential scope thereof. Therefore, it is intended that the disclosure not be limited to the particular embodiment disclosed as the best mode contemplated for carrying out this disclosure, but that the disclosure will include all embodiments falling within the scope of the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
2011-244834 | Nov 2011 | JP | national |