The present invention relates to a data processing system and a data processing method, and is suitable for application to, for example, a data processing system and a data processing method for performing prediction using a prediction model.
In business fields such as the energy business (including the electric power business, the gas business, and the like), the communication business, and the transportation business (including taxi service, delivery service, and the like), a future demand amount, a settlement price, and the like are predicted in order to perform facility operation, resource allocation, and the like in accordance with consumer demand.
For example, in order to plan supply with respect to a demand for electric power which fluctuates by day or hour, prediction for a value (amount of electric power to be consumed) of the demand at a specified time such as one hour, two hours, three hours, the next day, one week, one month or one year in the future, and prediction for a value of an amount of electric power to be generated by a wind power generator, a solar power generator, or the like are performed.
In analysis and/or prediction of phenomena of energy such as electric power and gas, an error may occur. Therefore, an analysis limit is assumed, and reduction of an error in analysis and/or prediction is performed.
As a device that predicts a demand with a higher accuracy, a demand predicting device has been disclosed, including: a first prediction determination unit that determines first prediction data indicating a predicted value of a demand, based on forecast data containing a predicted value of predetermined information and based on first record data containing a result value of the demand; and a second prediction determination unit that, when the first prediction data satisfies a predetermined condition, determines second prediction data indicating another predicted value of the demand, based on the first record data and based on second record data containing a record value of the predetermined information (see JP-A-2019-117601).
Here, accuracy of prediction increases in the order of multiple regression prediction, Bayes optimal prediction using a decision tree model, and prediction using Gaussian process regression, which reproduces a Gaussian process derived from a probability function. Further, as the accuracy of the incorporated probability model increases, the accuracy of prediction increases.
However, in a case where prediction using Gaussian process regression is adopted in the demand predicting device described in PTL 1, memory consumption grows with the square of the number of samples N used to derive a probability model. There is therefore no choice but to shorten the sampling period, and sample data for rare frequency events (temperature specific day, power generation plan stop, fuel transportation surplus, and the like) may be lost.
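As a rough illustration of this quadratic growth, the following sketch estimates the memory needed for a dense N×N kernel (covariance) matrix. The function name and the assumption of 8-byte floating-point values in a dense matrix are hypothetical choices for the example and are not taken from PTL 1.

```python
def kernel_matrix_bytes(n_samples: int, bytes_per_value: int = 8) -> int:
    """Bytes needed for a dense n x n kernel matrix (assumed storage layout)."""
    return n_samples * n_samples * bytes_per_value


# Doubling the number of samples quadruples the memory requirement.
for n in (1_000, 8_000, 50_000):
    gib = kernel_matrix_bytes(n) / 2**30
    print(f"N={n:>6}: {gib:8.2f} GiB")
```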
The invention has been made in view of the above circumstances, and an object of the invention is to propose a data processing system or the like that can appropriately determine data to be used for identification of a prediction model.
In order to solve such a problem, the invention provides a data processing system that performs prediction using a prediction model and that includes: a selection unit configured to select data to be used for identification of a prediction model from a storage unit that stores data; and a processing unit configured to identify the prediction model by using the data selected by the selection unit. The selection unit selects, from the storage unit, predetermined first data, and second data of a type and/or condition different from the first data, based on a branch condition of structure data of a structural prediction model.
With the above configuration, for example, the predetermined first data, and the second data of a type and/or condition different from the first data, are used for identification of the prediction model, and highly accurate prediction which incorporates a causal relationship that is lacking in the predetermined first data is realized. With the above configuration, since it is possible to avoid a situation where data of a rare frequency event is omitted from the data used for identification of the prediction model, it is possible to reduce consumption of the memory and improve accuracy of prediction, for example, by adopting a prediction model using the kernel function to shorten a sampling period.
According to the invention, it is possible to appropriately determine data used for identification of a prediction model.
Hereinafter, an embodiment of the invention will be described in detail with reference to the drawings. The present embodiment relates to a technique for predicting data. A configuration shown in the present embodiment is suitable for application to an operation support system for energy such as electric power, gas, or fuel.
For example, a system according to the present embodiment is a system that can analyze and/or predict a model (regression equation, self-regression equation, mapping, probability map) between data of prediction target and data of explanatory variable. More specifically, the system performs prediction, and includes a structure analysis unit that uses a structural prediction model to predict (classify), with an explanatory variable (or predictor, input data), data of a prediction target (or prediction output, prediction value, prediction data, output data), a first prediction unit that performs prediction based on a prediction model (explanatory variable and regression, or mathematical formula), and a determination unit that determines, based on output from the structure analysis unit, an index, such as a type of the explanatory variable and a period and location added to the explanatory variable, which are to be transferred to the first prediction unit.
Specifically, the structural prediction model is a network structure, and more specifically, a tree structure. Specifically, the prediction model is a prediction model using a kernel function, and more specifically, a prediction model using Gaussian process regression.
The data processing system 100 illustrated in
Here, a purpose of data processing is to analyze the quantitative relationship behind data referred to as input and output; to statistically estimate, regress, and restore that relationship; to identify the structure of the relationship; and to estimate output data paired with new input data based on the relationship. In general, when the output data is a value for a future time point, estimation of the output data is referred to as prediction. Unless otherwise limited, the processing may be referred to as estimation as well as prediction.
Based on a prediction result, a power company can manage power supply and demand smoothly. The power company can accurately formulate and execute a generator operation plan for its own facilities. In addition, the power company can accurately formulate and execute a power procurement transaction plan that entrusts power generation to other power companies.
The data processing system 100 includes a data analysis and prediction system 110, an information input/output terminal 120, a plan execution management device 130, a data observation device 140, and a data distribution device 150. The data analysis and prediction system 110, the information input/output terminal 120, the plan execution management device 130, the data observation device 140, and the data distribution device 150 are communicably connected via a communication path 101.
The communication path 101 is, for example, a local area network (LAN) or a wide area network (WAN). Alternatively, the communication path 101 may be another form as long as various devices and terminals constituting the data processing system 100 can be communicably connected to each other.
The data analysis and prediction system 110 includes a data storage device 111 and an analysis prediction calculation device 112.
The data storage device 111 can store data that constitutes an input and data of a prediction target which constitutes an output. The data constituting an input is observation data, distribution data, index data for the data, and the like.
The data storage device 111 provides data for processing of analyzing a relationship between input data and output data and/or processing of estimating (or predicting) an output. Input data and output data provided for processing of analysis and/or estimation, or data to be recorded in preparation for provision to processing is referred to as “sample data”.
The data storage device 111 has a configuration in which a setting input including a storage range of the sample data can be received from the information input/output terminal 120. Data stored or output by the data analysis and prediction system 110 can also be displayed on the information input/output terminal 120.
As will be described later with reference to
The information input/output terminal 120 has a function of inputting settings to the data storage device 111, the analysis prediction calculation device 112, and the plan execution management device 130.
Based on the output calculated by the analysis prediction calculation device 112, the plan execution management device 130 generates and executes a physical facility operation plan for achieving a predetermined target. Here, in the energy field, the physical facility operation plan is, for example, a generator operation plan that satisfies a predicted future energy demand value, or satisfies an energy demand plan value which is generated based on the predicted future energy demand value. The operation plan may include a plan value of a power generation amount to be entrusted to a generator of another power company.
The data observation device 140 periodically measures a prediction target (not shown) and transmits measurement data to at least one of the data storage device 111 and the analysis prediction calculation device 112. The measurement data includes data of a measuring instrument for measuring power consumption, data of a power generation end meter which is a power generation amount of a generator connected to a power transmission line, data of a power generation market settlement price, and the like.
The data distribution device 150 receives data from the outside of the data processing system 100, and transmits the data to at least one of the data storage device 111 and the analysis prediction calculation device 112. In order to receive data, the data distribution device 150 is connected to at least one of the following devices, none of which are illustrated: a weather observation device and a numerical weather forecasting device, a weather measuring device disposed on a power transmission line (which measures weather data of temperature and water vapor content), a current measuring device for a power transmission line, a management device for a large demand facility, a management device for a power transaction market, a management device for a fuel transaction market, a management device for a charter business, a management device for a railroad business facility, and a management device for a communication business facility. The weather observation device and the numerical weather forecasting device may be installed in a weather organization such as a weather company or a meteorological agency.
The data distribution device 150 receives at least one type of past weather record data, numerical weather forecast data, power transmission current data, operating data of a large demand facility, power transaction data, fuel transaction data, operating data of a charter for fuel transportation or the like, operating data for a railway business, and operating data of a communication business facility.
Further, the data distribution device 150 is connected to a data distribution device in a police station, a fire station, or a news medium such as a newspaper company, and receives data of events such as disasters, accidents, and amusement that are transmitted from these institutions.
The prediction target (output) of the data processing system 100 includes, for example, energy consumption data for power, gas, water and the like, data of energy production amount by solar power generation, wind power generation and the like, and, as an example, a transaction amount of energy and a power generation market settlement price that are traded at Japan Electric Power Exchange (JEPX).
Examples of the input include weather data such as temperature, humidity, solar radiation amount, wind speed, and atmospheric pressure; calendar data such as the date, the day of the week, and an arbitrarily set flag value indicating the type of day; and data indicating the presence or absence of an unexpected incident such as a typhoon or an event.
In addition to these, the input also includes: data indicating an economic situation including the number of energy consumers, industrial trends, business condition indexes, and the like; data indicating the movement of people, moving bodies, and the like, such as vehicle occupancy, the number of vehicle passengers, the number of booked seats of a limited express train, or a road traffic condition; and data of free on board (FOB) prices, delivered ex ship (DES) prices, forward expiration month prices and the like for fuels such as crude oil, natural gas, and petroleum.
The data storage device 111 includes a central processing unit (CPU) 211, an input device 212, an output device 213, a communication device 214, and a storage device 215. The data storage device 111 is, for example, a data processing device such as a personal computer, a server computer, or a handheld computer.
The CPU 211 integrally controls operations of the data storage device 111. The input device 212 is a keyboard, a mouse, or the like. The output device 213 is a display, a printer, or the like. The communication device 214 includes a network interface card (NIC) for connecting to a wireless LAN or a wired LAN. The storage device 215 is a storage medium such as a random access memory (RAM), a read only memory (ROM), or a hard disk drive. The data storage device 111 may appropriately output an output result and an intermediate result of each processing unit via the output device 213.
In the storage device 215, databases of an observation data storage unit 221 and a distribution data storage unit 222 are stored.
In the observation data storage unit 221, measurement data y of the prediction target, which is periodically measured and received from the data observation device 140, is held together with an index t for searching (t is a vector when a plurality of pieces of information are to be indexed), such as the time point and the location at which the value of the measurement data y is observed. This held data is referred to as an output y(t).
In the distribution data storage unit 222, data received from the data distribution device 150, such as past weather record data, numerical weather forecast data, power transmission current data, operating data of a large demand facility, power transaction data, fuel transaction data, operating data of a charter for fuel transportation or the like, operating data for a railway business, and operating data of a communication business facility, is held with indexes t added for searching by name, generation time point, generation location, and the like. This held data is referred to as an input x(t). In particular, when x(t) is data of future prediction such as numerical weather forecast data, it may be referred to as “prediction input x*(t)” or simply as input x*(t).
The data analysis and prediction system 110 holds a record value y of the prediction target in the observation data storage unit 221, and outputs estimated data y*, which is a future value of the prediction target. The record value y of the prediction target is, for example, an output of a power demand measuring system for a power transmission line of the Kanto area, an output of a system that determines a sum of measuring instruments for a designated customer, and an output of a power generation market settlement price determining system. Since the data y* corresponds to an output of a device and a system in the background of the prediction target, the data y* may be referred to as data of output, output data, or simply output. The area includes a plurality of areas, such as Kanto area, Kansai area, and Hokkaido area. The data analysis and prediction system 110 can hold a record value y of the prediction target for each area, and perform processing of outputting the estimated data y* which is a future value.
The analysis prediction calculation device 112 includes a CPU 231, an input device 232, an output device 233, a communication device 234, and a storage device 235. The analysis prediction calculation device 112 is, for example, a data processing device such as a personal computer, a server computer, or a handheld computer. The CPU 231, the input device 232, the output device 233, the communication device 234, and the storage device 235 are basically the same as the CPU 211, the input device 212, the output device 213, the communication device 214, and the storage device 215.
In the storage device 235, various computer programs for a decision tree model generation unit 241, a data selection ordinal number calculation unit 242, a data and index selection unit 243, a selected data transfer processing unit 244, a prediction model identification unit 245, and a first prediction processing unit 246 are stored.
In addition, a computer program for an error evaluation unit 247 may be stored in the storage device 235. For example, feedback from the error evaluation unit 247 to the data and index selection unit 243 is performed.
In addition, various computer programs for a second prediction processing unit 248 and a superimposition processing unit 249 may be stored in the storage device 235. According to the second prediction processing unit 248 and the superimposition processing unit 249, for example, an output of prediction using n pieces of data for the whole year is compared with an output of precision prediction (prediction unit) in a short-term model using the latest n′ (<n) pieces of data, and if a gap therebetween is large, it can be detected that folding is insufficient in the short-term model.
The analysis prediction calculation device 112 may appropriately output an output result, an intermediate result and the like of each processing unit via the output device 233.
Processing and a data flow of the data analysis and prediction system 110 will be described with reference to
The data storage device 111 receives, from data distribution device 150, data of “input x” and/or data of “input x*” that is a prediction value for an input, and stores the data in the distribution data storage unit 222. The data storage device 111 receives data of “output y” from the data observation device 140 and stores the data in the observation data storage unit 221.
In the decision tree model generation unit 241, the analysis prediction calculation device 112 generates a decision tree model based on data in the observation data storage unit 221 and data in the distribution data storage unit 222. The decision tree model is a method of automatically extracting meaningful data classification rules such as regularity and relevance from a large amount of data.
The decision tree model generation unit 241 generates a decision tree model in which a classification target is taken as a discrete value. First, the decision tree model generation unit 241 collects the data of the prediction target in the observation data storage unit 221 as time-series data of a predetermined time length (for example, 24 hours, 12 hours, or 6 hours) (the data is referred to as “observation time-series data”), and discretizes the observation time-series data by clustering processing, in which a frequency spectrum is taken as a feature quantity, in accordance with a procedure of the flowchart of
When the acquired observation time-series data is classified into 1 to M clusters, the decision tree model generation unit 241 determines a cluster center set {Ck: k=1, 2, . . . N} of the clusters (where N is any value from 1 to M). A theoretical maximum of M is the total number of the observation time series, but M may be limited to the following values for the sake of simplicity.
More specifically, when such observation time-series data is classified into one cluster, the cluster center set is {C1}; when it is classified into two clusters, the cluster center set is {C1, C2}; when it is classified into three clusters, the cluster center set is {C1, C2, C3}; and so on. As described above, while changing the number of clusters N from 1 to M, the decision tree model generation unit 241 divides the observation time-series data into clusters, and determines a cluster center set {C1, C2, C3, . . . , CN} of the corresponding number of clusters using a k-means method. {C1, C2, C3, . . . , CN} may be referred to as {Ck} (k∈{1, 2, . . . , N}).
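The loop over cluster counts described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function names (spectrum, k_means, cluster_for_each_n), the naive discrete Fourier transform, the random initialization, and the fixed iteration count are all choices made for the example.

```python
import cmath
import random


def spectrum(series):
    """Magnitude spectrum via a naive O(n^2) discrete Fourier transform."""
    n = len(series)
    return [abs(sum(v * cmath.exp(-2j * cmath.pi * f * t / n)
                    for t, v in enumerate(series)))
            for f in range(n // 2 + 1)]


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def k_means(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


def cluster_for_each_n(series_list, max_clusters):
    """Cluster the spectra for every cluster count N = 1 .. max_clusters."""
    features = [spectrum(s) for s in series_list]
    return {n: k_means(features, n) for n in range(1, max_clusters + 1)}
```

Each entry of the returned dictionary holds the cluster center set {Ck} and the cluster assignment for one candidate cluster count N, ready for the validity evaluation that follows.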
The decision tree model generation unit 241 executes cluster number validity evaluation value calculation processing of calculating an index (hereinafter, referred to as a “validity evaluation value”) that is for evaluating which cluster number N is appropriate based on a processing result of the clustering processing described above. In the case of the present embodiment, the decision tree model generation unit 241 calculates, as such validity evaluation value, an intra-cluster matching degree representing a cohesion degree of observation time-series data in each cluster, and an inter-cluster average separation degree representing a degree of separation between clusters.
The decision tree model generation unit 241 determines an optimal number of clusters based on the intra-cluster matching degree and the inter-cluster average separation degree that are calculated in step S502.
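One possible way to compute and combine the two validity evaluation values is sketched below. The concrete definitions are assumptions for illustration only: cohesion is taken as the mean distance of each sample to its own cluster center, separation as the mean pairwise distance between cluster centers, and the number of clusters is chosen by maximizing their ratio. The embodiment may define these quantities differently.

```python
import math
from itertools import combinations


def dist(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def cohesion(points, centers, labels):
    """Mean distance from each point to its own center (smaller is tighter)."""
    return sum(dist(p, centers[lab]) for p, lab in zip(points, labels)) / len(points)


def separation(centers):
    """Mean pairwise distance between centers (0.0 for a single cluster)."""
    pairs = list(combinations(centers, 2))
    if not pairs:
        return 0.0
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def choose_cluster_count(candidates, points, eps=1e-9):
    """candidates: {n: (centers, labels)}; pick n with the best ratio."""
    def score(n):
        centers, labels = candidates[n]
        return separation(centers) / (cohesion(points, centers, labels) + eps)
    return max(candidates, key=score)
```

A plain ratio of this kind tends to favor many small clusters, so in practice a cap on the number of clusters or a penalty term would likely be needed.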
With the above processing, the observation time-series data is classified into an appropriate number of clusters. Note that a technique disclosed in WO 2015/133635 can be appropriately incorporated into steps S501 to S503.
The decision tree model generation unit 241 assigns a cluster ID to a “leaf” of a cluster set of discretized observation time-series data.
Next, the decision tree model generation unit 241 takes the cluster IDs of the observation time-series data as teacher data, and generates a decision tree model for classifying the observation time-series data. More specifically, the decision tree model generation unit 241 takes data in the distribution data storage unit 222 as a predictor (branch condition), and generates a decision tree model TrM for classifying observation time-series data by using an algorithm for generating a decision tree model.
As an algorithm for generating a decision tree model, the known classification and regression trees (CART) algorithm is generally used. In addition, an algorithm such as iterative dichotomiser 3 (ID3) or chi-squared automatic interaction detection (CHAID) may be used.
The decision tree model generation unit 241 generates, for example, a decision tree model in which a factor that is more dominant to determine a prediction target appears in a “branch” which is a branch located upstream. In other words, a branch identifying an output corresponds to an explanatory variable.
Here, a cluster ID, which is obtained by discretizing the above-described frequency spectrum as a feature quantity, can be set as the teacher data, and a main predictor can be extracted by making the decision tree model compact. However, for the sake of simplicity, the discretization processing may be omitted in generating a decision tree model for classifying observation time-series data.
For the distribution data types and observation data that appear in the branch conditions at each stage from the root to the leaves of the decision tree model TrM, the data selection ordinal number calculation unit 242 may give a guide value having a larger weight to a higher-rank branch. Preferably, the impurity decrement of data before and after classification at an intermediate node of the decision tree model, where the impurity is what is known as the Gini coefficient of the decision tree model, may be used as the guide value; similarly, the entropy decrement of a branch at the intermediate node may be used as the guide value. For data types that are branch conditions at a plurality of intermediate nodes, the guide values may be combined by weighted addition.
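The impurity decrement at a single intermediate node can be illustrated as follows. This is a minimal sketch assuming a binary split and the usual Gini impurity definition (one minus the sum of squared class proportions); the function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (here, cluster IDs)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrement(parent, left, right):
    """Impurity before the split minus the size-weighted impurity after it."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children
```

A split that separates the cluster IDs perfectly yields the largest decrement, while a split that leaves both children as mixed as the parent yields zero.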
The data selection ordinal number calculation unit 242 totals up the impurity decrements of data, which are caused by division, for all predictors (variables), and considers a value obtained by dividing the sum by the number of branch nodes, as an importance degree of the predictors (variables) in a learned tree. When the entropy decrement is used as a guide value in determination of a predictor used for a branch, the data selection ordinal number calculation unit 242 totals up entropy decrements, and considers a value obtained by dividing the sum by the number of branch nodes as an importance degree of the predictors (variables) in a learned tree.
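The importance degree calculation, and a ranking of predictors by it, might look like the sketch below. It assumes that the divisor is the total number of branch nodes in the learned tree and that each branch node is supplied as a (predictor name, decrement) pair; both the input format and the function names are assumptions for illustration.

```python
from collections import defaultdict


def predictor_importance(branch_nodes):
    """branch_nodes: list of (predictor_name, decrement), one per branch node."""
    totals = defaultdict(float)
    for name, decrement in branch_nodes:
        totals[name] += decrement
    # Divide each predictor's summed decrement by the number of branch nodes.
    return {name: total / len(branch_nodes) for name, total in totals.items()}


def assign_ordinals(importance):
    """Ordinal 1 goes to the predictor with the highest importance degree."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    return {name: rank for rank, name in enumerate(ranked, start=1)}
```

The same two functions apply unchanged when entropy decrements are supplied instead of Gini impurity decrements.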
The data selection ordinal number calculation unit 242 assigns ordinal numbers to be used for selection of data to each of the data types, in descending order of importance degree of the predictors. Alternatively, the data selection ordinal number calculation unit 242 may assign ordinal numbers in an order of branches of a learned tree (in an order of predictors surrounded in the predictor display 801, the predictor display 802, the predictor display 803, and the predictor display 804 in the example of
The data and index selection unit 243 determines a predictor of a branch condition of the decision tree model TrM and a value thereof, as data for selecting distribution data and observation data to be added to data used for identification of a prediction model to be described later. That is, the data and index selection unit 243 determines a data type list sM indicating types of the distribution data, and an index list sT indicating a set of indexes for the data. The data type list sM is a set indicating a type selected from the M types of distribution data stored in the distribution data storage unit 222.
In the following description, “power demand at 9:00” is taken as a prediction target, and as the standard setting of types of distribution data to be used for identification of a prediction model to be described later, “power demand at 9:00, 1 day before”, “power demand at 9:00, 2 days before”, “power demand at 9:00, 3 days before”, “power demand at 9:00, 4 days before”, “power demand at 9:00, 5 days before”, “power demand at 9:00, 6 days before”, “power demand at 9:00, 7 days before”, “Tokyo region temperature at 9:00” and “day type” are set.
Regarding the temperature, in a learning process, an actual temperature may be used instead of a forecast temperature. In addition, with respect to observation data to be used as sample data, the standard setting selects observation data in the latest 30 days from all the observation data. Data of rare frequency events is added to the data (standard setting data) selected based on the standard setting, and the obtained data is set as training data used for identification of the prediction model.
Details of the processing in step S404 will be described with reference to a flowchart of
The data and index selection unit 243 reads a data type x of a predictor of the first ordinal number.
The data and index selection unit 243 determines whether or not the data type x has been selected among data types of training data. The data and index selection unit 243 moves the processing to step S603 if it is determined that the data type x has been selected, and moves the processing to step S604 if it is determined that the data type x has not been selected.
The data and index selection unit 243 reads a data type of a predictor of the next ordinal number, and returns the processing to step S602.
The data and index selection unit 243 adds the selected data type x to a data type list sM in order to designate an item to be held in the training data table 1000. In the example of
The data and index selection unit 243 pre-searches stored data for each of the data types held in the training data table 1000. More specifically, the data and index selection unit 243 searches the data storage device 111 for a forecast value (prediction input x*(t)) at a time point t of a prediction target, for each predictor of the data types designated in the training data table 1000.
For example, the data and index selection unit 243 searches for “Tokyo region temperature at 9:00” to obtain a search result (forecast value) such as “9° C.”. The data and index selection unit 243 refers to a condition value (for example, for “Tokyo region temperature at 9:00” of a predictor whose ordinal number is “2” as illustrated in
When the sample is not included in the basic sample, the data and index selection unit 243 acquires index information of the observation time-series data classified into a subtree ahead of the branch of the decision tree model TrM, and adds the index information to an index list sT of the observation time-series data so as to become an additional sample (selected data) of the training data. For example, in the example illustrated in
For example, the data and index selection unit 243 searches for “Kanagawa region temperature at 3:00” to obtain a search result (forecast value) such as “17° C.”. The data and index selection unit 243 refers to a condition value (for example, for “Kanagawa region temperature at 3:00” of a predictor whose ordinal number is “4” as illustrated in
When the sample is not included in the basic sample, the data and index selection unit 243 acquires index information of the observation time-series data classified into a subtree ahead of the branch of the decision tree model TrM, and adds the index information to an index list sT of the observation time-series data so as to become an additional sample (selected data) of the training data. For example, in the example illustrated in
In the example illustrated in
As described above, in step S605, the index list sT is generated such that data lacking in the standard setting data is added based on a forecast value of a data type of the standard setting data. Further, in step S605, based on the generated decision tree model, a data type that is not the data type of the standard setting data is added to the data type list sM, and with respect to the added data type, an index list sT is generated such that data lacking in the standard setting data is added based on a forecast value of the added data type.
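The selection flow summarized above could be sketched roughly as follows. This is an illustrative simplification, not the flowchart itself: the function name, the representation of a branch condition as a predicate on the forecast value, and the example thresholds used in the usage note are all hypothetical.

```python
def select_types_and_indexes(ranked_predictors, standard_types, forecasts,
                             branch_samples, standard_indexes):
    """
    ranked_predictors: predictor names in ordinal-number order
    standard_types:    data types already in the standard setting
    forecasts:         {predictor: forecast value x*(t)}
    branch_samples:    {predictor: [(condition, indexes), ...]} where condition
                       is a predicate on the forecast value and indexes are the
                       samples classified into the subtree beyond that branch
    standard_indexes:  set of indexes already covered by the standard setting
    """
    sM = list(standard_types)  # data type list
    sT = []                    # index list of additional samples
    for name in ranked_predictors:
        if name not in sM:     # add a data type the standard setting lacks
            sM.append(name)
        for condition, indexes in branch_samples.get(name, []):
            if not condition(forecasts[name]):
                continue
            for idx in indexes:  # add only samples the standard setting lacks
                if idx not in standard_indexes and idx not in sT:
                    sT.append(idx)
    return sM, sT
```

For instance, with a forecast of 9 °C and a hypothetical branch condition "below 10 °C", the past days that fall under that branch but lie outside the standard period would be appended to sT.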
When adding the data type and the data index, the data and index selection unit 243 determines whether or not the number of pieces of training data is equal to or less than an upper limit number NN (for example, 8000 pieces). When it is determined that the number of pieces of training data is equal to or less than the upper limit number NN, the data and index selection unit 243 returns the process to step S603, and generates a data type list sM and an index list sT for the selected data, whose number of pieces is up to the planned upper limit.
Preferably, the upper limit number NN of the training data may take a form of being changeable as a parameter, and may have a small value (for example, 500) as an initial value and be increased in a range where decrease of an error evaluation value delta of the error evaluation unit 247 to be described later continues. Thus, the identification of the prediction model by the necessary and sufficient training data is executed.
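A simple way to realize such an adjustable upper limit is sketched below. The function name, the fixed step size, and the callback evaluate_error (standing in for the error evaluation value delta) are assumptions for illustration; only the initial value of 500 and the ceiling of 8000 come from the examples in the text.

```python
def grow_upper_limit(evaluate_error, initial=500, step=500, maximum=8000):
    """Increase NN while the error evaluation value keeps decreasing."""
    nn = initial
    best = evaluate_error(nn)
    while nn + step <= maximum:
        error = evaluate_error(nn + step)
        if error >= best:  # no further improvement: stop growing
            break
        nn, best = nn + step, error
    return nn
```

Here evaluate_error would identify the prediction model with at most NN pieces of training data and return the resulting error evaluation value.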
The selected data transfer processing unit 244 selects data of “input” and “output” as selected data at least in accordance with the data type list sM and the index list sT of the selected data, and acquires the selected data from the data storage device 111 via the communication device 234 and the communication device 214. In addition to the index list sT, a period such as the latest two weeks is set as a period of data to be used in a standard manner (standard setting data) and is set as a data index, and data of the corresponding index is acquired from the data storage device 111.
The prediction model identification unit 245 identifies a prediction model for calculating a prediction value of the prediction target, by using the above-described selected data and standard setting data (xi, yi) [i∈sM×(sT∪sTs)] (the set of this group of data is referred to as training data). With respect to identification of the prediction model, for example, in a case where the data taken as explanatory variables has two types, x1 and x2, and the prediction model is a multiple regression model, which is one kind of multivariable regression model, the prediction model is given by the following formula (1).
y*=a×x1+b×x2+c Formula (1)
y*: objective variable
a, b: partial regression coefficient
c: constant term (intercept)
x1, x2: explanatory variable
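Identification of Formula (1) from training data can be sketched as an ordinary least-squares fit. The following is a minimal sketch assuming NumPy is available; the function names `fit_formula_1` and `predict_formula_1` and the synthetic data are hypothetical and are not taken from the embodiment.

```python
import numpy as np


def fit_formula_1(x1, x2, y):
    """Estimate a, b (partial regression coefficients) and c (intercept)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    # Design matrix with columns x1, x2 and a constant column for c.
    design = np.column_stack([x1, x2, np.ones_like(x1)])
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    a, b, c = coef
    return float(a), float(b), float(c)


def predict_formula_1(a, b, c, x1_new, x2_new):
    """Evaluate y* = a*x1 + b*x2 + c for new inputs."""
    return a * x1_new + b * x2_new + c


# Synthetic training data generated from y = 2*x1 - x2 + 3 (hypothetical values).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [2.0 * u - v + 3.0 for u, v in zip(x1, x2)]

a, b, c = fit_formula_1(x1, x2, y)
y_star = predict_formula_1(a, b, c, 6.0, 5.0)
```

Because the synthetic data is noise-free, the fit recovers the generating coefficients up to floating-point error.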
The prediction model of a prediction target is not limited to the above-described model, and other known methods may be applied. Examples of the known methods include: a method of assuming linearity, including a linear regression model such as a multiple regression model, and a generalized linear model such as logistic regression; a method of assuming autoregressiveness, such as an autoregressive with exogenous input (ARX) model; a method of using a reduced (shrinkage) estimator, such as Ridge regression, Lasso regression, and ElasticNet; a method of using a dimensional reducer, such as the partial least squares method and principal component regression; and a so-called non-parametric method, such as a nonlinear model using polynomials, support vector regression, a regression tree, Gaussian process regression, and a neural network. Preferably, the prediction can be achieved with high accuracy by applying an algorithm using a kernel function (kernel function prediction method), such as Gaussian process regression, which performs regression on data regarded as approximate outputs of a Gaussian process. The prediction model identification unit 245 of the present embodiment outputs an identified Gaussian process regression model GpM (Gauss Pseudo-spectral Method).
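As an illustration of a kernel function prediction method, Gaussian process regression with a squared-exponential kernel can be sketched as follows. This is a textbook formulation assuming NumPy, not the specific model GpM of the embodiment; the kernel choice, the `length_scale` and `noise` values, and the function names are assumptions made for the example.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel matrix between the rows of a and the rows of b."""
    sq_dist = (np.sum(a ** 2, axis=1)[:, None]
               + np.sum(b ** 2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale ** 2)


def gp_predict(x_train, y_train, x_new, noise=1e-6, length_scale=1.0):
    """Posterior mean and variance of a zero-mean Gaussian process at x_new."""
    # K-by-K matrix over the training samples: the source of the K^2 memory cost.
    k_train = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_cross = rbf_kernel(x_new, x_train, length_scale)
    chol = np.linalg.cholesky(k_train)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_cross @ alpha
    v = np.linalg.solve(chol, k_cross.T)
    var = rbf_kernel(x_new, x_new, length_scale).diagonal() - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)


# Hypothetical one-dimensional training data.
x_train = np.linspace(0.0, 5.0, 6).reshape(-1, 1)
y_train = np.sin(x_train).ravel()
mean_new, var_new = gp_predict(x_train, y_train, np.array([[2.5]]))
```

The variance returned alongside the mean indicates how far a new input lies from the training samples, which is one reason the choice of selected data affects the prediction.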
Note that, in general, a random variable is a variable whose value is determined by a result of a random trial, while a set {X (t)|t∈T} of random variables indexed by a parameter set T is called a stochastic process. When T represents time, the stochastic process is a sequence of values that change randomly in accordance with the passage of time.
However, in the present embodiment, T is not limited to a set indicating time. Here, t∈T may be an index that specifies data, with respect to input data and output data (observation data or distribution data of the prediction target). For example, it may be a region index or a spatial coordinate index, a time point index, a group number index for the region index and the time point index, a measuring instrument index that specifies each data observation device, an index z (z∈Z, Z={t|X (t)⊆Y}) indicating that a value x of the prediction target is in a specific range, or a predictor indicating branch information of a tree structure for classifying values of the prediction target.
The first prediction processing unit 246 of the analysis prediction calculation device 112 calculates an output y*, which is a prediction value of a prediction target, by using an input x* of future data such as a future temperature, an input x of past distribution data and the Gaussian process regression model GpM.
Note that an output y of a past prediction target and an output y* of prediction performed in the past may be included in the data to be taken as the input x. For example, a demand value y (t12) at 12:00 of a day before the day on which prediction is to be executed is taken as one of the elements of the input x (x is a vector).
The prediction means that, for example, values of elements x1* and x2* of the input x* are substituted into x1 and x2 respectively in Formula (1) and a value of y is calculated and output as the output y*.
The analysis prediction calculation device 112 uses the second prediction processing unit 248 to calculate an output y˜, which is a second prediction value of a prediction target, by using the decision tree model TrM, an input x* of future data such as a future temperature and/or an input x of past distribution data, and observation data y that is an output of a past prediction target. For example, the analysis prediction calculation device 112 sequentially determines branch conditions of a decision tree using the distribution data and the observation data, and performs prediction. Further, when a value of a branch condition is not determined, the analysis prediction calculation device 112 performs a prediction calculation, which is known as a Bayes optimal prediction algorithm based on a decision tree model.
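Prediction along a decision tree, including the case where a branch condition value is not determined, can be sketched as follows. This is a simplified illustration: weighting both subtrees by their training sample counts is one common way to average over an undetermined branch, and whether it matches the Bayes optimal prediction algorithm referred to above is an assumption. The `Node` class, its fields, and the toy tree are hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Node:
    n_samples: int                    # training samples that reached this node
    value: float = 0.0                # mean target value (used at leaves)
    predictor: Optional[str] = None   # None marks a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None     # branch taken when the input <= threshold
    right: Optional["Node"] = None


def predict(node: Node, inputs: Mapping[str, float]) -> float:
    """Follow the branch conditions; average the subtrees when a value is missing."""
    if node.predictor is None:
        return node.value
    x = inputs.get(node.predictor)
    if x is None:
        # Branch value not determined: weight each subtree by its sample count.
        total = node.left.n_samples + node.right.n_samples
        return (node.left.n_samples * predict(node.left, inputs)
                + node.right.n_samples * predict(node.right, inputs)) / total
    return predict(node.left if x <= node.threshold else node.right, inputs)


# Hypothetical tree with one branch on a temperature predictor.
tree = Node(n_samples=40, predictor="temperature", threshold=17.0,
            left=Node(n_samples=30, value=100.0),
            right=Node(n_samples=10, value=200.0))

y_known = predict(tree, {"temperature": 20.0})
y_unknown = predict(tree, {})
```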
The analysis prediction calculation device 112 may use the error evaluation unit 247 to select data in the observation data storage unit 221 and the distribution data storage unit 222 by a predetermined plurality of sets (for example, 20 sets) by using a random number, try to perform prediction based on that data, and output, as an error evaluation value, an average value of prediction errors obtained by comparing a prediction result thereof with an actual past output y of a prediction target.
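The error evaluation described above can be sketched as follows. This is a minimal sketch in which the mean absolute error is used as the prediction error; the source only states that prediction errors are averaged, so that choice, the function name `error_evaluation_value`, and the toy predictor are assumptions.

```python
import random
from statistics import mean
from typing import Callable, Sequence


def error_evaluation_value(inputs: Sequence[float],
                           actual: Sequence[float],
                           predict: Callable[[float], float],
                           n_sets: int = 20,
                           seed: int = 0) -> float:
    """Average the prediction error over randomly chosen past samples."""
    rng = random.Random(seed)
    # Choose n_sets past samples by using random numbers.
    chosen = [rng.randrange(len(inputs)) for _ in range(n_sets)]
    # Compare each trial prediction with the actual past output y.
    return mean(abs(predict(inputs[i]) - actual[i]) for i in chosen)


# Hypothetical data: the toy predictor is always off by exactly 1.
past_inputs = list(range(100))
past_outputs = [2 * x + 1 for x in past_inputs]
delta = error_evaluation_value(past_inputs, past_outputs, lambda x: 2 * x)
```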
The superposition processing unit 249 outputs a graph (superimposition graph) obtained by superimposing information related to the output y* of the first prediction processing unit 246 on information related to the output y˜ of the second prediction processing unit 248.
In general, a stochastic process means a random variable that changes over time, and a Gaussian process is one type of stochastic process of continuous time. When a linear combination made by randomly selecting (a finite number of) Xt1, . . . , Xtk from a stochastic process {Xt} t∈T follows a normal distribution, {Xt} t∈T is called a Gaussian process.
The data analysis and prediction system 110 outputs a prediction value for a 4-hour-after power demand. A measurement control device 1210 measures a current power generation output of a first generator 1220 that is normally used, and an output change speed thereof, which is a possible change amount during 4 hours of power generation output, and performs prediction control for commanding activation of a spare generator (for example, second generator 1230) when a power generation capacity for satisfying the 4-hour-after demand is insufficient. The power generated by the first generator 1220 and the second generator 1230 is boosted by a transformer facility 1240, and is transmitted via a power transmission network 1250.
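The decision to command activation of the spare generator can be sketched as a simple capacity check. This is an illustration only; the function name `needs_spare_generator`, the use of megawatts, and the numerical values are assumptions and are not taken from the embodiment.

```python
def needs_spare_generator(predicted_demand_mw: float,
                          current_output_mw: float,
                          possible_increase_mw: float) -> bool:
    """Return True when the first generator alone cannot meet the predicted demand.

    possible_increase_mw is the possible change amount of the power generation
    output during the 4-hour horizon (the output change speed in the text).
    """
    reachable_output_mw = current_output_mw + possible_increase_mw
    return predicted_demand_mw > reachable_output_mw


# Hypothetical values: 650 MW predicted, 500 MW now, at most +100 MW in 4 hours.
activate = needs_spare_generator(predicted_demand_mw=650.0,
                                 current_output_mw=500.0,
                                 possible_increase_mw=100.0)
```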
The data analysis and prediction system 110 may be summarized as follows.
[1] The data analysis and prediction system 110 includes a structure analysis unit that predicts (classifies) data of a prediction target (or prediction output, prediction value, prediction data, output data, and output) by using a structure of a decision tree model using explanatory variables (or predictor, input data, and input). Further, the data analysis and prediction system 110 includes a data selection unit that analyzes data of a long period (one year to two years), and determines conditions such as a type of data necessary for prediction and a sampling time point and location of data. Preferably, the data analysis and prediction system 110 includes a variable and index determination unit that determines indexes for a type of explanatory variable and a period and location added to the explanatory variable, in the prediction processing, based on the output of the structure analysis unit.
[2] The data analysis and prediction system 110 includes a kernel function prediction unit that uses the data selected by the data selection unit to identify and predict a prediction model that uses the kernel function.
[3] The data analysis and prediction system 110 preferably includes a prediction unit that is based on a decision tree model.
[4] The data analysis and prediction system 110 preferably includes a prediction display unit that displays information on a prediction output based on the kernel function and information on a prediction output based on the decision tree model.
In statistical machine learning using a kernel function, not only is memory required in proportion to the number M of types of sample data adopted as training data, but memory and a calculation amount are also required in proportion to the square of the number K of samples used. In one example, in order to handle five-minute data for one year which is generated from measuring instrument signals, n=105120, and approximately 800 terabytes of memory is required. Therefore, ad hoc selection of sample data is performed, that is, the sample data is limited to the latest period, which hinders highly accurate prediction.
In an example of application of the invention, based on sample data of power generation market settlement price, an occurrence of off-shore waiting time (waiting time of waiting at sea for unloading of transportation fuel) exceeding a normal reference value for a tanker and an occurrence of the amount of solar radiation exceeding an annual average are subjected to structure analysis as predictors of a higher-rank ordinal number, and indexes of sample data corresponding to these occurrences are added to a selected index set and automatically transferred to a K×K statistic analysis processing unit of an analysis and prediction device.
According to the present system, by structure analysis (that is, generation of a decision tree model) of the sample data over a long period (in one example, two years), ordinal numbers can be given to the predictors (conditional branches in the structure analysis), and sample data for which the predictor has a significant value can be added, related to an item i type (i∈M) of input data corresponding to a predictor with a higher-rank ordinal number. Compared with virtual prediction using all the samples (the number of samples N), the memory amount and the calculation amount can be reduced in proportion to the square of K (reduction amount=N²−K²), and highly accurate prediction can be realized which incorporates a causal relationship that is lacking in the sample data in the latest period.
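The size of the reduction N²−K² can be checked with a short calculation. The sketch below counts kernel matrix entries only; taking N as one year of five-minute samples and K as the example upper limit of 8000 are assumptions made to obtain concrete numbers.

```python
def kernel_matrix_entries(n_samples: int) -> int:
    """Number of entries in the square kernel matrix over n_samples samples."""
    return n_samples * n_samples


# N: five-minute data for one year; K: example upper limit number NN.
N = 365 * 24 * 12
K = 8000

reduction = kernel_matrix_entries(N) - kernel_matrix_entries(K)  # N^2 - K^2
remaining_fraction = kernel_matrix_entries(K) / kernel_matrix_entries(N)
```

Under these assumed values, the K×K matrix holds less than one percent of the entries of the N×N matrix.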
Further, as illustrated in
The above describes the effects of the data analysis and prediction system of the present embodiment, namely that the prediction value can be explained and that the error of the prediction value can be reduced.
Part of the background in which the data analysis and prediction system is found to be beneficial is a recent social environment in which emergency power interchange is difficult, one cause of which is a change in the power supply system, including the separation of electrical power generation from power distribution and transmission. That is, in recent years the business of an electric power company has been divided into three businesses, namely power generation, power transmission and distribution, and electric power sale, whereas in the past it was easy to perform control quickly under single management.
According to this example, there is a circumstance where quick control for emergency electric power interchange is difficult due to separation of electrical power generation from power distribution and transmission like the three divisions, and the cost directly increases. On the other hand, the present data analysis and prediction system realizes highly accurate power demand prediction capable of predicting and reducing emergency power interchange, thereby making contributions to society.
Further, as background in which the data analysis and prediction system is found to be beneficial, thanks to the high integration of computer integrated circuits in recent years, prediction by various regression models instead of theoretical formulas can now be completed within the processing time available in actual work. However, with regressions in which theoretical formulas and structural models are not specified, there is no means for confirming whether interpolation or extrapolation of data suitable for an emergency is performed, and such regressions are not suitable for abnormal-case processing in actual work. In contrast, the present data analysis and prediction system reads the predictors, which are input data arranged in forward order in a tree structure, so as to allow the user to confirm the stages leading up to determination of the prediction value, thereby realizing appropriate business execution based on the prediction value for the user and making contributions to society.
The embodiment described above includes, for example, the following contents.
Although a case where the invention is applied to a data processing system is described in the embodiment described above, the invention is not limited thereto, and can be widely applied to various other systems, devices, methods, and programs.
In the above-described embodiment, the functions (the observation data storage unit 221, the distribution data storage unit 222, and the like) of the data storage device 111 may be implemented by the CPU 211 reading a program (software) stored in a ROM into a RAM and executing the program, may be implemented by hardware such as a dedicated circuit, or may be implemented by a combination of software and hardware, for example. A part of the functions of the data storage device 111 may be implemented by another computer capable of communicating with the data storage device 111.
In the above-described embodiment, the functions (decision tree model generation unit 241, data selection ordinal number calculation unit 242, data and index selection unit 243, selected data transfer processing unit 244, prediction model identification unit 245, first prediction processing unit 246, and the like) of the analysis prediction calculation device 112 may be implemented by a CPU reading a program (software) stored in a ROM into a RAM and executing the program, may be implemented by hardware such as a dedicated circuit, or may be implemented by a combination of software and hardware, for example. A part of the functions of the analysis prediction calculation device 112 may be implemented by another computer capable of communicating with the analysis prediction calculation device 112.
Further, in the above-described embodiment, the configuration of each table is an example, and one table may be divided into two or more tables, or all or a part of the two or more tables may be one table.
Further, although various types of data are described using the XX table in the above-described embodiment, the data structure is not limited and may be represented as XX information or the like.
In the above description, information such as a program, a table, and a file for implementing functions can be stored in a storage device such as a memory, a hard disk, or a solid state drive (SSD), or can be stored in a recording medium such as an IC card, an SD card, or a DVD.
The above-described embodiment has, for example, the following characteristic configuration.
A data processing system (for example, the data processing system 100, the data analysis and prediction system 110) that performs prediction using a prediction model (for example, a prediction model of a method of assuming linearity, a method of assuming autoregressiveness, a method of using reduced estimator, a method of using dimensional reducer, a method called non-parametric, or a method using a kernel function), includes a selection unit (for example, the analysis prediction calculation device 112, the data and index selection unit 243, and the selected data transfer processing unit 244) that selects data to be used for identification of the prediction model from a storage unit (for example, the storage device 235, the analysis prediction calculation device 112, the storage device 215, the data storage device 111, the data observation device 140 and the data distribution device 150) that stores data, and a processing unit (for example, the analysis prediction calculation device 112 and the prediction model identification unit 245) that uses data selected by the selection unit to identify the prediction model. The selection unit selects, from the storage unit, predetermined first data (for example, data of a predetermined period, standard setting data), and second data (for example, selected data) of a type (for example, data type) and/or condition (for example, a value of a branch condition) different from the first data, based on a branch condition of structure data of a structural prediction model.
With the above configuration, for example, the predetermined first data, and the second data of a type and/or condition different from the first data, are used for identification of the prediction model, and highly accurate prediction is realized which incorporates a causal relationship that is lacking in the predetermined first data. With the above configuration, since it is possible to avoid a situation where data of a rare frequency event is missing from the data used for identification of the prediction model, it is possible to reduce consumption of the memory and improve accuracy of prediction, for example, by adopting a prediction model using the kernel function to shorten a sampling period.
There are provided a generation unit (for example, the analysis prediction calculation device 112, the decision tree model generation unit 241) that uses the data stored in the storage unit as structure data of the structural prediction model to generate a decision tree model that appears at a higher rank as a predictor that is a branch condition for dominantly determining a prediction target (for example, energy consumption data for power, gas, water and the like, data of energy production amount by solar power generation, wind power generation and the like, and a transaction amount of energy and a power generation market settlement price that are traded at Japan Electric Power Exchange (JEPX)), and an assignment unit (for example, the analysis prediction calculation device 112, the data selection ordinal number calculation unit 242) that assigns an ordinal number to be used for selection of data in the selection unit to a predictor in the decision tree model generated by the generation unit. The selection unit selects the second data from the storage unit until a predetermined number (for example, the upper limit number NN) is reached according to the ordinal number assigned by the assignment unit.
With the above configuration, by generating the decision tree, the ordinal number is assigned to the predictor, and data for which the predictor with a higher-rank ordinal number has a significant value is selected. For example, in a case where identification of the prediction model using a kernel function is performed, the memory amount and the calculation amount are reduced (reduction amount=N²−K²) in proportion to the square of a sum (K) of the number of pieces of the first data and the number of pieces of the second data, as compared with the virtual prediction using all the data (N).
Here, a predictor (explanatory variable) relating to a rare frequency event has a relatively high importance degree. Therefore, for example, when the ordinal number is given in descending order of importance degree of the predictors, even if the sampling period (predetermined period) is shortened, the second data for which a predictor with a higher-rank ordinal number has a significant value is used in the identification of the prediction model, and thus it is possible to avoid a situation where data of the rare frequency event is missing.
As described above, according to the above configuration, it is possible to reduce the consumption of the memory and avoid a situation where the data of the rare frequency event is missing from the data used for identification of the prediction model.
The prediction model is a prediction model using a kernel function.
With the above configuration, since the prediction using a kernel function is performed, highly accurate prediction is realized as compared with multiple regression prediction and Bayes optimal prediction using a decision tree model.
Further, there are provided a second processing unit (for example, the analysis prediction calculation device 112, the second prediction processing unit 248) that performs prediction using the decision tree model that is generated by the generation unit using the data stored in the storage unit, and an output unit (for example, the analysis prediction calculation device 112, the superposition processing unit 249) that performs outputting. The processing unit (for example, the first prediction processing unit 246) performs prediction using the prediction model, and the output unit outputs a prediction result of the processing unit and a prediction result of the second processing unit.
The output unit may display the prediction result of the processing unit and the prediction result of the second processing unit on the information input/output terminal 120, may transmit the prediction results as a file to the information input/output terminal 120, may print the prediction results using the output device 233, or may output the prediction results in other forms.
With the above configuration, since the prediction result of the processing unit and the prediction result of the second processing unit are output, the user can confirm that there is no large difference in the prediction results, for example, when these results are displayed in a superimposed manner. Further, when there is a difference in the prediction results, the user can recognize that overfitting (overlearning) may have occurred due to deviation of the selected data or that the selected data may be insufficient.
In addition, the above configuration may be modified, rearranged, combined, or omitted as appropriate without departing from the scope of the invention.
It should be understood that items included in a list in the form of “at least one of A, B, and C” can mean “A”, “B”, “C”, “A and B”, “A and C”, “B and C” or “A, B, and C”. Similarly, items listed in the form of “at least one of A, B, or C” can mean “A”, “B”, “C”, “A and B”, “A and C”, “B and C” or “A, B, and C”.
Number | Date | Country | Kind |
---|---|---|---|
2020-021959 | Feb 2020 | JP | national |