1. Field of the Invention
The present invention relates to a neural network system for predicting time series data with increased accuracy.
2. Description of the Related Art
Accurate prediction of future values of time series data such as stock prices, vehicular traffic volumes, communication traffic volumes, and other values that vary with time is necessary in order to prepare for impending events or detect abnormal behavior. One known method of time series prediction is to create a mathematical model such as an autoregressive moving-average (ARMA) model or a neural network model and train the model on existing data.
It is known that neural networks can perform flexible information processing tasks that would be difficult for a conventional von Neumann type computer. Many types of neural network systems have been proposed.
For example, in Japanese Patent Application Publication No. H06-175998 (FIG. 1) Ohara proposes a method in which past and current time series patterns are input to a neural network consisting of an input layer, an intermediate layer, an output layer, and a context layer, back-propagation is carried out to train the neural network, and the trained neural network is used for time series prediction. A problem with this method is that when the time series data vary in intricate and ever-changing ways, adequate feature vectors for training the model cannot be obtained, leading to predictions of low accuracy.
In Japanese Patent Application Publication No. H11-212947 (now Japanese patent No. 3567073), Shiromaru et al. propose a neural network model in which the input time series data are first analyzed (filtered) to express the data as a sum of high, medium, and low frequency components. Each frequency component is input to a separate neural network with input, intermediate, and output layers. The predictions made by the separate neural networks are added together to obtain a final prediction. This method provides more accurate predictions than Ohara's method, but since each neural network is trained independently, it is not possible to train the system to predict the behavior of one frequency component from the behavior of another frequency component.
Analyzing a time series into frequency components is one type of multiresolution analysis, the different frequency components representing different levels of analysis. It would be desirable to have a neural network system that could be trained to predict time series data by treating the behavior of the time series at different levels of analysis as interrelated phenomena instead of as independent phenomena.
An object of the present invention is to predict time series data more accurately.
The invention provides a novel neural network system for this purpose.
The input unit of the system receives analyzed data obtained by multiresolution analysis of the time series data. The analyzed data include data for different levels of analysis, from a highest level to a lowest level, indicating frequency characteristics of the time series data.
The analyzed data are processed by a processing unit including at least an input processing layer. The input processing layer generates output data by operating on the analyzed data received for a descending series of levels, starting from the highest level, to obtain an output value for each level in the series. At each level below the highest level, the output value is obtained by operating on the analyzed data of that level and the output value obtained from the next higher level in the series.
The series of levels may include all levels from the highest level to the lowest level.
The data output by the input processing layer may be the output value obtained at the lowest level in the series. Alternatively, the input unit may also receive correlated data related to the time series data, and the processing unit may also include a correlated data processing section that processes the correlated data and the output value obtained at the lowest level in the series to obtain the output data.
The novel apparatus may also include an intermediate processing layer that processes the output data obtained by the input processing layer over a predetermined most recent interval of time. The output of the intermediate processing layer may then be further processed to generate the predicted value.
The multiresolution analysis may be a wavelet analysis and the analyzed data may include wavelet coefficients.
By interrelating the different levels of analysis in the input processing layer, the novel apparatus can produce more accurate predicted values than conventional apparatus.
If the multiresolution analysis is a wavelet analysis, the multiresolution analysis can be carried out quickly.
Since most of the processing is completed in the input processing layer, the intermediate processing layer and any further processing layers can have a simple structure.
In the attached drawings:
As an embodiment of the invention, a novel neural network system for predicting time series data will now be described with reference to the attached drawings, in which like elements are indicated by like reference characters.
Referring to
The input unit 110 receives analyzed data and outputs the data in a form that can be processed by the processing unit 120. The analyzed data have been generated by multiresolution analysis (MRA) of the time series data. More specifically, the analyzed data are wavelet coefficients w(L)i to w(1)i obtained by wavelet analysis of the time series data. The input unit 110 also receives the scaling coefficients s(L)i for the highest wavelet analysis level and correlated data nt. The correlated data nt are arbitrary data related to the time series data.
The processing unit 120 processes the data received by the input unit 110 to generate an output value that is supplied to the output unit 130 as a prediction of the next value in the time series. The processing unit 120 is a model-based learner incorporating a neural network that has been trained on previous time series data. The neural network is a network of processing elements conventionally referred to as neurons, because they are modeled on neurons in the brain. Each neuron, indicated as a circled N in the drawing, generates an output value by a process including weighted addition of multiple inputs. The output value may become an input to another neuron.
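The weighted addition performed by a single neuron can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the function name `neuron`, the bias term, and the tanh activation are assumptions, since the description states only that the process includes weighted addition of multiple inputs.

```python
import math

def neuron(inputs, weights, bias=0.0):
    """Weighted addition of the inputs followed by an activation function.

    The bias term and the tanh activation are assumptions made for
    illustration; the description only requires weighted addition.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return math.tanh(total)

# The output value of one neuron may become an input to another neuron.
first = neuron([0.5, -1.0], [0.8, 0.3])
second = neuron([first, 2.0], [1.0, -0.5])
```

In this sketch the weights are fixed by hand; in the system described they would be the weighting coefficients adjusted during training.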
The neural network is trained by back-propagation. This well-known training algorithm takes the difference between predicted values calculated by the processing unit 120 and known correct answer data y, calculates local error values representing the difference between the actual and desired outputs of each neuron, and adjusts the weighting coefficients of the neurons so as to reduce the local error values, proceeding in the reverse of the order of processing followed in the prediction process.
The processing unit 120 includes an input processing layer 121, an intermediate processing layer 124, an output processing layer 125, and a plurality of delay processing units 126. The input processing layer 121 includes an analyzed data processing section 122 and a correlated data processing section 123.
The analyzed data processing section 122 processes the input wavelet coefficients and scaling coefficients. The processing is done by a series of neurons, including one neuron for each level in the wavelet analysis that produced the wavelet coefficients. The neurons are interconnected in descending order of level, corresponding to ascending order of frequency. The first neuron in the series receives the wavelet coefficients and scaling coefficients for the highest wavelet analysis level, representing the lowest analyzed frequency component of the time series. Each other neuron in the series receives the wavelet coefficients for its own level and the output of the preceding neuron in the series.
The correlated data processing section 123 receives the output of the last neuron in the analyzed data processing section 122, which processed the highest-frequency wavelet coefficients, and the correlated data nt. The correlated data processing section 123 consists of a single neuron, which produces the output of the input processing layer 121.
The operations performed in the input processing layer 121 will be described in more detail later.
The intermediate processing layer 124 and output processing layer 125 consist of one neuron each. The intermediate processing layer 124 operates on a certain number of consecutive outputs of the input processing layer 121, which are held in a delay processing unit 126. The output processing layer 125 operates on the output of the intermediate processing layer 124, and provides the final output of the processing unit 120 as a predicted time series value to the output unit 130. The output unit 130 converts the final output to an appropriate signal that is supplied to, for example, an external device (not shown).
The delay processing units 126 store data temporarily. At any given time t, the delay processing units 126 store the data needed by the neurons to calculate a predicted value for the time series at time t+1.
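One way to picture a delay processing unit is as a fixed-length buffer that keeps only the most recent values. The sketch below assumes this interpretation; the class name `DelayUnit` and its methods are hypothetical and do not appear in the description.

```python
from collections import deque

class DelayUnit:
    """Hypothetical fixed-length buffer holding the most recent values."""

    def __init__(self, length):
        # A deque with maxlen discards the oldest value automatically.
        self.buffer = deque(maxlen=length)

    def push(self, value):
        self.buffer.append(value)

    def ready(self):
        # True once enough values are held for a neuron to operate on.
        return len(self.buffer) == self.buffer.maxlen

    def values(self):
        return list(self.buffer)

unit = DelayUnit(3)
for v in [0.1, 0.2, 0.3, 0.4]:
    unit.push(v)
```

After the four pushes, the unit holds only the three most recent values, which is the data a downstream neuron would read at that time.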
Referring to
The processing unit 120 may be configured from specialized hardware, or from general-purpose computing and control hardware, such as a computer including a central processing unit (CPU), that executes programs stored as software or firmware to implement the processing performed by the neurons and delay units in
Next, the operation of the neural network system 100 will be described. In a stage preceding the neural network system 100, a wavelet transformation is carried out on sampled and quantized time series data to obtain wavelet coefficients and scaling coefficients that represent the result of multiresolution analysis.
The wavelet coefficients are related to the original time series signal f(t) as in equation (1), where t is a time variable and L represents the highest level of analysis.
The quantity gj(t) in equation (1) can be expressed in terms of wavelet coefficients wj,k and mother wavelets ψj,k as in equation (2). The quantity fL(t) in equation (1) can be expressed in terms of scaling coefficients sL,k and scaling functions φL,k for analysis level L as in equation (3).
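Equations (1) to (3) are not reproduced in this text. A standard multiresolution form consistent with the surrounding description is sketched below. The index ranges are assumptions, and φ_{L,k} is assumed to denote the level-L scaling function associated with the scaling coefficients.

```latex
\begin{align}
f(t)   &= \sum_{j=1}^{L} g_j(t) + f_L(t) \tag{1}\\
g_j(t) &= \sum_{k} w_{j,k}\,\psi_{j,k}(t) \tag{2}\\
f_L(t) &= \sum_{k} s_{L,k}\,\phi_{L,k}(t) \tag{3}
\end{align}
```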
The scaling function is the Haar function φ(u) shown in
A family of mother wavelets derived from the Haar function for four levels of wavelet analysis is shown in
The calculation of wavelet coefficients for time series data {1, 3, 5, 11, 12, 13, 0, 1} corresponding to times t−7 to t is illustrated in
{1×(−1)} + {3×1} = 2.
The scaling coefficient is 2^(1/2), so the wavelet coefficient is 2/2^(1/2) = 2^(1/2) = 1.4142, shown as w(1)i−3 in
At the second level of analysis (level 2), the width of the mother wavelet is doubled to (−1, −1, 1, 1), its inner product with four time series data values is taken, and the result is divided by a scaling coefficient equal to 4^(1/2) = 2.
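The calculation described above can be sketched for the example data as follows. The function name `haar_coefficients` is hypothetical, and two assumptions are made: the windows do not overlap, and the inner product is divided by the square root of the window width, which matches the level-1 value of 1.4142 given above.

```python
import math

def haar_coefficients(data, level):
    """Haar wavelet coefficients at the given level.

    Assumes non-overlapping windows of width 2**level and division by
    the square root of the window width.
    """
    width = 2 ** level
    half = width // 2
    norm = math.sqrt(width)
    coeffs = []
    for start in range(0, len(data) - width + 1, width):
        window = data[start:start + width]
        # Inner product with the wavelet (-1, ..., -1, 1, ..., 1).
        inner = sum(window[half:]) - sum(window[:half])
        coeffs.append(inner / norm)
    return coeffs

series = [1, 3, 5, 11, 12, 13, 0, 1]  # values for times t-7 to t
level1 = haar_coefficients(series, 1)
level2 = haar_coefficients(series, 2)
```

Under these assumptions the first level-1 coefficient is 2/2^(1/2), and the two level-2 coefficients are 12/2 = 6 and −24/2 = −12.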
The wavelet coefficients calculated in this way are input to the input unit 110 together with the highest-level scaling coefficient. The input unit 110 processes these inputs and sends the resulting signals as data to the processing unit 120. Similar signal processing is carried out on the correlated data nt. The data output by the input unit 110 are temporarily stored in delay processing units 126 as explained above. These delay processing units 126 accordingly store the wavelet coefficients, scaling coefficients, and correlated data nt received over a certain interval extending back from time t.
The analyzed data processing section 122 in the processing unit 120 operates on the wavelet coefficients and scaling coefficients held in the delay processing units 126, using a separate neuron for each level of analysis.
A general neuron can be represented as in
First the top level neuron in the analyzed data processing section 122 in
The neuron on level L-1 also receives the wavelet coefficients w(L−1)i etc. for this level and carries out a similar operation to obtain an output value o(L−1)i, which is supplied to the neuron on the next lower level (L-2). This process continues until an output value o(1)i is obtained for the lowest level as the output of the analyzed data processing section 122.
Owing to this passing of output values from higher-level neurons to lower-level neurons in the analyzed data processing section 122, each neuron can incorporate the results of the calculations carried out on the higher levels into its own calculations, so that the predictions made for the different levels of analysis are interrelated.
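The passing of output values from higher-level neurons to lower-level neurons can be sketched as a simple chain. This is an illustration under stated assumptions: the names `neuron` and `analyzed_data_section`, the tanh activation, and all coefficient and weight values are hypothetical, and the weights would in practice be obtained by training.

```python
import math

def neuron(inputs, weights, bias=0.0):
    # Weighted addition followed by an assumed tanh activation.
    return math.tanh(bias + sum(w * x for w, x in zip(weights, inputs)))

def analyzed_data_section(wavelet, scaling_top, weights):
    """Chain one neuron per level, from the highest level down to level 1.

    wavelet:     dict mapping level -> list of wavelet coefficients
    scaling_top: list of scaling coefficients for the highest level
    weights:     dict mapping level -> list of neuron weights
    """
    top = max(wavelet)
    # The top-level neuron receives wavelet and scaling coefficients.
    output = neuron(wavelet[top] + scaling_top, weights[top])
    # Each lower neuron receives its own coefficients and the output
    # value of the neuron on the next higher level.
    for level in range(top - 1, 0, -1):
        output = neuron(wavelet[level] + [output], weights[level])
    return output

wavelet = {3: [2.12], 2: [6.0, -12.0], 1: [1.41, 4.24, 0.71, 0.71]}
scaling_top = [16.26]
weights = {3: [0.1, 0.05], 2: [0.02, 0.02, 0.5], 1: [0.1, 0.1, 0.1, 0.1, 0.5]}
result = analyzed_data_section(wavelet, scaling_top, weights)
```

Because each neuron's input includes the output of the level above, a change in the highest-level coefficients propagates to the level-1 output.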
The data o(1)i output from the analyzed data processing section 122 is supplied to the correlated data processing section 123. The correlated data processing section 123 also receives correlated data for a certain interval of time extending back from the current time t, these data being stored in another delay processing unit 126. In the prediction of packet traffic volume under the real-time transport protocol (RTP), for example, the correlated data nt may be the number of session initiation protocol (SIP) packets transmitted during the interval of time, SIP packets being call control packets transmitted when a communication session begins. Alternatively, the correlated data may be the number of sessions, or the current time. The neuron in the correlated data processing section 123 processes the correlated data and the output data o(1)i received from the analyzed data processing section 122 in the general manner illustrated in
The data output from the input processing layer 121 over a predetermined interval of time are temporarily stored in yet another delay processing unit 126, and supplied to the intermediate processing layer 124.
The intermediate processing layer 124 in this embodiment has a single neuron that operates on the data stored in the delay processing unit 126. The output processing layer 125 operates on the output of this neuron and produces a predicted time series value for time t+1. This predicted value is supplied to the output unit 130 and placed in, for example, a signal sent to an external device (not shown).
Time series values predicted by the novel neural network system 100 are compared with observed values and values predicted by a conventional apparatus in
The mean square error of the predicted values in
The basic reason for the improved prediction accuracy of the novel apparatus is thought to be that each level of analysis makes use of the prediction results at higher levels of analysis. Another factor is that similar use of the prediction results at higher levels is made during the training of the neural network. A further factor is the provision of a correlated data processing section that modifies the prediction made by the analyzed data processing section according to correlated data.
An advantage of the use of a wavelet transformation to perform the multiresolution analysis and the use of wavelet coefficients as input data is that the multiresolution analysis process can be completed quickly, even if there are many levels of analysis.
The invention is not limited to the use of wavelets derived from the Haar function. Other types of wavelets may be used, or a type of multiresolution analysis other than wavelet analysis may be used.
It is not necessary to use all levels of the multilevel analysis. A level selection unit can be added to the novel neural network system. During the training process, the level selection unit selects the levels to use. If, for example, the highest level is level M and the level selection unit selects levels M-1, M-3, and M-4, then the neuron at level M uses the level-M wavelet coefficients and scaling coefficients to obtain an output value o(M), which is input to the neuron at level M-1. The output o(M−1) of the neuron at level M-1 is then input to the neuron at level M-3, bypassing level M-2; the output o(M−3) of the neuron at level M-3 is input to the neuron at level M-4; and the output o(M−4) of the neuron at level M-4 is input to the correlated data processing section. The output o(M−2) of the neuron at level M-2 is discarded.
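The bypassing of unselected levels can be sketched as below. The names `neuron` and `cascade`, the tanh activation, and all numeric values are hypothetical. As a simplification, the neuron at an unselected level is not evaluated at all here, which has the same effect on the result as computing its output and discarding it.

```python
import math

def neuron(inputs, weights, bias=0.0):
    # Weighted addition followed by an assumed tanh activation.
    return math.tanh(bias + sum(w * x for w, x in zip(weights, inputs)))

def cascade(wavelet, scaling_top, weights, selected):
    """Chain the top-level neuron with the selected lower levels only."""
    top = max(wavelet)
    output = neuron(wavelet[top] + scaling_top, weights[top])
    # Visit the selected levels in descending order, skipping the rest.
    for level in sorted(selected, reverse=True):
        output = neuron(wavelet[level] + [output], weights[level])
    return output

M = 5
wavelet = {5: [0.4], 4: [0.3], 3: [0.9], 2: [-0.2], 1: [0.1]}
scaling_top = [1.5]
weights = {5: [0.5, 0.2], 4: [0.6, 0.7], 3: [0.1, 0.1],
           2: [0.3, 0.8], 1: [0.9, 0.4]}
selected = [M - 1, M - 3, M - 4]  # level M-2 is bypassed
base = cascade(wavelet, scaling_top, weights, selected)
```

With this selection, changing the level M-2 coefficients leaves the result unchanged, since that level contributes nothing to the chain.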
Those skilled in the art will recognize that further variations are possible within the scope of the invention, which is defined in the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
2009-214643 | Sep 2009 | JP | national |