This application claims foreign priority of Chinese Patent Application No. 202110978765.2, filed on Aug. 25, 2021 in the China National Intellectual Property Administration, the disclosures of all of which are hereby incorporated by reference.
The present invention belongs to the technical field of data analysis of loads of electric vehicles, relates to a prediction method for charging loads of electric vehicles, and particularly relates to a prediction method for charging loads of electric vehicles with consideration of data correlation.
As energy and environmental problems become increasingly prominent, electric vehicles have been vigorously developed in order to implement the national energy development strategy and to construct a modern energy system which is clean, efficient, safe and sustainable. From 2018 to 2020, among public service vehicles, the proportion of newly added electric vehicles each year increased to 30%-50%. On March 20, at the Sub-Forum of ‘New Revolution of Automobile Industry’ of the 2021 Annual Meeting of China Development High-Level Forum, Yongwei Zhang, who is the vice president and the secretary-general of the 100-People Meeting of Electric Vehicles of China, stated that, according to predictions, the number of electric vehicles in China should be in the range of 80,000,000 around 2030. The popularization of electric vehicles has a great effect on the structure of the power demand side and can create new growth points of power demand and load in the coming period.
Charging behaviors of electric vehicles are random and fluctuating, and the charging features may be constrained by multiple factors, such as habits of users, the SOC (State of Charge) of a system and the like. As electric vehicles are deployed on an increasingly large scale, their disorderly and random charging causes problems such as an increased peak load of the power grid, unbalanced operation of the power distribution network, harmonic waves in the system and the like. Meanwhile, the electric vehicles, serving as mobile energy storage equipment, can assist in peak clipping and valley filling of the power grid, collaborative consumption of new energy and the like once reasonable charging management is realized. However, existing prediction methods for charging loads of electric vehicles have the defects that prediction is very difficult, the reliability of the prediction is not high, etc.
In order to overcome the defects in the prior art, the present invention provides a prediction method for charging loads of electric vehicles with consideration of data correlation, which is reasonable in design, simple and convenient in use and reliable in prediction results.
The present invention adopts the following technical solutions to solve the practical problems:
the prediction method for the charging loads of the electric vehicles with consideration of the data correlation comprises the following steps:
Step 1: collecting historical data of charging loads of electric vehicles;
Step 2: carrying out data correlation analysis on the historical data of the charging loads of the electric vehicles, which is collected in Step 1, and real-time data, and calculating correlation coefficients between the historical data of the charging loads of the electric vehicles and the real-time data;
Step 3: according to correlation coefficients obtained through calculation in Step 2, selecting historical data of the charging loads of the electric vehicles, which has high correlation, as data of the charging loads of the electric vehicles, which is used for prediction;
Step 4: predicting the historical data of the charging loads of the electric vehicles, which has high correlation and is selected in Step 3, serving as the data of the charging loads of the electric vehicles, which is used for prediction, by adopting an LSTM (Long Short Term Memory) algorithm, to obtain prediction results.
Moreover, a specific method of the Step 1 comprises: collecting the historical data of the charging loads of the electric vehicles of that very day and ten typical days at a certain area.
Moreover, a specific method of the Step 2 comprises: calculating the correlation between the historical data of the charging loads of the electric vehicles of each day and the real data of that very day by utilizing Excel software, to obtain the correlation coefficients between the historical data of the charging loads of the electric vehicles and the real-time data, wherein the calculation formula is:
$$r_{xy} = \frac{S_{xy}}{S_x S_y}$$
wherein rxy represents a correlation coefficient of samples; Sxy represents the sample covariance; Sx represents the sample standard deviation of x; and Sy represents the sample standard deviation of y. In this case, x represents the data of one of the ten typical days, and y represents the data of that very day.
Moreover, a specific method of the Step 3 comprises:
ranking the correlation coefficients and selecting the five groups of data with the biggest correlation coefficients, i.e., the five groups of data with the highest correlation, as the data of the charging loads of the electric vehicles, which is used for prediction.
Moreover, the Step 4 specifically comprises the following steps:
(1) inputting the data xt of the charging loads of the electric vehicles, which is used for prediction and is obtained in Step 3, and carrying out processing of a forgetting stage of a forgetting gate on load data xt of each time point firstly, wherein a calculation formula is shown as follows:
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$
(2) then, carrying out processing of a cell state updating stage of an input gate on ft, wherein a calculation formula is shown as follows:
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$
(3) finally, carrying out processing of an output stage of an output gate on Ct, wherein calculation formulas are shown as follows:
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$
$$h_t = o_t * \tanh(C_t)$$
(4) taking load data obtained after the load data of one time point is processed by the three gate stages as legacy information ht-1 of a previous cell, and enabling the legacy information ht-1 and load data of a new time point to participate in recursive processing of the three gate stages again, to obtain load prediction values ht of 96 time points in one day finally.
The present invention has the advantages and beneficial effects that:
According to the prediction method for the charging loads of the electric vehicles with consideration of the data correlation, which is proposed by the present invention, the data correlation analysis is carried out on the historical data of the charging loads of the electric vehicles and the real-time data, and the data with the biggest correlation coefficients is selected as the load data used for prediction, so that the work load of data processing can be effectively reduced, the prediction method is simplified, and the prediction accuracy is improved. Reasonable prediction of charging demands of the electric vehicles has important significance for the stable operation of a power grid, the dispatching of the charging loads of the electric vehicles, the research of an orderly charging strategy and the like.
Embodiments of the present invention are further described in detail below through combination with the drawings.
A prediction method for charging loads of electric vehicles with consideration of data correlation, as shown in the drawings, comprises the following steps:
Step 1: collecting historical data of the charging loads of the electric vehicles.
In the embodiment, research objects are collected, namely, data of charging loads of electric vehicles at a certain area of that very day and ten typical days ((D-1)-(D-10)) is collected as basic data for correlation processing.
Step 2: carrying out data correlation analysis on the historical data of the charging loads of the electric vehicles, which is collected in the Step 1, and real-time data, and calculating correlation coefficients between the historical data of the charging loads of the electric vehicles and the real-time data.
In the embodiment, the data correlation analysis is carried out on the historical data (i.e., the basic data) of the charging loads of the electric vehicles and the real data of that very day: the correlation between the historical data of each day and the real data of that very day is calculated by utilizing Excel software, to obtain the correlation coefficients between the historical data (i.e., the basic data) of the charging loads of the electric vehicles and the real-time data.
A correlation coefficient method is adopted in the present invention. The correlation coefficient is a statistical index reflecting how closely two variables are related, and its value lies in the interval from −1 to 1: 1 represents that the two variables are in perfect positive linear correlation, −1 represents that the two variables are in perfect negative linear correlation, and 0 represents that the two variables are linearly uncorrelated; the closer the coefficient is to 0, the weaker the correlation is.
The calculation formula of the correlation coefficient in the Step 2 is shown as (1):
$$r_{xy} = \frac{S_{xy}}{S_x S_y} \tag{1}$$
wherein rxy represents a correlation coefficient of samples; Sxy represents the sample covariance; Sx represents the sample standard deviation of x; Sy represents the sample standard deviation of y; and in this situation, x represents the data of one of the ten typical days, and y represents the data of that very day.
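The calculation in formula (1) can be sketched in Python as follows. This is only an illustrative sketch: the embodiment uses Excel, and the function names, the toy load curves and the use of NumPy are assumptions that do not appear in the original.

```python
import numpy as np

def sample_correlation(x, y):
    """Return r_xy = S_xy / (S_x * S_y) using sample (n - 1) statistics."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Sample covariance S_xy
    s_xy = np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)
    s_x = x.std(ddof=1)  # sample standard deviation of x
    s_y = y.std(ddof=1)  # sample standard deviation of y
    return s_xy / (s_x * s_y)

def correlations_with_current_day(typical_days, current_day):
    """Map each typical-day label to its correlation with the current day."""
    return {label: sample_correlation(series, current_day)
            for label, series in typical_days.items()}

# Hypothetical toy load curves; not data from the embodiment.
current_day = [10.0, 20.0, 30.0, 40.0]
typical_days = {
    "D-1": [12.0, 22.0, 32.0, 42.0],  # same shape, shifted upward
    "D-2": [40.0, 30.0, 20.0, 10.0],  # reversed shape
}
r = correlations_with_current_day(typical_days, current_day)
```

In this toy case the shifted curve is perfectly positively correlated with the current day and the reversed curve is perfectly negatively correlated, which matches the interpretation of the coefficient given above.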
Step 3: according to the correlation coefficients obtained through calculation in the Step 2, selecting historical data of the charging loads of the electric vehicles, which has high correlation, as data of the charging loads of the electric vehicles, which is used for prediction.
The correlation of the historical data (i.e., the basic data) of the charging loads of the electric vehicles is analyzed by utilizing the correlation coefficients: the correlation coefficients are ranked, and the five groups of data with the biggest correlation coefficients, i.e., the five groups of data with the highest correlation, are selected as the data of the charging loads of the electric vehicles, which is used for prediction.
Step 4: predicting the data of the charging loads of the electric vehicles, which is used for prediction and is selected in the Step 3, by adopting an LSTM algorithm, to obtain prediction results.
The Step 4 specifically comprises the following steps:
inputting the data xt of the charging loads of the electric vehicles, which is used for prediction and is selected in the Step 3, and carrying out processing of a forgetting stage of a forgetting gate on load data xt of each time point firstly, wherein the calculation formula is shown as follows:
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$
then, carrying out processing of a cell state updating stage of an input gate on a result ft obtained by processing of the forgetting stage of the forgetting gate, wherein the calculation formula is shown as follows:
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$
finally, carrying out processing of an output stage of an output gate on Ct, wherein the calculation formulas are shown as follows:
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$
$$h_t = o_t * \tanh(C_t)$$
taking load data obtained after the load data of one time point is processed by the three gate stages as legacy information ht-1 of a previous cell, and enabling the legacy information ht-1 and load data of a new time point to participate in recursive processing of the three gate stages again, to obtain load prediction values ht of 96 time points in one day finally.
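The recursive processing of the three gate stages can be sketched in Python with NumPy as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment (which uses MATLAB): the input gate and the candidate vector follow the standard LSTM form, which the text does not spell out; the parameter names, the hidden size and the random weights are hypothetical; and the network is untrained, so the outputs are not meaningful predictions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM cell update for a single time point."""
    z = np.concatenate([h_prev, x_t])           # [h_{t-1}, x_t]
    f_t = sigmoid(p["W_f"] @ z + p["b_f"])      # forgetting gate
    i_t = sigmoid(p["W_i"] @ z + p["b_i"])      # input gate (standard form, assumed)
    c_tilde = np.tanh(p["W_c"] @ z + p["b_c"])  # candidate vector (standard form, assumed)
    c_t = f_t * c_prev + i_t * c_tilde          # cell state update
    o_t = sigmoid(p["W_o"] @ z + p["b_o"])      # output gate
    h_t = o_t * np.tanh(c_t)                    # output of the cell
    return h_t, c_t

def run_day(loads, p, hidden_size):
    """Feed the load points of one day through the cell recursively."""
    h = np.zeros(hidden_size)
    c = np.zeros(hidden_size)
    outputs = []
    for x in loads:
        h, c = lstm_step(np.array([x]), h, c, p)
        outputs.append(h.copy())
    return np.array(outputs)

# Hypothetical, randomly initialised parameters; training would learn these.
rng = np.random.default_rng(0)
hidden_size, input_size = 4, 1
params = {}
for gate in ("f", "i", "c", "o"):
    params[f"W_{gate}"] = rng.normal(scale=0.1, size=(hidden_size, hidden_size + input_size))
    params[f"b_{gate}"] = np.zeros(hidden_size)

day_loads = rng.random(96)  # placeholder for 96 normalised load points
hidden_states = run_day(day_loads, params, hidden_size)
```

Each pass through `lstm_step` hands its output to the next time point as the legacy information, which is the recursion described above. In practice a trained model would usually also map each output vector back to load units, a step this sketch leaves out.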
In the embodiment, the LSTM has an overall structure which is generally consistent with that of an RNN (Recurrent Neural Network), but its repeating modules have a different structure. Instead of the single neural network layer of the RNN, the LSTM has four network layers, which interact with one another in a special manner. In this manner, previous information which is distorted easily is screened and integrated into new information, and the new information is reserved; the reserved information and the information entering at the same time are superposed at a certain proportion; and finally, the superposed information is output through a tanh function. In addition, an LSTM network can be used for capturing dependencies over long time spans and deciding which information needs to be reserved and which information needs to be forgotten.
The present invention is further described below by a specific example:
Step 1: collecting research objects, wherein in the example, data of charging loads of that very day and days (D-1)-(D-10) at a certain area is collected as basic data for correlation processing, and the details are shown in Tab. 1;
Step 2: carrying out data correlation analysis on the basic data and calculating correlation of data of each day and real data of that very day by utilizing Excel software, so as to obtain correlation coefficients between the basic data,
wherein the calculation formula of the correlation coefficient is shown as (1):
$$r_{x_i y} = \frac{S_{x_i y}}{S_{x_i} S_y} \tag{1}$$
wherein rxiy represents a correlation coefficient of an ith group of samples; Sxiy represents the covariance of data of the day D-i and the data of that very day; Sxi represents the sample standard deviation of xi, i.e. the data of the ith of the ten typical days (D-1)-(D-10); Sy represents the sample standard deviation of the dependent variable y, i.e. the data of that very day; and according to the formula, the sample standard deviations of the ten days (D-1)-(D-10) and the sample standard deviation of the real data of that very day need to be calculated firstly, and then, the covariance between the data of each of the days (D-1)-(D-10) and the data of that very day is calculated, to obtain the correlation coefficients according to the formula (1);
the first 200 pieces of data in the collected data are used for calculation, to obtain the sample standard deviations of the ten days (D-1)-(D-10) and the sample standard deviation of the real data of that very day, which are respectively shown as follows:
Sx1=15518.7702, Sx2=15306.236, Sx3=15234.1388,
Sx4=15170.64539, Sx5=15365.59057, Sx6=15411.0932,
Sx7=15365.21298, Sx8=15183.83278, Sx9=15254.04272,
Sx10=15335.72268, Sy=15563.67394.
the covariances between the data of the days (D-1)-(D-10) and the data of that very day are respectively shown as follows:
Sx1y=230556230.1, Sx2y=226709123.7, Sx3y=224826730.8,
Sx4y=225406997.5, Sx5y=230894694.9, Sx6y=234740896.6,
Sx7y=234712143.6, Sx8y=229462672.7, Sx9y=231249625.3,
Sx10y=233008103.1.
the correlation coefficients between the data of the days (D-1)-(D-10) and the data of that very day can be obtained through calculation according to the calculation formula of the correlation coefficients, which are respectively shown as follows:
rx1y=0.9546, rx2y=0.9517, rx3y=0.9482, rx4y=0.9547, rx5y=0.9655,
rx6y=0.9787, rx7y=0.9815, rx8y=0.9715, rx9y=0.9741, rx10y=0.9762
(Four decimals are reserved through rounding.);
The standard deviation refers to respective standard deviation of the data of the selected ten typical days, and the covariance is obtained by calculating the data of each of the ten typical days and the data of that very day; and the verified content is the correlation degree of the selected ten typical days and that very day.
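The arithmetic of this step can be reproduced from the listed standard deviations and covariances with a short Python sketch. The values are copied from the example above; the variable names are illustrative assumptions. Because the inputs are printed with limited precision, a recomputed coefficient may differ from the listed one in its last digits.

```python
# Sample standard deviations of the days (D-1)-(D-10), as listed in the example.
s_x = [15518.7702, 15306.236, 15234.1388, 15170.64539, 15365.59057,
       15411.0932, 15365.21298, 15183.83278, 15254.04272, 15335.72268]
# Sample standard deviation of the data of that very day.
s_y = 15563.67394
# Covariances between each of the days (D-1)-(D-10) and that very day.
s_xy = [230556230.1, 226709123.7, 224826730.8, 225406997.5, 230894694.9,
        234740896.6, 234712143.6, 229462672.7, 231249625.3, 233008103.1]

# Formula (1): r = S_xiy / (S_xi * S_y), keyed by i for the day D-i.
r = {i + 1: cov / (sx * s_y) for i, (sx, cov) in enumerate(zip(s_x, s_xy))}

for day, value in r.items():
    print(f"D-{day}: r = {value:.4f}")
```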
Step 3: analyzing the correlation of the basic data by utilizing the correlation coefficients and selecting load data used for prediction.
The sequence of the correlation of the data of the days (D-1)-(D-10) and the data of that very day can be obtained according to the correlation coefficients in the Step 2, which is shown as follows: rx7y>rx6y>rx10y>rx9y>rx8y>rx5y>rx4y>rx1y>rx2y>rx3y.
Five days with the highest correlation with the data of that very day are a day D-7, a day D-6, a day D-10, a day D-9 and a day D-8, and therefore, the data of the five days are selected as the load data used for prediction;
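The selection in this step amounts to sorting the coefficients and keeping the five largest. A minimal Python sketch is given below; the coefficients are those listed in Step 2, while the function name `select_top_k` is a hypothetical helper introduced only for illustration.

```python
# Correlation coefficients listed in Step 2, keyed by typical day.
r = {"D-1": 0.9546, "D-2": 0.9517, "D-3": 0.9482, "D-4": 0.9547, "D-5": 0.9655,
     "D-6": 0.9787, "D-7": 0.9815, "D-8": 0.9715, "D-9": 0.9741, "D-10": 0.9762}

def select_top_k(correlations, k=5):
    """Return the k day labels with the largest coefficients, best first."""
    ranked = sorted(correlations, key=correlations.get, reverse=True)
    return ranked[:k]

top_five = select_top_k(r)
print(top_five)  # ['D-7', 'D-6', 'D-10', 'D-9', 'D-8']
```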
Step 4: predicting the selected load data by adopting an LSTM algorithm, to obtain prediction results.
LSTM is a long short term memory network, which is a type of RNN for time series and is suitable for processing and predicting important events with relatively long intervals and delays in a time sequence.
LSTM and the RNN have the main difference that a ‘processor’ for judging that whether information is useful or not is added into the algorithm in the LSTM, and a functional structure of the processor is called a cell.
Three gates are placed in one cell, which are an input gate, a forgetting gate and an output gate; one piece of information enters the LSTM network and can be judged to be useful or not according to a rule; and only information in conformity with the algorithm is reserved, and information which is not in conformity with the algorithm is forgotten by the forgetting gate.
A process of processing the information in the cell is shown as follows:
A first stage: a forgetting stage of the forgetting gate, wherein the stage is mainly used for selectively forgetting the input transmitted by the previous node; simply put, the stage is used for ‘forgetting unimportant information and remembering important information’; specifically, the decision is made by a sigmoid (S-shaped) network layer called the ‘forgetting gate layer’; the cell receives the legacy information ht-1 of the previous cell and the external information xt, and outputs a value between 0 and 1 for each number in the cell state Ct-1; 1 represents ‘completely accepting the information’, and 0 represents ‘completely neglecting the information’; and the forgetting formula is shown as (2):
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \tag{2}$$
wherein ft represents data information after being processed by the forgetting gate; Wf represents a weight matrix; bf represents an offset vector corresponding to the forgetting gate; ht-1 represents the legacy information of the previous cell; xt represents input external data information; and σ represents the sigmoid (S-shaped) activation function used by the forgetting gate, which maps each value into the range between 0 and 1.
A second stage: a cell state updating stage of the input gate, wherein the stage is used for selectively ‘remembering’ the input of the stage and comprises two parts: in a first part, a sigmoid (S-shaped) network layer called the ‘input gate layer’ determines which information needs to be updated, and in a second part, a tanh network layer establishes a new alternative value vector $\tilde{C}_t$, which can be added into the cell state; the above two parts are combined in the next step, so as to update the state;
Results obtained in the above two steps are added, so as to obtain Ct after state updating, and a cell state updating formula is shown as (3):
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t \tag{3}$$
wherein Ct represents a cell state after being updated; ft represents data information after being processed by the forgetting gate; Ct-1 represents a state before the cell is updated; $\tilde{C}_t$ represents the new alternative value vector established by the tanh network layer; and $i_t$ represents the output of the input gate, which determines how much of the alternative value vector is added into the cell state.
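The description names the input gate output and the alternative value vector without stating how they are computed. In the standard LSTM formulation, which is assumed here and uses the weight and offset symbols $W_i$, $b_i$, $W_C$ and $b_C$ that do not appear in the original text, they are commonly written as:

$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i), \qquad \tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$$

Under this assumption, $i_t$ lies between 0 and 1 and $\tilde{C}_t$ lies between −1 and 1, so the product $i_t * \tilde{C}_t$ in formula (3) is the portion of the new candidate information that is written into the cell state.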
A third stage: an output stage of the output gate, wherein the stage is used for deciding which information is regarded as the output of the current state; firstly, the sigmoid (S-shaped) network layer is operated to determine which parts of the cell state can be output; then, the cell state is passed through tanh (so that the numerical value is adjusted to between −1 and 1) and is multiplied by the output value of the sigmoid network layer, so that only the selected parts are output; and the output formulas are shown as (4) and (5):
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) \tag{4}$$
$$h_t = o_t * \tanh(C_t) \tag{5}$$
The meanings of symbols are the same as the meanings of the above symbols.
LSTM prediction is carried out on the data by adopting MATLAB (Matrix Laboratory) software, and prediction results are shown in Tab. 2; a diagram of the prediction results is shown in the drawings.
Step 5: analyzing the prediction results by adopting an error analysis method and evaluating the accuracy of the prediction method.
The results are explained by adopting the error analysis method based on the prediction results; and an error calculation formula is shown as (6):
$$C_t = \frac{Q_{ct} - Q_{yt}}{Q_{ct}} \tag{6}$$
wherein Ct represents the error percentage at a moment t; Qct represents the actual value at the moment t; Qyt represents the prediction value at the moment t; and the error analysis method can be used for effectively evaluating the prediction accuracy and proving the prediction accuracy.
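The error calculation of formula (6) can be sketched in Python as follows. The function name and the sample values are hypothetical and are not taken from Tab. 2; the sketch returns the error as a fraction (multiply by 100 for a percentage) and returns NaN where the actual value is zero, since formula (6) is undefined there.

```python
def error_percentage(actual, predicted):
    """Relative error (Q_ct - Q_yt) / Q_ct for each time point, as in formula (6)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    errors = []
    for q_c, q_y in zip(actual, predicted):
        if q_c == 0:
            # Formula (6) divides by the actual value, so it is undefined here.
            errors.append(float("nan"))
        else:
            errors.append((q_c - q_y) / q_c)
    return errors

# Hypothetical values for illustration only.
actual = [100.0, 200.0, 50.0]
predicted = [90.0, 210.0, 50.0]
errors = error_percentage(actual, predicted)
```

A positive value means the prediction is below the actual load at that time point, and a negative value means it is above.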
A diagram of error prediction percentage results is shown in the drawings.
It should be emphasized that the embodiments of the present invention are illustrative, rather than restrictive. Therefore, the present invention includes but is not limited to the embodiments in the detailed description. All other implementation manners obtained by those skilled in the art according to the technical solutions of the present invention belong to the protection scope of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
202110978765.2 | Aug 2021 | CN | national |