CHANCE CONSTRAINED EXTREME LEARNING MACHINE METHOD FOR NONPARAMETRIC INTERVAL FORECASTING OF WIND POWER

Information

  • Patent Application
  • 20220209532
  • Publication Number
    20220209532
  • Date Filed
    March 15, 2022
  • Date Published
    June 30, 2022
Abstract
The present application discloses a chance constrained extreme learning machine method for nonparametric interval forecasting of wind power, which belongs to the field of renewable energy generation forecasting. The method combines an extreme learning machine with a chance constrained optimization model, uses a chance constraint to ensure that the interval coverage probability is no less than the confidence level, and takes minimizing the interval width as the training objective. The method avoids relying on a hypothesized probability distribution or on fixed quantile levels for the interval bounds, and thus directly constructs prediction intervals with good reliability and sharpness. The present application also proposes a bisection search algorithm based on difference of convex functions optimization to achieve efficient training of the chance constrained extreme learning machine.
Description
TECHNICAL FIELD

The present application relates to a chance constrained extreme learning machine method for nonparametric interval forecasting of wind power, and belongs to the field of renewable energy power generation prediction.


BACKGROUND

At present, wind energy has become one of the main sources of renewable energy power generation due to its advantages such as the wide distribution of resources, mature development technology and low investment cost. However, the chaotic nature of the atmospheric system leads to significant intermittency and uncertainty in wind power, posing a great challenge to the secure operation of a power system with a large share of wind power integration. High-precision wind power prediction provides key information support for power system planning and construction, operational control, market transactions, etc., and is one of the important means of helping the power system effectively cope with wind power uncertainty.


Traditional wind power prediction focuses on deterministic point prediction with a single expected value as the output, so prediction errors are difficult to avoid. Probabilistic forecasting effectively quantifies the uncertainty of wind power prediction through prediction intervals, predictive quantiles and predictive probability distributions, provides richer and more comprehensive information for decision makers, and has thus become a research frontier in the field of wind power prediction. A prediction interval contains the future wind power with a certain confidence level, has a clear probabilistic interpretation and a concise mathematical form, and is therefore widely used in economic dispatch, optimal power flow, risk assessment and stability analysis of the power system. However, existing methods need to prescribe the quantile proportions corresponding to the interval bounds in advance when deriving the wind power prediction interval; the common practice is to force the interval bounds to be symmetrical about the median of the wind power to be predicted in the sense of probability. For the asymmetric probability distributions typical of wind power, this limitation leads to conservative interval widths and increases the potential operating cost for the power system to cope with wind power uncertainty. Therefore, it is necessary to invent a more flexible wind power interval prediction method that can adaptively determine the quantile proportions of the interval bounds and minimize the prediction interval width on the premise of meeting the nominal confidence level.


SUMMARY

In view of the limitations of existing wind power interval prediction methods, the present application provides a wind power nonparametric interval prediction method based on a chance constrained extreme learning machine. This method does not depend on a parametric hypothesis of the wind power probability distribution and does not need to specify in advance the quantile proportions corresponding to the interval bounds; instead, it directly generates wind power prediction intervals meeting the confidence level with the goal of minimizing the interval width, and can thus adapt to symmetric or asymmetric wind power probability distributions under time-varying conditions. It is also suitable for interval prediction of the generation power of other renewable energy sources and of load, and has good flexibility and adaptability.


In order to achieve the above purpose, the present application adopts the following technical solution:


A wind power nonparametric interval prediction method based on a chance constrained extreme learning machine, comprising the following steps of:


(1) constructing a chance constrained extreme learning machine model


using the extreme learning machine as the regression function for the upper and lower bounds of the wind power prediction interval, comprehensively considering the joint probability distribution of the wind power and its input feature, limiting the wind power to fall into the prediction interval with a probability not lower than a nominal confidence level by using a chance constraint, taking minimization of the expected interval width as the training objective, and constructing the chance constrained extreme learning machine model:







min_{Ο‰β„“, Ο‰u} 𝔼μ[f(x, Ο‰u) βˆ’ f(x, Ο‰β„“)]






which is subject to:








β„™ΞΌ[f(x, Ο‰β„“) ≀ y ≀ f(x, Ο‰u)] β‰₯ 100(1βˆ’Ξ²)%

0 ≀ f(x, Ο‰β„“) ≀ f(x, Ο‰u) ≀ 1




where x is a random variable corresponding to the input feature, y is a random variable corresponding to the normalized wind power, and the joint probability distribution of the two is denoted as ΞΌ(x, y); f(x, Ο‰β„“) and f(x, Ο‰u) are output equations of the extreme learning machine, which represent the lower and upper boundaries of the prediction interval, respectively; Ο‰β„“ and Ο‰u are weight vectors from the hidden layer of the extreme learning machine to the output neurons; 100(1βˆ’Ξ²)% is the nominal confidence level of the prediction interval; 𝔼μ and β„™ΞΌ denote the expectation and probability operators under ΞΌ, respectively;
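For concreteness, the structure of the output equations can be sketched as follows: an extreme learning machine maps the input through a fixed, randomly initialized hidden layer h(β‹…), and each interval bound is a linear function f(x, Ο‰) = Ο‰α΅€h(x) of the trainable output weights. The sketch below is illustrative only; the sigmoid activation, the dimensions and the example weights are assumptions of this sketch, not values prescribed by the present application.

```python
import numpy as np

rng = np.random.default_rng(0)
d, L = 4, 20                      # hypothetical input and hidden dimensions

# Randomly initialized and then frozen: input weights W and hidden biases b.
W = rng.normal(size=(L, d))
b = rng.normal(size=L)

def hidden(x):
    """Fixed hidden-layer feature map h(x); sigmoid activation assumed."""
    return 1.0 / (1.0 + np.exp(-(W @ x + b)))

def f(x, omega):
    """Output equation f(x, omega) = omega^T h(x): one interval bound."""
    return float(omega @ hidden(x))

# Only the two output-weight vectors are trained; here they are toy values
# chosen so that the lower bound stays below the upper bound.
omega_l = rng.normal(scale=0.01, size=L)
omega_u = omega_l + 0.02          # positive elementwise shift keeps the bounds ordered
x = rng.normal(size=d)
lower, upper = f(x, omega_l), f(x, omega_u)
print(lower <= upper)
```

Because h(x) is strictly positive under the sigmoid, the elementwise shift guarantees the ordering of the two bounds in this toy setting; in the method itself the ordering is enforced by the constraints of the optimization model.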


(2) constructing a sample average approximate model of the chance constrained extreme learning machine


replacing the joint probability distribution of the input feature and wind power with an empirical probability distribution of training set samples thereof, approximating an actual expectation in an objective function by empirical expectation, and approximating an actual probability in the chance constraint by empirical probability to obtain the sample average approximate model of the chance constrained extreme learning machine;







v* = min_{Ξ³t, Ο‰β„“, Ο‰u} Ξ£_{tβˆˆπ’―} [f(xt, Ο‰u) βˆ’ f(xt, Ο‰β„“)]







which is subject to:








Ξ³t = max{f(xt, Ο‰β„“) βˆ’ yt, yt βˆ’ f(xt, Ο‰u)}, βˆ€tβˆˆπ’―

Ξ£_{tβˆˆπ’―} [1 βˆ’ 𝕀(Ξ³t ≀ 0)] ≀ Ξ²|𝒯|

0 ≀ f(xt, Ο‰β„“) ≀ f(xt, Ο‰u) ≀ 1, βˆ€tβˆˆπ’―






where v* is the optimal value of the optimization model, which denotes the shortest overall width of prediction intervals satisfying the chance constraint; xt and yt are the input feature and the wind power of the t-th training sample; 𝒯 is the subscript set of the samples of the training set {(xt, yt)}tβˆˆπ’―, |𝒯| is the number of the samples of the training set, and Ξ²|𝒯| is the maximum number of wind power samples outside the prediction intervals at the nominal confidence level; Ξ³t is an auxiliary variable indicating whether the wind power falls into the prediction interval, with a non-positive value indicating that the wind power falls into the corresponding prediction interval and a positive value indicating that it does not; max{β‹…} is the maximum function, taking the maximum value of its arguments; 𝕀(β‹…) is the indicator function of a logical discriminant, with a value of 1 when the logical discriminant is true and 0 when it is false;
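The role of the auxiliary variable Ξ³t can be checked numerically: Ξ³t is non-positive exactly when the sample lies inside its interval, so counting positive Ξ³t gives the empirical miss count appearing in the chance constraint. The bounds and targets below are hypothetical toy data.

```python
import numpy as np

def gamma(lower, upper, y):
    # gamma_t = max{ f(x_t, w_l) - y_t, y_t - f(x_t, w_u) }:
    # non-positive iff y_t lies inside [lower_t, upper_t].
    return np.maximum(lower - y, y - upper)

# Toy data standing in for the training set.
lower = np.array([0.1, 0.2, 0.3, 0.4])
upper = np.array([0.5, 0.6, 0.7, 0.8])
y     = np.array([0.3, 0.1, 0.75, 0.5])   # samples 2 and 3 fall outside

g = gamma(lower, upper, y)
miss = np.sum(1 - (g <= 0))               # number of uncovered samples
beta, T = 0.05, len(y)
print(miss, miss <= beta * T)             # chance constraint check
```

With these toy values two samples are uncovered, so the empirical chance constraint miss ≀ β|𝒯| is violated at β = 0.05.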


(3) constructing a parametric 0-1 loss minimization model;


introducing a virtual parametric variable to represent an overall width budget of the prediction interval, minimizing the probability that the wind power does not fall into the prediction interval on a premise of meeting the width budget to obtain the parametric 0-1 loss minimization model;







ρ(v) = min_{Ξ³t, Ο‰β„“, Ο‰u} Ξ£_{tβˆˆπ’―} [1 βˆ’ 𝕀(Ξ³t ≀ 0)]







which is subject to:








Ξ³t β‰₯ f(xt, Ο‰β„“) βˆ’ yt, βˆ€tβˆˆπ’―

Ξ³t β‰₯ yt βˆ’ f(xt, Ο‰u), βˆ€tβˆˆπ’―

Ξ£_{tβˆˆπ’―} [f(xt, Ο‰u) βˆ’ f(xt, Ο‰β„“)] ≀ v

0 ≀ f(xt, Ο‰β„“) ≀ f(xt, Ο‰u) ≀ 1, βˆ€tβˆˆπ’―






where v is an introduced parameter representing the overall width budget of the prediction intervals; ρ(v) is the optimal value function of the parametric 0-1 loss minimization model with the parameter v, and the smallest parameter v satisfying the condition ρ(v) ≀ Ξ²|𝒯| is the shortest overall width v* of the prediction intervals that satisfy the chance constraint;


(4) constructing a parametric difference of convex functions optimization model


approximating the indicator function in the objective function of the parametric 0-1 loss minimization model by a difference of convex function to obtain the parametric difference of convex functions optimization model:








ρ(v) = min_{Ξ³t, Ο‰β„“, Ο‰u} Ξ£_{tβˆˆπ’―} [1 βˆ’ LDC(Ξ³t; m)]







which is subject to:









LDC(Ξ³t; m) = max{βˆ’mΞ³t, 0} βˆ’ max{βˆ’mΞ³t βˆ’ 1, 0}, βˆ€tβˆˆπ’―

Ξ³t β‰₯ f(xt, Ο‰β„“) βˆ’ yt, βˆ€tβˆˆπ’―

Ξ³t β‰₯ yt βˆ’ f(xt, Ο‰u), βˆ€tβˆˆπ’―

Ξ£_{tβˆˆπ’―} [f(xt, Ο‰u) βˆ’ f(xt, Ο‰β„“)] ≀ v

0 ≀ f(xt, Ο‰β„“) ≀ f(xt, Ο‰u) ≀ 1, βˆ€tβˆˆπ’―






where ρ(v) is the optimal value function of the parametric difference of convex functions optimization model with the parameter v; LDC(Ξ³t; m) is a difference of convex functions approximating the indicator function 𝕀(Ξ³t ≀ 0); m is a positive slope parameter of the difference of convex functions, and a greater value of m indicates higher similarity between the difference of convex functions LDC(Ξ³t; m) and the indicator function 𝕀(Ξ³t ≀ 0); the decision vector of the model is denoted as ΞΈ = [Ξ³α΅€ Ο‰β„“α΅€ Ο‰uα΅€]α΅€, where Ξ³ = [Ξ³1 Ξ³2 β‹― Ξ³|𝒯|]α΅€;
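A minimal numerical sketch of the surrogate LDC(Ξ³t; m), assuming only its definition above: it equals 1 for Ξ³t ≀ βˆ’1/m, decreases linearly to 0 on [βˆ’1/m, 0], and is 0 for Ξ³t β‰₯ 0, so a larger m yields a sharper approximation of the indicator 𝕀(Ξ³t ≀ 0).

```python
import numpy as np

def l_dc(gamma_t, m):
    """Difference-of-convex surrogate for the indicator I(gamma_t <= 0)."""
    return np.maximum(-m * gamma_t, 0.0) - np.maximum(-m * gamma_t - 1.0, 0.0)

m = 10.0
# Piecewise behavior: 1 on gamma_t <= -1/m, a linear ramp on [-1/m, 0],
# and 0 on gamma_t >= 0.
vals = l_dc(np.array([-1.0, -0.05, 0.0, 1.0]), m)
print(vals)
```

Note that the surrogate takes the value 0 at Ξ³t = 0 itself, approximating the indicator from below near the boundary.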


(5) adopting the difference of convex functions optimization based bisection search algorithm to train the extreme learning machine;


using the difference of convex functions optimization based bisection search algorithm to search for the shortest overall width v* of prediction intervals that satisfy the chance constraint, so as to realize the training of the extreme learning machine; and specifically comprising the following steps:


step (1), giving a bisection search algorithm precision ∈1 and a bisection search interval [vβ„“, vu], wherein the given bisection search interval should contain the shortest overall width v* of the prediction intervals;


Step (2): for the parametric difference of convex functions optimization, setting the parameter v thereof as the midpoint (vβ„“+vu)/2 of the bisection search interval, and solving the difference of convex functions optimization model by using a convex-concave procedure algorithm:


step (2.1): giving an algorithm convergence accuracy ∈2, the slope parameter m of the difference of convex functions, and the parameter v representing the overall width budget of the prediction intervals;


step (2.2): setting an iteration counter k←0; solving the following linear programming problem to obtain an initial solution ΞΈ(0) of the model:







ΞΈ(0) ← argmin_{Ξ³t, Ο‰β„“, Ο‰u} 1α΅€Ξ³





which is subject to:








Ξ³t β‰₯ 0, βˆ€tβˆˆπ’―

Ξ³t β‰₯ f(xt, Ο‰β„“) βˆ’ yt, βˆ€tβˆˆπ’―

Ξ³t β‰₯ yt βˆ’ f(xt, Ο‰u), βˆ€tβˆˆπ’―

Ξ£_{tβˆˆπ’―} [f(xt, Ο‰u) βˆ’ f(xt, Ο‰β„“)] ≀ v

0 ≀ f(xt, Ο‰β„“) ≀ f(xt, Ο‰u) ≀ 1, βˆ€tβˆˆπ’―






where 1 is a vector with all elements being 1, whose dimension is the same as the number of the samples in the training set;
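Once the hidden-layer outputs are fixed, each bound is linear in the output weights, so the initialization problem above is an ordinary linear program. The following sketch assembles it with `scipy.optimize.linprog`; the hidden-output matrix H, the targets y, and the budget v are hypothetical stand-ins introduced for this sketch, not data from the application.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
T, L = 30, 8                       # toy sizes: training samples, hidden neurons

# Hypothetical fixed hidden-layer outputs H (T x L) and normalized targets y,
# so that f(x_t, w) = H[t] @ w is linear in the output weights w.
H = 1.0 / (1.0 + np.exp(-rng.normal(size=(T, L))))
y = rng.uniform(0.2, 0.8, size=T)
v = 0.4 * T                        # overall interval-width budget

# Decision vector z = [gamma (T) | omega_l (L) | omega_u (L)].
n = T + 2 * L
c = np.concatenate([np.ones(T), np.zeros(2 * L)])   # objective 1^T gamma

rows, rhs = [], []
def row(gam=None, wl=None, wu=None):
    r = np.zeros(n)
    if gam is not None: r[:T] = gam
    if wl  is not None: r[T:T+L] = wl
    if wu  is not None: r[T+L:] = wu
    return r

for t in range(T):
    e = np.zeros(T); e[t] = 1.0
    rows.append(row(gam=-e, wl=H[t]));   rhs.append(y[t])    # H_t w_l - g_t <= y_t
    rows.append(row(gam=-e, wu=-H[t]));  rhs.append(-y[t])   # -H_t w_u - g_t <= -y_t
    rows.append(row(wl=-H[t]));          rhs.append(0.0)     # 0 <= H_t w_l
    rows.append(row(wl=H[t], wu=-H[t])); rhs.append(0.0)     # H_t w_l <= H_t w_u
    rows.append(row(wu=H[t]));           rhs.append(1.0)     # H_t w_u <= 1
rows.append(row(wl=-H.sum(axis=0), wu=H.sum(axis=0))); rhs.append(v)  # width budget

bounds = [(0, None)] * T + [(None, None)] * (2 * L)   # gamma_t >= 0
res = linprog(c, A_ub=np.vstack(rows), b_ub=np.array(rhs), bounds=bounds)
print(res.status)   # status 0 indicates the LP solved to optimality
```

The problem is always feasible in this form (ωℓ = ωu = 0 satisfies every constraint), so the initial point θ(0) exists for any non-negative width budget.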


step (2.3): updating the solution of the parametric difference of convex functions optimization model in a (k+1)th iteration by using the following formula:







ΞΈ(k+1) ← argmin_{Ξ³t, Ο‰β„“, Ο‰u} Lvex+(Ξ³) βˆ’ [Lvexβˆ’(Ξ³(k)) + (Ξ΄(k))α΅€(Ξ³ βˆ’ Ξ³(k))]






which is subject to:








Lvex+(Ξ³) = Ξ£_{tβˆˆπ’―} [1 + max{βˆ’mΞ³t βˆ’ 1, 0}]

Lvexβˆ’(Ξ³) = Ξ£_{tβˆˆπ’―} max{βˆ’mΞ³t, 0}

Ξ³t β‰₯ f(xt, Ο‰β„“) βˆ’ yt, βˆ€tβˆˆπ’―

Ξ³t β‰₯ yt βˆ’ f(xt, Ο‰u), βˆ€tβˆˆπ’―

Ξ£_{tβˆˆπ’―} [f(xt, Ο‰u) βˆ’ f(xt, Ο‰β„“)] ≀ v

0 ≀ f(xt, Ο‰β„“) ≀ f(xt, Ο‰u) ≀ 1, βˆ€tβˆˆπ’―






where Lvex+(Ξ³) and Lvexβˆ’(Ξ³) are both convex functions, which constitute a minuend and a subtrahend of the difference of convex function LDC(Ξ³t; m); Ξ΄(k) is a subdifferential of the convex function Lvexβˆ’(Ξ³) at Ξ³(k), satisfying








Ξ΄(k) ∈ {g ∈ ℝ^|𝒯| : Lvexβˆ’(Ξ³) β‰₯ Lvexβˆ’(Ξ³(k)) + gα΅€(Ξ³ βˆ’ Ξ³(k)), βˆ€Ξ³} = {[g1 g2 β‹― g|𝒯|]α΅€ : βˆ€tβˆˆπ’―, gt = βˆ’m if Ξ³t(k) < 0; gt ∈ [βˆ’m, 0] if Ξ³t(k) = 0; gt = 0 if Ξ³t(k) > 0}







where g ∈ ℝ^|𝒯| indicates that g is a real column vector with dimensionality equal to the number of samples |𝒯| in the training set;
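A subgradient satisfying the set definition above can be computed elementwise; picking gt = 0 at the kink Ξ³t = 0 is one admissible choice within [βˆ’m, 0], and is the convention used in this sketch.

```python
import numpy as np

def subgradient(gamma_k, m):
    # One valid subgradient of L_vex-(gamma) = sum_t max{-m * gamma_t, 0}
    # at the point gamma_k: slope -m where gamma_t < 0, slope 0 where
    # gamma_t > 0, and (by convention here) 0 at the kink gamma_t = 0.
    return np.where(gamma_k < 0, -m, 0.0)

g = subgradient(np.array([-0.3, 0.0, 0.2]), m=10.0)
print(g)
```

This is the vector Ξ΄(k) that linearizes the concave part of the objective in each convex-concave procedure iteration.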


step (2.4): incrementing the iteration counter, k←k+1; calculating the convergence error e←θ(k)βˆ’ΞΈ(kβˆ’1); and


step (2.5), checking whether the Euclidean norm βˆ₯eβˆ₯2 of the convergence error meets the convergence accuracy ∈2; if not, returning to step (2.3), otherwise outputting the converged solution θ←θ(k).


step (3): calculating the number miss of samples in the training set whose wind power falls outside the prediction interval:






miss ← Ξ£_{tβˆˆπ’―} [1 βˆ’ 𝕀(f(xt, Ο‰β„“) ≀ yt ≀ f(xt, Ο‰u))]






step (4), if miss ≀ Ξ²|𝒯|, updating the upper boundary vu←(vβ„“+vu)/2 of the bisection search interval and recording the weight vectors Ο‰β„“*←ωℓ and Ο‰u*←ωu of the output layer of the current extreme learning machine; otherwise, updating the lower boundary vℓ←(vβ„“+vu)/2 of the bisection search interval;


step (5): if vu βˆ’ vβ„“ ≀ ∈1, outputting the recorded weight vectors Ο‰β„“* and Ο‰u* of the output layer of the extreme learning machine; otherwise, returning to step (2).
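The outer loop of steps (1)-(5) is a standard bisection on the width budget: halve the interval and keep the half where the miss count crosses the allowance Ξ²|𝒯|. A sketch with a hypothetical, monotone stand-in for the inner solver of steps (2)-(3):

```python
def bisection_search(miss_of, v_lo, v_hi, budget_miss, eps=1e-3):
    """Search the shortest overall width v* with miss_of(v) <= budget_miss.
    miss_of(v) stands in for solving the DC-approximated problem at width
    budget v and counting uncovered training samples (steps (2)-(3))."""
    best = None
    while v_hi - v_lo > eps:
        v = 0.5 * (v_lo + v_hi)
        if miss_of(v) <= budget_miss:
            v_hi, best = v, v          # feasible: tighten from above
        else:
            v_lo = v                   # infeasible: raise the lower bound
    return best if best is not None else v_hi

# Toy monotone stand-in: wider budgets cover more samples.
miss_of = lambda v: max(0, int(10 - 20 * v))
v_star = bisection_search(miss_of, 0.0, 1.0, budget_miss=2)
print(round(v_star, 3))
```

Bisection relies on the miss count being non-increasing in the width budget, which holds because enlarging the budget only relaxes the feasible set of the inner problem.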


The method has the following beneficial effects:


Aiming at the problem of nonparametric interval forecasting of wind power, a chance constrained extreme learning machine model is proposed. The model uses a chance constraint to ensure that the coverage probability of the prediction intervals meets the reliability requirement, and minimizes the interval width, avoiding both the parametric hypothesis on the wind power probability distribution and the fixed quantile proportions of traditional prediction intervals, thereby realizing self-adaptive direct construction of wind power prediction intervals. Based on the training data, a sample average approximation model of the chance constrained extreme learning machine is established and transformed into a parametric 0-1 loss minimization model and a parametric difference of convex functions optimization model, so that the training of the extreme learning machine is transformed into searching for the shortest overall interval width meeting the chance constraint. A bisection search algorithm based on difference of convex functions optimization is proposed to realize efficient training of the chance constrained extreme learning machine. The wind power prediction intervals obtained by the method have a shorter interval width on the premise of ensuring a reliable confidence level, and provide more accurate uncertainty quantification for power system decision-making. In addition to wind power, the method of the present application is also applicable to interval forecasting of other renewable energy sources and loads, and thus has wide applicability.





BRIEF DESCRIPTION OF DRAWINGS


FIG. 1 is a flowchart of the chance constrained extreme learning machine based nonparametric interval forecast of the present application;



FIG. 2 displays the wind power prediction intervals of a summer data set obtained by the method of the present application.





DESCRIPTION OF EMBODIMENTS

The present application will be further explained with reference to the drawings and examples.


(1) a training data set {(xt, yt)}tβˆˆπ’― and a test data set {(xt, yt)}tβˆˆπ’± are constructed, wherein xt is an input feature, yt is a wind power value to be predicted, and 𝒯 and 𝒱 are the subscript sets of the samples in the training set and the test set, respectively; the number of hidden-layer neurons of the extreme learning machine is determined; the weight vector of the input layer and the bias of the hidden layer of the extreme learning machine are randomly initialized; and the nominal confidence level 100(1βˆ’Ξ²)% of the prediction intervals is determined.


(2) A sample average approximation model of the chance constrained extreme learning machine is constructed







min_{Ξ³t, Ο‰β„“, Ο‰u} Ξ£_{tβˆˆπ’―} [f(xt, Ο‰u) βˆ’ f(xt, Ο‰β„“)]






which is subject to:








Ξ³t = max{f(xt, Ο‰β„“) βˆ’ yt, yt βˆ’ f(xt, Ο‰u)}, βˆ€tβˆˆπ’―

Ξ£_{tβˆˆπ’―} [1 βˆ’ 𝕀(Ξ³t ≀ 0)] ≀ Ξ²|𝒯|

0 ≀ f(xt, Ο‰β„“) ≀ f(xt, Ο‰u) ≀ 1, βˆ€tβˆˆπ’―






where f(x, Ο‰β„“) and f(x, Ο‰u) are the output equations of the extreme learning machine, which represent the lower and upper boundaries of the prediction interval, respectively; Ο‰β„“ and Ο‰u are the weight vectors from the hidden layer of the extreme learning machine to the output neurons; Ξ³t is the auxiliary variable indicating whether the wind power falls into the prediction interval, with a non-positive value indicating that the wind power falls into the corresponding prediction interval and a positive value indicating that it does not; max{β‹…} is the maximum function, taking the maximum value of its arguments; 𝕀(β‹…) is the indicator function of a logical discriminant, taking 1 when the logical discriminant is true and 0 when it is false.


(3) The following parametric difference of convex functions optimization model is constructed:








ρ(v) = min_{Ξ³t, Ο‰β„“, Ο‰u} Ξ£_{tβˆˆπ’―} [1 βˆ’ LDC(Ξ³t; m)]







which is subject to:









LDC(Ξ³t; m) = max{βˆ’mΞ³t, 0} βˆ’ max{βˆ’mΞ³t βˆ’ 1, 0}, βˆ€tβˆˆπ’―

Ξ³t β‰₯ f(xt, Ο‰β„“) βˆ’ yt, βˆ€tβˆˆπ’―

Ξ³t β‰₯ yt βˆ’ f(xt, Ο‰u), βˆ€tβˆˆπ’―

Ξ£_{tβˆˆπ’―} [f(xt, Ο‰u) βˆ’ f(xt, Ο‰β„“)] ≀ v

0 ≀ f(xt, Ο‰β„“) ≀ f(xt, Ο‰u) ≀ 1, βˆ€tβˆˆπ’―






where v is the introduced parameter representing the overall width budget of the prediction intervals; ρ(v) is the optimal value function of the difference of convex functions optimization model with respect to the parameter v; LDC(Ξ³t; m) is a function that can be decomposed into the difference of two convex functions, and m is the slope parameter of the difference of convex functions; f(x, Ο‰β„“) and f(x, Ο‰u) are the output equations of the extreme learning machine; the decision vector of the model is denoted as ΞΈ = [Ξ³α΅€ Ο‰β„“α΅€ Ο‰uα΅€]α΅€, where Ξ³ = [Ξ³1 Ξ³2 β‹― Ξ³|𝒯|]α΅€.


(4) For the parameter v in the difference of convex functions optimization model, the bisection search algorithm is used to search for the shortest overall width v* of the prediction intervals that satisfy the chance constraint, so as to realize the training of the extreme learning machine; the algorithm specifically includes the following steps:


step (1), giving a bisection search algorithm precision ∈1 and a bisection search interval [vβ„“, vu], wherein the given bisection search interval should contain the shortest overall width v* of the prediction intervals;


Step (2): for the parametric difference of convex functions optimization, setting the parameter v thereof as the midpoint (vβ„“+vu)/2 of the bisection search interval, and solving the difference of convex functions optimization model by using a convex-concave procedure algorithm as described in steps (2.1)-(2.5):


step (2.1): giving the algorithm convergence accuracy ∈2, the slope parameter m of the difference of convex functions, and the parameter v representing the overall width budget of the prediction intervals;


step (2.2): setting the iteration counter k←0; solving the following linear programming problem to obtain an initial solution ΞΈ(0) of the model:







ΞΈ(0) ← argmin_{Ξ³t, Ο‰β„“, Ο‰u} 1α΅€Ξ³





which is subject to:








Ξ³t β‰₯ 0, βˆ€tβˆˆπ’―

Ξ³t β‰₯ f(xt, Ο‰β„“) βˆ’ yt, βˆ€tβˆˆπ’―

Ξ³t β‰₯ yt βˆ’ f(xt, Ο‰u), βˆ€tβˆˆπ’―

Ξ£_{tβˆˆπ’―} [f(xt, Ο‰u) βˆ’ f(xt, Ο‰β„“)] ≀ v

0 ≀ f(xt, Ο‰β„“) ≀ f(xt, Ο‰u) ≀ 1, βˆ€tβˆˆπ’―






where 1 is a vector with all elements being 1, whose dimension is the same as the number of the samples in the training set;


step (2.3): updating the solution of the parametric difference of convex functions optimization model in the (k+1)th iteration by the following formula:







ΞΈ(k+1) ← argmin_{Ξ³t, Ο‰β„“, Ο‰u} Lvex+(Ξ³) βˆ’ [Lvexβˆ’(Ξ³(k)) + (Ξ΄(k))α΅€(Ξ³ βˆ’ Ξ³(k))]






which is subject to:








Lvex+(Ξ³) = Ξ£_{tβˆˆπ’―} [1 + max{βˆ’mΞ³t βˆ’ 1, 0}]

Lvexβˆ’(Ξ³) = Ξ£_{tβˆˆπ’―} max{βˆ’mΞ³t, 0}

Ξ³t β‰₯ f(xt, Ο‰β„“) βˆ’ yt, βˆ€tβˆˆπ’―

Ξ³t β‰₯ yt βˆ’ f(xt, Ο‰u), βˆ€tβˆˆπ’―

Ξ£_{tβˆˆπ’―} [f(xt, Ο‰u) βˆ’ f(xt, Ο‰β„“)] ≀ v

0 ≀ f(xt, Ο‰β„“) ≀ f(xt, Ο‰u) ≀ 1, βˆ€tβˆˆπ’―






where Lvex+(Ξ³) and Lvexβˆ’(Ξ³) are both convex functions, which constitute a minuend and a subtrahend of the difference of convex functions LDC(Ξ³t; m) respectively; Ξ΄(k) is a subdifferential of the convex function Lvexβˆ’(Ξ³) at Ξ³(k), satisfying








Ξ΄(k) ∈ {g ∈ ℝ^|𝒯| : Lvexβˆ’(Ξ³) β‰₯ Lvexβˆ’(Ξ³(k)) + gα΅€(Ξ³ βˆ’ Ξ³(k)), βˆ€Ξ³} = {[g1 g2 β‹― g|𝒯|]α΅€ : βˆ€tβˆˆπ’―, gt = βˆ’m if Ξ³t(k) < 0; gt ∈ [βˆ’m, 0] if Ξ³t(k) = 0; gt = 0 if Ξ³t(k) > 0}







where g ∈ ℝ^|𝒯| is a real column vector with dimensionality equal to the number of samples |𝒯| in the training set;


step (2.4): incrementing the iteration counter, k←k+1; calculating the convergence error e←θ(k)βˆ’ΞΈ(kβˆ’1); and


step (2.5), checking whether the Euclidean norm βˆ₯eβˆ₯2 of the convergence error meets the convergence accuracy ∈2; if not, returning to step (2.3), otherwise outputting the converged solution θ←θ(k).


step (3): calculating the number miss of samples in the training set whose wind power falls outside the prediction interval:






miss ← Ξ£_{tβˆˆπ’―} [1 βˆ’ 𝕀(f(xt, Ο‰β„“) ≀ yt ≀ f(xt, Ο‰u))]






step (4), if miss ≀ Ξ²|𝒯|, updating the upper boundary vu←(vβ„“+vu)/2 of the bisection search interval and recording the weight vectors Ο‰β„“*←ωℓ and Ο‰u*←ωu of the output layer of the current extreme learning machine; otherwise, updating the lower boundary vℓ←(vβ„“+vu)/2 of the bisection search interval;


step (5): if vu βˆ’ vβ„“ ≀ ∈1, outputting the recorded weight vectors Ο‰β„“* and Ο‰u* of the output layer of the extreme learning machine; otherwise, returning to step (2).


(5) The trained extreme learning machine is used to construct the prediction intervals {[f(xt, Ο‰β„“), f(xt, Ο‰u)]}tβˆˆπ’± of the test set {(xt, yt)}tβˆˆπ’±, and the average coverage deviation (ACD) is used to evaluate the reliability of the prediction intervals, which is defined as the deviation between the empirical coverage probability (ECP) and the nominal confidence level 100(1βˆ’Ξ²)%:








ACD := ECP βˆ’ (1 βˆ’ Ξ²) = (1/|𝒱|) Ξ£_{tβˆˆπ’±} 𝕀(f(xt, Ο‰β„“) ≀ yt ≀ f(xt, Ο‰u)) βˆ’ (1 βˆ’ Ξ²)






where |𝒱| is the number of the samples in the test set; the smaller the absolute value of the average coverage deviation, the better the reliability of the prediction intervals.


The average width (AW) of the interval is used to evaluate the sharpness of the prediction interval, which is defined as






AW := (1/|𝒱|) Ξ£_{tβˆˆπ’±} [f(xt, Ο‰u) βˆ’ f(xt, Ο‰β„“)]






On the premise of good reliability of the prediction intervals, the smaller the average width of the prediction intervals, the higher the sharpness of the prediction intervals.
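Both evaluation indices can be sketched directly from their definitions; the interval bounds and targets below are hypothetical toy arrays standing in for a test set.

```python
import numpy as np

def interval_metrics(lower, upper, y, beta):
    """Empirical coverage probability (ECP), average coverage deviation
    (ACD) and average width (AW) of prediction intervals on a test set."""
    covered = (lower <= y) & (y <= upper)
    ecp = covered.mean()
    acd = ecp - (1.0 - beta)
    aw = np.mean(upper - lower)
    return ecp, acd, aw

lower = np.array([0.1, 0.2, 0.3, 0.4])
upper = np.array([0.5, 0.6, 0.7, 0.8])
y     = np.array([0.3, 0.1, 0.6, 0.9])   # samples 2 and 4 are uncovered
ecp, acd, aw = interval_metrics(lower, upper, y, beta=0.05)
print(ecp, acd, aw)
```

A good forecaster keeps |ACD| near zero (reliability) while making AW as small as possible (sharpness), which is exactly the trade-off the chance constrained training targets.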


The above process is shown in FIG. 1.


The effectiveness of the proposed method is verified with actual wind power data from the Glens of Foudland Wind Farm in Scotland in 2017, with a time resolution of 30 minutes. Considering the differences in seasonal characteristics, the wind power prediction model for each season is independently trained and verified, in which the training samples account for 60% of the data set in each season, and the remaining 40% of the samples are used as the test set. The lead time for prediction is 1 hour, and the nominal confidence level of the prediction intervals is 95%.


Table 1 shows the performance indices of the prediction intervals obtained by using sparse Bayesian learning, Gaussian kernel density estimation, and the method of the present application. It can be seen that the absolute value of the average coverage deviation of the present application is less than 1.4%, and the empirical coverage probability is close to the nominal confidence level of 95%, showing excellent reliability. The average coverage deviation of sparse Bayesian learning on the winter, summer and autumn data sets is below βˆ’2.6%, so it is difficult for that method to ensure the reliability of the prediction. Although Gaussian kernel density estimation has good reliability on all data sets except winter, the prediction intervals it obtains in winter, spring, summer and autumn are respectively 16.5%, 28.4%, 34.3% and 16.3% wider than those obtained by the method of the present application. In summary, the method of the present application can effectively shorten the interval width on the premise of satisfactory reliability of the prediction intervals.









TABLE 1
Performance comparison of prediction intervals obtained by different forecasting methods

Season | Method                                | Empirical coverage probability | Average coverage deviation | Average width of interval
Winter | Sparse Bayesian learning              | 90.92% | βˆ’4.08% | 0.2971
Winter | Gaussian kernel density estimation    | 92.05% | βˆ’2.95% | 0.3808
Winter | The method of the present application | 93.66% | βˆ’1.34% | 0.3268
Spring | Sparse Bayesian learning              | 93.58% | βˆ’1.42% | 0.2714
Spring | Gaussian kernel density estimation    | 93.39% | βˆ’1.61% | 0.3110
Spring | The method of the present application | 94.06% | βˆ’0.94% | 0.2423
Summer | Sparse Bayesian learning              | 91.53% | βˆ’3.47% | 0.2128
Summer | Gaussian kernel density estimation    | 95.76% |  0.76% | 0.2621
Summer | The method of the present application | 94.42% | βˆ’0.58% | 0.1952
Autumn | Sparse Bayesian learning              | 92.38% | βˆ’2.62% | 0.3572
Autumn | Gaussian kernel density estimation    | 96.37% |  1.37% | 0.4294
Autumn | The method of the present application | 95.01% |  0.01% | 0.3693










FIG. 2 shows the prediction intervals obtained by the method on the summer data set and the corresponding real wind power. It can be seen that the prediction intervals obtained by the proposed method track the ramp events of wind power well, and the width of the prediction intervals is adaptively adjusted according to the input features; the method thus has excellent performance. It should be noted that this method is also applicable to interval prediction of the generation power of renewable energy sources other than wind power and of load, and thus has wide applicability.


The above description of the specific embodiments of the present application is not intended to limit the scope of protection of the present application. All equivalent models or equivalent algorithm flowcharts made according to the content of the description and drawings of the present application, which are directly or indirectly applied to other related technical fields, all fall within the scope of patent protection of the present application.

Claims
  • 1. A chance constrained extreme learning machine method for nonparametric interval forecasting of wind power, comprising the following steps of: 1) constructing a chance constrained extreme learning machine model; 2) constructing a sample average approximation model of the chance constrained extreme learning machine; 3) constructing a parametric 0-1 loss minimization model; 4) constructing a parametric difference of convex functions optimization model; 5) adopting the bisection search algorithm based on difference of convex functions optimization to train the extreme learning machine; wherein the step 1) comprises: comprehensively considering the joint probability distribution of a wind power and an input feature thereof, limiting the wind power to fall into prediction intervals with a probability not lower than a nominal confidence level by using a chance constraint, taking an expectation of minimizing an interval width as a training objective, and constructing the chance constrained extreme learning machine model:
  • 2. The chance constrained extreme learning machine method for nonparametric interval forecasting of wind power according to claim 1, wherein a convex-concave procedure algorithm is adopted to solve the difference of convex functions optimization model with a given parameter, which comprises the following steps: step (1): giving an algorithm convergence accuracy ∈2, the slope parameter m of the difference of convex functions, and the parameter v representing the overall width budget of the prediction intervals;step (2): setting an iteration counter k←0; solving the following linear programming problem to obtain an initial solution ΞΈ(0) of the model:
Priority Claims (1)
Number Date Country Kind
202010255104.2 Apr 2020 CN national
CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation of International Application No. PCT/CN2021/080895, filed on Mar. 16, 2021, which claims priority to Chinese Application No. 202010255104.2, filed on Apr. 2, 2020, the contents of both of which are incorporated herein by reference in their entireties.

Continuations (1)
Number Date Country
Parent PCT/CN2021/080895 Mar 2021 US
Child 17694689 US