METHOD FOR CALIBRATING DAILY PRECIPITATION FORECAST BY USING BERNOULLI-GAMMA-GAUSSIAN DISTRIBUTION

Information

  • Patent Application
  • 20230152488
  • Publication Number
    20230152488
  • Date Filed
    April 16, 2021
    3 years ago
  • Date Published
    May 18, 2023
    a year ago
Abstract
The present disclosure provides a method for calibrating daily precipitation forecast by using a Bernoulli-Gamma-Gaussian distribution, including the following steps: acquiring daily raw forecast data and observed data; using a Bernoulli distribution to perform precipitation occurrence analysis; using a Gamma distribution to perform precipitation amount analysis on the data that precipitation occurs; using a Gaussian distribution to perform normal transformation on the raw forecast data and the observed data according to the analysis results of the Bernoulli distribution and the Gamma distribution, and obtaining corresponding normalized variables; constructing a bivariate joint normal distribution; constructing a conditional probability distribution of a predictand; and determining whether a forecast to be calibrated is that a precipitation event occurs, determining a conditional probability distribution parameter of the predictand, then randomly sampling the conditional probability distribution of the predictand, and finally obtaining the calibrated forecast by means of inverse normal quantile transform.
Description
TECHNICAL FIELD

The present invention relates to the technical field of hydrologic forecast calibration, and more particularly to a method for calibrating daily precipitation forecast by using a Bernoulli-Gamma-Gaussian distribution.


BACKGROUND

Precipitation is a key link of hydrologic circulation. With the steady development of a global climate model, the global climate model can provide rich daily precipitation forecast information, and provide important reference for the management of watershed water resources and the implementation of flood control and drought fight work. Affected by the structure, parameters, initial and boundary conditions of the global climate model, a raw forecast generated thereby generally contains complex systematic errors and random errors, which is adverse to the engineering application of precipitation forecast. Therefore, at present, the precipitation forecast is calibrated mainly by fitting and analyzing monthly precipitation data by means of Gamma distribution.


However, for daily precipitation data, due to the non-negativity thereof, the daily precipitation data is in a complex skew and discrete-continuous mixed distribution, which brings difficulties to modeling and analysis. Under natural conditions, precipitation is non-negative, that is, precipitation amount does not have a negative value, and the minimum value thereof is zero. In conventional monthly precipitation modeling and analysis, in addition to an extremely droughty region, monthly precipitation generally does not contain the value zero, and the distribution thereof can be considered as an ordinary continuous distribution. Different from monthly precipitation, daily precipitation generally contains a certain amount of the value zero; therefore, the precipitation amount distribution is not a simple continuous distribution, but a discrete-continuous mixed distribution. Gamma distribution has no definition at the point zero. Therefore, daily precipitation cannot be directly modeled and analyzed by means of Gamma distribution, and daily precipitation forecast would be difficult to be calibrated.


SUMMARY OF THE INVENTION

In order to overcome the defect in the prior art that daily precipitation forecast is difficult to be analyzed and calibrated due to a skew and discrete-continuous mixed distribution of daily precipitation data, the present invention provides a method for calibrating daily precipitation forecast by using a Bernoulli-Gamma-Gaussian distribution.


In order to solve the above technical problem, the technical solution of the present invention is as follows:


A method for calibrating daily precipitation forecast by using a Bernoulli-Gamma-Gaussian distribution, including the following steps:


S1, acquiring raw forecast data of daily average precipitation in a watershed area and corresponding observed data of the average precipitation in the watershed area;


S2, using a Bernoulli distribution to perform precipitation occurrence analysis on the raw forecast data and the observed data;


S3, using a Gamma distribution to perform precipitation amount analysis on the raw forecast data and the observed data that precipitation occurs;


S4, performing normal transformation on the raw forecast data and the observed data according to the analysis results of the Bernoulli distribution and the Gamma distribution, and obtaining normalized variables {circumflex over (F)} and Ô corresponding to the raw forecast data and the observed data;


S5, constructing a bivariate joint normal distribution according to the normalized variables {circumflex over (F)} and Ô;


S6, using the normalized variable {circumflex over (F)} of the raw forecast data as a predictor and using the normalized variable Ô of the observed data as a predictand to construct a conditional probability distribution of the predictand Ô; and


S7, determining whether forecast data to be calibrated is that a precipitation event occurs, determining the conditional probability distribution of the predictand Ô, further randomly sampling the conditional probability distribution of the predictand Ô, and finally obtaining the calibrated forecast according to inverse normal quantile transform.


As a preferred solution, the step of using the Bernoulli distribution to perform precipitation occurrence analysis on the raw forecast data and the observed data includes:


S2.1, setting a raw forecast data threshold Tf and an observed data threshold To, and determining whether a precipitation event occurs: if the precipitation amount of the raw forecast data and the precipitation amount of the observed data are less than the corresponding thresholds Tf and To, then determining that no precipitation event occurs; and if the precipitation amount of the raw forecast data and the precipitation amount of the observed data are greater than or equal to the corresponding thresholds Tf and To, then determining that a precipitation event occurs;


S2.2, calculating probabilities qf and qo of the raw forecast data and the observed data that no precipitation event occurs according to the precipitation event occurrence determination result, wherein the calculation formula thereof is as follows:






q=K
0
/K


Wherein K0 denotes the number of the data samples that no precipitation event occurs, and K denotes the total number of the data samples; and performing Bernoulli distribution fitting according to the probability that no precipitation event occurs, wherein the expression formula thereof is as follows:






{




F


B

(

q
f

)







O


B

(

q
o

)









Wherein F=[f1, f2, . . . , fK] denotes the raw forecast data; O=[o1, o2, . . . , oK] denotes the observed data; and B( ) denotes the Bernoulli distribution.


As a preferred solution, the step of using a Gamma distribution to perform precipitation amount analysis on the raw forecast data and the observed data that precipitation occurs includes:


Recording raw forecast data samples which are analyzed to occur a precipitation event as Fc, recording observed data samples which are analyzed to occur a precipitation event as Oc, using the Gamma distribution to respectively fit the raw forecast data samples Fc and the observed data samples Oc, and obtaining marginal distributions thereof, wherein the expression formulas thereof are as follows:






{





F
c



G

(


α
f

,

β
f


)








O
c



G

(


α
o

,

β
0


)









Wherein G( ) denotes the Gamma distribution, and αf, βf, αo, and βo respectively denote Gamma distribution parameters of the raw forecast data and the observed data obtained by means of fitting.


As a preferred solution, the step of performing normal transformation on the raw forecast data and the observed data includes:


S4.1, transforming the raw forecast data and the observed data into corresponding cumulative distribution function values according to the analysis results of the Bernoulli distribution and the Gamma distribution, wherein the calculation formula thereof is as follows:






{





P

f
i


=


m
f

+


(

1
-

m
f


)




CDF

(


α
f

,

β
f


)


(

f
i

)










P

o
i


=


m
o

+


(

1
-

m
o


)




CDF

(


α
o

,

β
o


)


(

o
i

)











Wherein Pfi and Poi denote the cumulative distribution function values of the raw forecast data fi and the observed data oi in an i-th year, and i=1,2, . . . , K; CDFf, βf) and CDFo, βo) respectively denote cumulative distribution functions of the Gamma distribution of the raw forecast data and the observed data; and mf and mo respectively denote the cumulative distribution function values of the raw forecast data and the observed data that no precipitation event occurs;


S4.2, transforming the cumulative distribution function values Pfi and Poi into variables obeying a standard normal distribution by means of an inverse function of the cumulative distribution function in the standard normal distribution, wherein the expression formula thereof is as follows:






{






f
^

i

=


CDF

N

(

0
,

1
2


)



(

P

f
i


)









o
^

i

=


CDF

N

(

0
,

1
2


)





(

P

o
i


)










Wherein CDF′N(0,12)( ) denotes the inverse function of the cumulative distribution function in the standard normal distribution; and {circumflex over (f)}i and ôi respectively denote the variable of raw forecast and the variable of observation which were normal quantile transformed; therefore, the normalized variable {circumflex over (F)}=[{circumflex over (f)}1, {circumflex over (f)}2, . . . , {circumflex over (f)}K] of the raw forecast data and the normalized variable Ô=[ô1, ô2, . . . , ôK] of the observed data obey normal distribution.


As a preferred solution, the expression formula of the bivariate joint normal distribution of the normalized variables {circumflex over (F)} and Ô is as follows:







[




F
^






O
^




]



(


[



0




0



]

,

[



1


ρ




ρ


1



]


)





Wherein ρ denotes a correlation coefficient of the normalized variables {circumflex over (F)} and Ô.


As a preferred solution, the correlation coefficient ρ of the normalized variables {circumflex over (F)} and Ô is calculated with a maximum likelihood estimation method, wherein the calculation formula of a likelihood equation L is as follows:






L
=




i
=
1

K


l
i






Wherein li denotes a likelihood function of the raw forecast data fi and the observed data oi in the i-th year; and the expression formula of the likelihood equation l is as follows:






{




l
=


PDF
BN

=


1

2

π



1
-

ρ
2







e

-


1

2


(

1
-

ρ
2


)



[



f
^

2

-

2

ρ


f
^


o

+

o
2


]











if



f
^


>


f
^

T


,


o
^

>


o
^

T








l
=


1


2

π





e

-



f
^

2

2






CDF


ρ


f
^


,

1
-

ρ
2




(


o
^

T

)








if



f
^


>


f
^

T


,


o
^




o
^

T








l
=


1


2

π





e

-



o
^

2

2





CDF


ρ


o
^


,

1
-

ρ
2






(


f
^

T

)








if



f
^





f
^

T


,


o
^

>


o
^

T








l
=


CDF
BN

(



f
^

T

,


o
^

T


)







if



f
^





f
^

T


,


o
^




o
^

T










Wherein PDFBN denotes a probability density function of the standard bivariate joint normal distribution; CDFρ{circumflex over (f)},1−ρ2( ) and CDFρô,1−ρ2( ) respectively denote cumulative distribution functions of the conditional probability distributions of normalized forecasts and normalized observations in the bivariate joint normal distribution; CDFBN( ) denotes a cumulative distribution function in a standard bivariate joint normal distribution; {circumflex over (f)}T and ôT denote the normal distribution variables corresponding to the thresholds Tf and To.


As a preferred solution, in the likelihood equation L, a golden selection search algorithm is used to calculate a value of the correlation coefficient when the likelihood equation achieves a maximum value.


As a preferred solution, by using the normalized variable {circumflex over (F)}=[{circumflex over (f)}1, {circumflex over (f)}2, . . . , {circumflex over (f)}K] of the raw forecast data as a predictor and using the normalized variable Ô=[ô1, ô2, . . . , ôK] of the observed data as a predictand, the expression formula of the conditional probability distribution of the predictand Ô is obtained as follows:






{





o
^

|


f
^

:


N

(


ρ


f
^


,

1
-

ρ
2



)







if



f
^


>


f
^

T








o
^

|


f
^

:

N


(


ρ



f
^

r


,

1
-

ρ
2



)







if



f
^





f
^

T









Wherein ρ denotes the correlation coefficient of the normalized variables {circumflex over (F)} and Ô; {circumflex over (f)}r denotes a group of random numbers randomly sampled from the standard normal distribution N(0,12) and less than or equal to {circumflex over (f)}T.


As a preferred solution, the step of randomly sampling the corresponding the forecast data according to the conditional probability distribution and then obtaining the calibrated forecast according to inverse normal quantile transform includes: performing determination according to the conditional probability distribution:


If the forecast data to be calibrated {circumflex over (f)}>{circumflex over (f)}T, that is, if the forecast data to be calibrated {circumflex over (f)} is that a precipitation event occurs, then directly calculating a conditional probability distribution parameter of the predictand ô, so as to determine the conditional probability distribution N(ρ{circumflex over (f)},1−ρ2) of the predictand ô; then randomly sampling the conditional probability distribution of the predictand ô, and finally obtaining the calibrated forecast according to inverse normal quantile transform; and


If the forecast data to be calibrated {circumflex over (f)}≤{circumflex over (f)}T, that is, if the forecast data to be calibrated {circumflex over (f)} is that no precipitation event occurs, then first randomly sampling the standard normal distribution N(0,12) to obtain a group of normal distribution random numbers {circumflex over (f)}r less than or equal to {circumflex over (f)}T, and then calculating the conditional probability distribution parameter of the predictand ô according to the value of each of {circumflex over (f)}r, so as to determine the conditional probability distribution N(ρ{circumflex over (f)}r,1−ρ2) of the predictand ô; randomly sampling the conditional probability distribution of the predictand ô, sequentially performing the same calculation on each of {circumflex over (f)}r, and finally collecting all the samples and obtaining the calibrated forecast by means of inverse normal quantile transform.


As a preferred solution, the method further includes the following step: calculating a forecast bias, reliability and skill according to the calibrated forecast result as verification metrics to check a daily precipitation forecast calibration result.


Compared with the prior art, the beneficial effects of the technical solution of the present invention are: by combining Bernoulli distribution and Gamma distribution to model and analyze precipitation occurrence and precipitation amount, and then by combining Gaussian distribution to perform normal quantile transform on the daily precipitation in a skew distribution, the present invention can calibrate the daily precipitation forecast characterized by the skew distribution and the discrete-continuous mixed distribution, and can effectively improve the predictive performance of the daily precipitation forecast.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a flow chart of a method for calibrating daily precipitation forecast by using a Bernoulli-Gamma-Gaussian distribution according to the present invention.



FIG. 2 is a schematic diagram of the method for calibrating daily precipitation forecast by using a Bernoulli-Gamma-Gaussian distribution according to one embodiment.



FIG. 3 is a time series diagram of raw forecast data according to one embodiment.



FIG. 4 is a time series diagram of calibrated forecast according to one embodiment.



FIG. 5 is a diagnostic diagram of the raw forecast data according to one embodiment.



FIG. 6 is a diagnostic diagram of the calibrated forecast according to one embodiment.



FIG. 7 is a reliability diagnosis diagram of the raw forecast data according to one embodiment.



FIG. 8 is a reliability diagnosis diagram of the calibrated forecast according to one embodiment.





DETAILED DESCRIPTION OF EMBODIMENTS

The drawings are used for illustrative purpose only, but should not be considered as a limitation to the present patent.


For a person skilled in the art, that certain commonly known structures in the figures and the descriptions thereof may be omitted is understandable.


The technical solution of the present invention will be further described below with reference to the accompanying drawings and embodiments.


Embodiment

The present embodiment provides a method for calibrating daily precipitation forecast by using a Bernoulli-Gamma-Gaussian distribution. FIGS. 1-2 show flow charts of the method for calibrating daily precipitation forecast by using a Bernoulli-Gamma-Gaussian distribution according to the present embodiment.


The method for calibrating daily precipitation forecast by using a Bernoulli-Gamma-Gaussian distribution provided in the present embodiment includes the following steps:


Step 1, acquiring raw forecast data of daily average precipitation in a watershed area and corresponding observed data of the average precipitation in the watershed area.


In the present embodiment, the raw forecast data and the observed data are acquired by reading a third party library. The present embodiment performs statistical modeling and analysis according to the acquired or read raw forecast data and observed data of average precipitation in the watershed area, uses the Bernoulli distribution to analyze the precipitation occurrence process, and uses the Gamma distribution to analyze the precipitation amount in the precipitation event.


Step 2, using a Bernoulli distribution to perform precipitation occurrence analysis on the raw forecast data and the observed data.


In the present embodiment, “whether a precipitation event occurs” is regarded as a random variable; the variable has two possible values: if it is determined that a precipitation event occurs, the variable value is 1; and if it is determined that no precipitation event occurs, the variable value is 0. Therefore, the precipitation occurrence analysis process can be regarded as a


Bernoulli test, and the precipitation occurrence process can be analyzed by using a Bernoulli distribution. The specific steps include:


S2.1, setting a raw forecast data threshold Tf and an observed data threshold To, and determining whether a precipitation event occurs: if the precipitation amount of the raw forecast data and the precipitation amount of the observed data are less than the corresponding thresholds Tf and To, then determining that no precipitation event occurs; and if the precipitation amount of the raw forecast data and the precipitation amount of the observed data are greater than or equal to the corresponding thresholds Tf and To, then determining that a precipitation event occurs; wherein the thresholds Tf and To are 0 or a real number slightly greater than 0.


S2.2, calculating probabilities qf and qo (namely a failure probability in the Bernoulli distribution) of the raw forecast data and the observed data that no precipitation event occurs according to the precipitation event occurrence determination result, wherein the calculation formula thereof is as follows:






q=K
0/K


Wherein K0 denotes the number of the data samples that no precipitation event occurs, and K denotes the total number of the data samples; and performing Bernoulli distribution fitting according to the probability that no precipitation event occurs, wherein the expression formula thereof is as follows:






{




F


B

(

q
f

)







O


B

(

q
o

)









Wherein F=[f1, f2, . . . , fK] denotes the raw forecast data; O=[o1, o2, . . . , oK] denotes the observed data; and B( ) denotes the Bernoulli distribution.


Step 3, using a Gamma distribution to perform precipitation amount analysis on the raw forecast data and the observed data that precipitation occurs. The specific steps include:


Recording raw forecast data samples which are analyzed to be that a precipitation event occurs as Fc, recording observed data samples which are analyzed to be that a precipitation event occurs as Oc, wherein the samples Fc and Oc denote the precipitation amount in each precipitation event; using the Gamma distribution to respectively fit the raw forecast data samples Fc and the observed data samples Oc, and obtaining marginal distributions thereof, wherein the expression formulas thereof are as follows:






{





F
c

~

G

(


α
f

,

β
f


)








O
c

~

G

(


α
o

,

β
0


)









Wherein G( ) denotes the Gamma distribution, and αf, βf, αo, and βo respectively denote Gamma distribution parameters of the raw forecast data and the observed data obtained by means of fitting, wherein the Gamma distribution parameters are calculated with the maximum likelihood estimation method.


Step 4, performing normal transformation on the raw forecast data and the observed data according to the analysis results of the Bernoulli distribution and the Gamma distribution, and obtaining normalized variables {circumflex over (F)} and Ô corresponding to the raw forecast data and the observed data.


In the present step, a daily precipitation forecast calibration model based on the Bernoulli-Gamma-Gaussian distribution is constructed by combining the analysis results of the Bernoulli distribution and the Gamma distribution with the Gaussian distribution. The specific steps of using Gaussian distribution to perform normal transformation on the raw forecast data and the observed data are as follows:


S4.1, transforming the raw forecast data and the observed data into corresponding cumulative distribution function values according to the analysis results of the Bernoulli distribution and the Gamma distribution, wherein the calculation formula thereof is as follows:






{





P

f
i


=


m
f

+


(

1
-

m
f


)




CDF

(


α
f

,

β
f


)


(

f
i

)










P

o
i


=


m
o

+


(

1
-

m
o


)




CDF

(


α
o

,

β
o


)


(

o
i

)











Wherein Pfi and Poi denote the cumulative distribution function values of the raw forecast data fi and the observed data oi in an i-th year, and i=1,2, . . . , K; CDFf, βf) and CDFo, βo) respectively denote cumulative distribution functions of the Gamma distribution of the raw forecast data and the observed data; and mf and mo respectively denote the cumulative distribution function values of the raw forecast data and the observed data that no precipitation event occurs;


S4.2, transforming the cumulative distribution function values Pfi and Poi into variables obeying a standard normal distribution by means of an inverse function of the cumulative distribution function in the standard normal distribution, wherein the expression formula thereof is as follows:






{






f
ˆ

i

=


CDF

N

(

0
,

1
2


)



(

P

f
i


)









o
ˆ

i

=


CDF

N

(

0
,

1
2


)



(

P

o
i


)









Wherein CDF′N(0,12)( ) denotes the inverse function of the cumulative distribution function in the standard normal distribution; and {circumflex over (f)}i and ôi respectively denote the raw forecast variables and the observed variables which were the normal quantile transformed; therefore, the normalized variable {circumflex over (F)}=[{circumflex over (f)}1, {circumflex over (f)}2, . . . , {circumflex over (f)}K] of the raw forecast data and the normalized variable Ô=[ô1, ô2, . . . , ôK] of the observed data obey normal distribution.


In the present step, the value zero in the daily precipitation data is regarded as an unknown value less than or equal to zero, such that the discrete-continuous distribution of the precipitation is transformed into a single continuous distribution, thereby simplifying the subsequent modeling and analysis process. The step also results in censored data.


Step 5, constructing a bivariate joint normal distribution according to the normalized variables {circumflex over (F)} and Ô. The expression formula is as follows:







[




F
^






O
^




]

~

N

(


[



0




0



]

,

[



1


ρ




ρ


1



]


)





Wherein ρ denotes a correlation coefficient of the normalized variables {circumflex over (F)} and Ô.


Different from the monthly precipitation analysis, affected by the censored data, the correlation coefficient ρ cannot be directly calculated by formula. In the present embodiment, the correlation coefficient is calculated with the maximum likelihood estimation method, wherein the calculation formula of the likelihood equation L is as follows:






{




l
=


PDF
BN

=


1

2

π



1
-

ρ
2







e

-


1

2


(

1
-

ρ
2


)



[


f
2

-

2

ρ


f
^


o

+


o
^

2


]











if



f
^


>


f
^

T


,


o
^

>


o
^

T








l
=


1


2

π





e

-



f
^

2

2





CDF


ρ


f
^

.1

-

ρ
2





(


o
^

T

)








if



f
^


>


f
^

T


,


o
^




o
^

T








l
=


1


2

π





e

-



o
^

2

2






CDF


ρ


o
^

.1

-

ρ
2



(


f
^

T

)








if



f
^





f
^

T


,


o
^

>


o
^

T








l
=


CDF
BN

(



f
^

T

,


o
^

T


)







if



f
^





f
^

T


,


o
^




o
^

T










Wherein li denotes a likelihood equation of the raw forecast data fi and the observed data oi in the i-th year; and the expression formula of the likelihood equation 1 is as follows:






L
=




i
=
1

K


l
i






Wherein PDFBN denotes a probability density function of the standard bivariate joint normal distribution; CDFρ{circumflex over (f)},1−ρ2( ) and CDFρô,1−ρ2( ) respectively denote cumulative distribution functions of the conditional probability distributions of normalized forecasts and normalized observations in the bivariate joint normal distribution; CDFBN( ) denotes a cumulative distribution function in a standard bivariate joint normal distribution; {circumflex over (f)}T and ôT denote the normal distribution variables corresponding to the thresholds Tf and To.


The four likelihood equations 1 in the above formula respectively denote likelihood equations under four situations. The four likelihood equations from top to bottom respectively correspond to: 1) the raw forecast data and the observed data are both that a precipitation event occurs; 2) the raw forecast data is that a precipitation event occurs, and the observed data is that no precipitation event occurs; 3) the raw forecast data is that no precipitation event occurs, and the observed data is that a precipitation event occurs; and 4) the raw forecast data and the observed data are both that no precipitation event occurs.


According to the above likelihood equation L, a golden selection search algorithm is used to calculate a value of the correlation coefficient ρ when the likelihood equation achieves a maximum value.


Step 6, using the normalized variable {circumflex over (F)} of the raw forecast data as a predictor and using the normalized variable Ô of the observed data as a predictand to construct a conditional probability distribution of the predictand Ô. Wherein the expression formula of the conditional probability distribution of the predictand Ô is as follows:






{




ô
|


f
ˆ

:



N

(


ρ


f
ˆ


,

1
-

ρ
2



)



if



f
ˆ


>


f
ˆ

T









ô
|


f
ˆ

:



N

(


ρ



f
ˆ

r


,

1
-

ρ
2



)



if







f
ˆ





f
ˆ

T











Wherein ρ denotes the correlation coefficient of the normalized variables {circumflex over (F)} and Ô; {circumflex over (f)}r denotes a group of random numbers randomly sampled from the standard normal distribution N(0,12) under the condition that {circumflex over (f)} is less than or equal to {circumflex over (f)}r.


Step 7, determining whether forecast data to be calibrated is that a precipitation event occurs, determining the conditional probability distribution parameter of the predictand Ô, further randomly sampling the conditional probability distribution of the predictand, and finally obtaining the calibrated forecast according to inverse normal quantile transform.


Specifically, if the forecast data to be calibrated {circumflex over (f)}>{circumflex over (f)}T, that is, if the forecast data to be calibrated {circumflex over (f)} is that a precipitation event occurs, then directly calculating a conditional probability distribution parameter of the predictand ô, so as to determine the conditional probability distribution N(ρ{circumflex over (f)},1−ρ2) of the ô; then randomly sampling the conditional probability distribution of the ô, and finally obtaining the calibrated forecast according to inverse normal quantile transform; and


If the forecast data to be calibrated {circumflex over (f)}≤{circumflex over (f)}T, that is, if the forecast data to be calibrated {circumflex over (f)} is that no precipitation event occurs, then first randomly sampling the standard normal distribution N(0,12) to obtain a group of normal distribution random numbers {circumflex over (f)}r less than or equal to {circumflex over (f)}T, and then calculating the conditional probability distribution parameter of the predictand ô according to the value of each of {circumflex over (f)}r, so as to determine the conditional probability distribution N(ρ{circumflex over (f)}r,1−ρ2) of the predictand ô; randomly sampling the conditional probability distribution of the predictand ô, sequentially performing the same calculation on each of {circumflex over (f)}r, and finally collecting all the samples and obtaining the calibrated forecast by means of inverse normal quantile transform.


Further, the method further includes: calculating a forecast bias, reliability and skill according to the calibrated forecast result as a verification metric to verify a daily precipitation forecast calibration result, and analyzing the effect of the daily precipitation forecast calibration.


In the present embodiment, by combining Bernoulli distribution and Gamma distribution to model and analyze the different processes of precipitation occurrence and the precipitation amount and then by combining Gaussian distribution to perform normal quantile transform on the daily precipitation in a skew distribution, the method can be used to calibrate the daily precipitation forecast characterized by the skew distribution and the discrete-continuous mixed distribution, and can complete the calibration of the daily precipitation forecast by acquiring the raw forecast data and the observed data only during use.


In a specific embodiment, the ECMWF-S2S daily precipitation forecast of East River in the Pearl River basin is calibrated.


First, a Dataset function in a Python third party library netCDF4 is used to read raw forecast to be calibrated and corresponding observed data (.nc); corresponding data are extracted by means of variable attributes in the object, and are respectively stored in variables named as temp_fore and temp_obs. In the present embodiment, the precipitation forecast data is cumulative precipitation amount in a seven-day forecast period forecast at the beginning of January to December, and the observed data is the observed cumulative precipitation amount corresponding to the forecast.


The read raw forecast data and the observed data are modeled and analyzed: using the Bernoulli distribution to analyze precipitation occurrence probabilities in the samples, and respectively storing the probabilities in variables q_fore and q_obs; using the Gamma distribution to analyze the precipitation distribution characteristic, using a stats.gamma.fit function to fit the precipitation, and respectively storing the obtained parameters in variables para_fore and para_obs.


On such basis, the daily precipitation forecast calibration model based on the Bernoulli-Gamma-Gaussian distribution is constructed; the calibration model mainly includes normal quantile transform, joint distribution construction and conditional probability distribution, and is constructed mainly by means of third party libraries Numpy and Scipy:


1) using the functions stats.gamma.cdf and stats.norm.ppf to perform normal quantile transform on the raw forecast data and the observed data, so as to obtain forecast and observation variables obeying normal distribution, and respectively storing the forecast and observation variables in trans_fore and trans_obs;


2) constructing a bivariate joint normal distribution model, using the functions stats.norm and stats.multivariate_normal to construct a corresponding likelihood equation in a cyclic determination manner, storing the object thereof in a likelihood_function list as an objective function of the golden selection search algorithm, searching for the corresponding correlation coefficient to enable the likelihood equation to achieve a maximum value, and finally storing the corresponding correlation coefficient in the variable corr;


3) on the basis of the bivariate joint normal distribution, calculating the conditional probability distribution parameter of the observed value, wherein the standard deviation sigma can be directly calculated according to the correlation coefficient; if the forecast to be calibrated f is greater than the threshold T, then directly calculating the mean (namely the predictand), using the function stats.norm.rvs to randomly sample 1000 samples; if the forecast to be calibrated f is less than or equal to the threshold T, then first using the functions Numpy.random.uniform and stats.norm.ppf to randomly sample 1000 groups of random values of f, determining the mean (namely the predictand), and then using stats.norm.rvs to perform random sampling one by one; and


4) obtaining the calibrated forecast by means of inverse normal quantile transform according to the random samples obtained in 3).


Further, the sentences class ( ) and def ( ) in Python are used to encapsulate the above steps to form functions, including a Bernoulli_Gamma_Gaussian class, an NQT function, a bi_gaussian function, a golden_section_search function, a conditional_distribution function, a back_NQT and the like; and the functions are stored in a .py file.


The above functions are called by means of an import sentence in the Python; each group of precipitation forecast data are calibrated one by one, so as to obtain a calibrated cumulative seven-day precipitation forecast at the beginning of each month.


Further, Numpy is used to calculate the verification metrics such as forecast bias, reliability and skill and the like; and then Matplotlib is used to draw a forecast diagnosis diagram to verify the raw forecast and the calibrated forecast:


1) using the function nanpercentile in Numpy to respectively calculate the quantiles 10, 25, 50, 75, and 90 of the raw forecast and the calibrated forecast; and using the function pyplot.plot in the third party library Matplotlib to draw precipitation forecast and observation time series diagrams by taking year as the x-axis and the precipitation amount as the y-axis, as shown in FIG. 3 and FIG. 4;


2) on the basis of the quantiles 10, 25, 50, 75, and 90 of the raw forecast and the calibrated forecast, taking a median of an ensemble forecast as the x-axis and the precipitation amount as the y-axis to draw diagnostic diagrams for the raw forecast and the calibrated forecast, and using the pyplot.text to display a verification metric calculation result in the diagrams, as shown in FIG. 5 and FIG. 6; and


3) calculating a reliability verification metric PIT according to the raw forecast, the calibrated forecast, and the observed values, using the function Numpy.sort to perform sequencing in an ascending order, and taking the standard uniform distribution as the x-axis and the sequenced PIT value as the y-axis to draw a reliability diagnostic diagram, as shown in FIG. 7 and FIG. 8.


It can be seen from FIG. 3-FIG. 8 that the present embodiment can effectively calibrated a systematic precipitation forecast deviation for the daily precipitation forecast characterized by the skew distribution and the discrete-continuous mixed distribution, improves the forecast precision, thereby facilitating the engineering application of precipitation forecast. Furthermore, during use, the present invention can complete the calibration of the daily precipitation forecast only by acquiring corresponding raw forecast data and the observed data without manually setting parameters. In addition, the present embodiment encapsulates the steps of the present invention to form functions, and has good portability on the basis of an open source Python language platform, thereby facilitating the application to different systems and platforms.


It is obvious that the above embodiments of the present invention are only examples for clearly illustrating the present invention, but not limitations to the embodiments of the present invention. A person skilled in the art may make various modifications or variations in other forms on the basis of the above description. It is unnecessary and impossible to exhaust all the embodiments herein. Any modifications, equivalent substitutions, improvements and the like made within the spirit and principles of the present invention are all intended to be concluded in the scope of protection of the claims of the present invention.

Claims
  • 1. A method for calibrating daily precipitation forecast by using a Bernoulli-Gamma-Gaussian distribution, comprising the following steps: S1, acquiring raw forecast data of a daily average precipitation in a watershed area and corresponding observed data of an average precipitation in the watershed area;S2, using a Bernoulli distribution to perform a precipitation occurrence analysis on the raw forecast data and observed data;S3, using a Gamma distribution to perform a precipitation amount analysis on the raw forecast data and the observed data that precipitation occurs;S4, performing a normal transformation on the raw forecast data and the observed data according to analysis results of the Bernoulli distribution and the Gamma distribution, and obtaining normalized variable {circumflex over (F)} and normalized variable Ô corresponding to the raw forecast data and the observed data;S5, constructing a bivariate joint normal distribution according to the normalized variable {circumflex over (F)} and normalized variable Ô;S6, using the normalized variable {circumflex over (F)} of the raw forecast data as a predictor and using the normalized variable Ô of the observed data as a predictand to construct a conditional probability distribution of the predictand Ô; andS7, determining whether forecast data to be calibrated is that a precipitation event occurs, determining the conditional probability distribution of the predictand Ô, further randomly sampling the conditional probability distribution of the predictand Ô, and finally obtaining the calibrated forecast according to an inverse normal quantile transform.
  • 2. The method for calibrating daily precipitation forecast according to claim 1, wherein the step of using the Bernoulli distribution to perform the precipitation occurrence analysis on the raw forecast data and the observed data comprises: S2.1, setting a raw forecast data threshold Tf and an observed data threshold To, and determining whether the precipitation event occurs: if a precipitation amount of the raw forecast data and the precipitation amount of the observed data are less than the corresponding thresholds Tf and To, then determining that no precipitation event occurs; and if the precipitation amount of the raw forecast data and the precipitation amount of the observed data are greater than or equal to the corresponding thresholds Tf and To, then determining that the precipitation event occurs;S2.2, calculating probabilities qf and qo of the raw forecast data and the observed data that no precipitation event occurs according to a precipitation event occurrence determination result, wherein a calculation formula thereof is as follows: q=K0/K wherein K0 denotes a number of the data samples that no precipitation event occurs, and K denotes a total number of the data samples; and performing a Bernoulli distribution fitting according to the probabilities that no precipitation event occurs, wherein expression formulas thereof are as follows:
  • 3. The method for calibrating daily precipitation forecast according to claim 2, wherein the step of using the Gamma distribution to perform the precipitation amount analysis on the raw forecast data and the observed data that precipitation occurs comprises: recording raw forecast data samples which are analyzed to be that the precipitation event occurs as Fc, recording observed data samples which are analyzed to be that the precipitation event occurs as Oc, using the Gamma distribution to respectively fit the raw forecast data samples Fc and the observed data samples Oc, and obtaining marginal distributions thereof, wherein the expression formulas thereof are as follows:
  • 4. The method for calibrating daily precipitation forecast according to claim 3, wherein the step of performing the normal transformation on the raw forecast data and the observed data comprises: S4.1, transforming the raw forecast data and the observed data into corresponding cumulative distribution function values according to the analysis results of the Bernoulli distribution and the Gamma distribution, wherein the calculation formula thereof is as follows:
  • 5. The method for calibrating daily precipitation forecast according to claim 4, wherein the expression formulas of the bivariate joint normal distribution of the normalized variable {circumflex over (F)} and normalized variable Ô is as follows:
  • 6. The method for calibrating daily precipitation forecast according to claim 5, wherein the correlation coefficient ρ of the normalized variable {circumflex over (F)} and normalized variable Ô is calculated with a maximum likelihood estimation method, and the calculation formula of a likelihood equation L is as follows:
  • 7. The method for calibrating daily precipitation forecast according to claim 6, wherein in the likelihood equation L, a golden selection search algorithm is used to calculate a value of the correlation coefficient when the likelihood equation achieves a maximum value.
  • 8. The method for calibrating daily precipitation forecast according to claim 5, wherein by using the normalized variable {circumflex over (F)}=[{circumflex over (f)}1, {circumflex over (f)}2, . . . , {circumflex over (f)}K] of the raw forecast data as a predictor and using the normalized variable Ô=[ô1, ô2, . . . , ôK] of the observed data as a predictand, the expression formulas of the conditional probability distribution of the predictand Ô is obtained as follows:
  • 9. The method for calibrating daily precipitation forecast according to claim 8, wherein the step of randomly sampling the corresponding the forecast data according to the conditional probability distribution and then obtaining the calibrated forecast according to the inverse normal quantile transform comprises: performing determination according to the conditional probability distribution: if the forecast data to be calibrated {circumflex over (f)}>{circumflex over (f)}T, that is, if the forecast data to be calibrated {circumflex over (f)} is that the precipitation event occurs, then directly calculating a conditional probability distribution parameter of the predictand ô, so as to determine the conditional probability distribution N(ρ{circumflex over (f)},1−ρ2) of the predictand ô; then randomly sampling the conditional probability distribution of the predictand ô, and finally obtaining the calibrated forecast according to the inverse normal quantile transform; andif the forecast data to be calibrated {circumflex over (f)}≤{circumflex over (f)}T, that is, if the forecast data to be corrected {circumflex over (f)} is that no precipitation event occurs, then first randomly sampling the standard normal distribution N(0,12) to obtain a group of normal distribution random numbers {circumflex over (f)}r less than or equal to {circumflex over (f)}T, and then calculating the conditional probability distribution parameter of the predictand ô according to the value of each of {circumflex over (f)}r, so as to determine the conditional probability distribution N(ρ{circumflex over (f)}r,1−ρ2) of the predictand ô; randomly sampling the conditional probability distribution of the predictand ô, sequentially performing the same calculation on each of {circumflex over (f)}r, and finally collecting all the samples and obtaining the calibrated forecast by means of the inverse normal quantile transform.
  • 10. The method for calibrating daily precipitation forecast according to claim 1, wherein the method further comprises the following step: calculating a forecast bias, reliability, and skill according to the calibrated forecast result as a verification metric to verify a daily precipitation forecast calibration result.
  • 11. The method for calibrating daily precipitation forecast according to claim 2, wherein the method further comprises the following step: calculating a forecast bias, reliability, and skill according to the calibrated forecast result as a verification metric to verify a daily precipitation forecast calibration result.
  • 12. The method for calibrating daily precipitation forecast according to claim 3, wherein the method further comprises the following step: calculating a forecast bias, reliability, and skill according to the calibrated forecast result as a verification metric to verify a daily precipitation forecast calibration result.
  • 13. The method for calibrating daily precipitation forecast according to claim 4, wherein the method further comprises the following step: calculating a forecast bias, reliability, and skill according to the calibrated forecast result as a verification metric to verify a daily precipitation forecast calibration result.
  • 14. The method for calibrating daily precipitation forecast according to claim 5, wherein the method further comprises the following step: calculating a forecast bias, reliability, and skill according to the calibrated forecast result as a verification metric to verify a daily precipitation forecast calibration result.
  • 15. The method for calibrating daily precipitation forecast according to claim 6, wherein the method further comprises the following step: calculating a forecast bias, reliability, and skill according to the calibrated forecast result as a verification metric to verify a daily precipitation forecast calibration result.
  • 16. The method for calibrating daily precipitation forecast according to claim 7, wherein the method further comprises the following step: calculating a forecast bias, reliability, and skill according to the calibrated forecast result as a verification metric to verify a daily precipitation forecast calibration result.
  • 17. The method for calibrating daily precipitation forecast according to claim 8, wherein the method further comprises the following step: calculating a forecast bias, reliability, and skill according to the calibrated forecast result as a verification metric to verify a daily precipitation forecast calibration result.
  • 18. The method for calibrating daily precipitation forecast according to claim 9, wherein the method further comprises the following step: calculating a forecast bias, reliability, and skill according to the calibrated forecast result as a verification metric to verify a daily precipitation forecast calibration result.
PCT Information
Filing Document Filing Date Country Kind
PCT/CN2021/087676 4/16/2021 WO