The present invention relates to the field of hydrologic forecasting, and more particularly to a method for calibrating monthly precipitation forecast by using a Gamma-Gaussian distribution.
Existing global climate models provide abundant precipitation forecast information. A monthly precipitation forecast for a basin, extracted from a global precipitation forecast product, can provide an important reference for reservoir scheduling, agricultural irrigation, flood control, and drought relief in the basin. Raw monthly forecast data correlate well with observed data. However, the forecasting system is complex and subject to random error, which hinders the practical application of the precipitation forecast and reduces the forecast skill to a certain extent.
The patent publication No. CN108830419A (publication date: Nov. 16, 2018) provides a joint forecast method for inflows of a cascaded reservoir group based on ECC post-processing, relates to the technical field of hydrologic forecasting, and discloses that a systematic error of ensemble numerical weather forecast data is calibrated on the basis of measured data and a Gamma distribution function. However, the random error remains in the method of that patent and still affects the forecast skill.
In order to overcome the defect in the prior art that the system complexity and the random error affect the precipitation forecast skill, the present invention provides a method for calibrating monthly precipitation forecast by using a Gamma-Gaussian distribution.
In order to solve the above technical problem, the technical solution of the present invention is as follows:
A method for calibrating monthly precipitation forecast by using a Gamma-Gaussian distribution, including the following steps:
S1, acquiring forecast data of monthly average precipitation in a watershed area and corresponding observed values of the average precipitation in the watershed area as input data;
S2, performing fitting on the input data by means of a Gamma distribution function;
S3, calculating a cumulative distribution function value of each input data in a corresponding Gamma distribution;
S4, transforming the cumulative distribution function values into variables obeying a standard normal distribution;
S5, constructing a joint normal distribution according to the variables obeying the standard normal distribution to characterize a correlation between the forecast data and the observed values in the input data; and
S6, randomly sampling the observed values according to the correlation, and inversely transforming acquired samples to obtain a calibrated forecast result.
Preferably, in the step S2, the Gamma distribution function is used to perform fitting on the forecast data and the observed values respectively to obtain marginal distributions of the raw forecast data and the observed values; the expression formula thereof is as follows:
$$F \sim G(\alpha_f, \beta_f), \qquad O \sim G(\alpha_o, \beta_o)$$
Wherein F denotes a set of K acquired forecast data [f1, f2, . . . , fK]; O denotes a set of K acquired observed values [o1, o2, . . . , oK]; G(⋅) denotes the Gamma distribution function; αf and βf denote Gamma distribution parameters of the forecast data obtained by means of fitting; and αo and βo denote Gamma distribution parameters of the observed values obtained by means of fitting.
Preferably, the Gamma distribution parameters αf, βf, αo, and βo are respectively calculated with a maximum likelihood estimation method.
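For reference, the maximum likelihood conditions can be written out explicitly. The sketch below assumes the shape-scale parameterization, with α as the shape and β as the scale; the source does not state which parameterization is intended. Here x_i stands for either the forecast data f_i or the observed values o_i, x̄ is their sample mean, and ψ is the digamma function; these symbols are introduced only for this illustration.

```latex
g(x;\alpha,\beta) = \frac{x^{\alpha-1}\, e^{-x/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}, \qquad x > 0

\ell(\alpha,\beta) = (\alpha-1)\sum_{i=1}^{K}\ln x_i \;-\; \frac{1}{\beta}\sum_{i=1}^{K} x_i \;-\; K\ln\Gamma(\alpha) \;-\; K\alpha\ln\beta

\frac{\partial \ell}{\partial \beta} = 0 \;\Rightarrow\; \hat{\beta} = \frac{\bar{x}}{\hat{\alpha}},
\qquad
\frac{\partial \ell}{\partial \alpha} = 0 \;\Rightarrow\; \ln\hat{\alpha} - \psi(\hat{\alpha}) = \ln\bar{x} - \frac{1}{K}\sum_{i=1}^{K}\ln x_i
```

The second condition has no closed-form solution for the shape parameter, so it is normally solved numerically.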
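Preferably, in the step S3, a cumulative distribution function in the corresponding Gamma distribution is used to calculate the cumulative distribution function values of each forecast data fi and of each observed value oi in the corresponding Gamma distribution; the expression formula thereof is as follows:
$$P_{f_i} = \mathrm{CDF}_{G(\alpha_f,\beta_f)}(f_i), \qquad P_{o_i} = \mathrm{CDF}_{G(\alpha_o,\beta_o)}(o_i)$$
Wherein Pfi and Poi denote the cumulative distribution function values of the forecast data fi and of the observed value oi in the corresponding fitted Gamma distributions.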
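Preferably, in the step S4, the cumulative distribution function values are regarded as quantiles of the standard normal distribution; the cumulative distribution function values are transformed into variables obeying the standard normal distribution by means of an inverse function of the cumulative distribution function in the standard normal distribution; and the expression formula thereof is as follows:
$$\hat{f}_i = \mathrm{PPF}_{N(0,1^2)}(P_{f_i}), \qquad \hat{o}_i = \mathrm{PPF}_{N(0,1^2)}(P_{o_i})$$
Wherein PPF_{N(0,1²)}(⋅) denotes the inverse function of the cumulative distribution function in the standard normal distribution; Pfi and Poi denote the cumulative distribution function values of the forecast data fi and the observed value oi obtained in the step S3; and f̂i and ôi denote the transformed variables, such that:
$$\hat{F} \sim N(0, 1^2), \qquad \hat{O} \sim N(0, 1^2)$$
Wherein N(0, 1²) denotes a standard normal distribution.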
Preferably, in the step S5, a joint normal distribution is constructed according to the variables F̂ and Ô obeying the standard normal distribution to characterize the correlation between the forecast data and the observed values in the input data; and the expression formula thereof is as follows:
$$\begin{pmatrix}\hat{F}\\ \hat{O}\end{pmatrix} \sim N\!\left(\begin{pmatrix}0\\ 0\end{pmatrix},\ \begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}\right)$$
Wherein ρ denotes the correlation between the variables F̂ and Ô.
Preferably, in the step S6, the specific steps are as follows:
S6.1, taking the transformed forecast data f̂ as a predictor, taking the transformed observed value ô corresponding to each forecast data as a predictand, and calculating a conditional probability distribution of the predictand, wherein the calculation formula is as follows:
$$\hat{o} \mid \hat{f} \sim N\!\left(\rho\hat{f},\ 1-\rho^{2}\right)$$
S6.2, randomly sampling from the conditional probability distribution of the transformed observed value ô, and inversely transforming the samples by means of the cumulative distribution function in the standard normal distribution and the inverse function of the cumulative distribution function in the Gamma distribution obtained by performing fitting on the observed values, so as to obtain a calibrated forecast result.
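The conditional distribution in S6.1 follows directly from the joint normal distribution with zero means, unit variances, and correlation ρ, and the inverse transformation in S6.2 can be written as one composed mapping. The symbols ε (an independent standard normal variable), Φ (the standard normal cumulative distribution function), G⁻¹ (the inverse of the fitted Gamma cumulative distribution function), and the sample index s are notation assumed here for illustration only.

```latex
\hat{o} = \rho\,\hat{f} + \sqrt{1-\rho^{2}}\;\varepsilon, \qquad \varepsilon \sim N(0,1),\ \ \varepsilon \perp \hat{f}
\;\;\Rightarrow\;\;
\hat{o} \mid \hat{f} \sim N\!\left(\rho\hat{f},\ 1-\rho^{2}\right)

o^{(s)} = G^{-1}_{\alpha_o,\beta_o}\!\left(\Phi\!\left(\hat{o}^{(s)}\right)\right), \qquad s = 1,\dots,S
```

The larger |ρ| is, the smaller the conditional variance 1−ρ² becomes, so a forecast that correlates strongly with the observations yields a narrower calibrated distribution.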
Preferably, the method further includes the following step: calculating a bias value and forecast skill according to the calibrated forecast result as forecast verification metrics.
Preferably, the method further includes the following step: drawing a forecast diagnostic diagram according to the calibrated forecast result, the bias value, and the forecast skill.
Preferably, in the forecast diagnostic diagram, a calibrated forecast median is used as the x axis; a precipitation forecast distribution interval and the observed values are used as the y axis; and the calculation results of the bias value and the forecast skill are inserted in the forecast diagnostic diagram for display.
Compared with the prior art, the beneficial effects of the technical solution of the present invention are: the present invention transforms the precipitation forecast and the observed data into normal distributions by means of the Gamma distribution, thereby avoiding the complex data normalization method; furthermore, the present invention constructs a joint normal distribution according to the variables obeying the standard normal distribution to characterize the correlation between the forecast data and the observed values in the input data, and further randomly samples the observed values according to the correlation, thereby effectively quantifying the random error, solving the problem that the system complexity and the random error affect the precipitation forecast skill, and effectively improving the forecast skill.
The drawings are used for illustrative purpose only, but should not be considered as a limitation to the present patent.
For a person skilled in the art, it is understandable that certain commonly known structures in the figures and the descriptions thereof can be omitted.
The technical solution of the present invention will be further described below with reference to the accompanying drawings and embodiments.
The present embodiment provides a method for calibrating monthly precipitation forecast by using a Gamma-Gaussian distribution.
The method for calibrating monthly precipitation forecast by using a Gamma-Gaussian distribution provided by the present embodiment includes the following steps:
S1, acquiring forecast data of monthly average precipitation in a watershed area and corresponding observed values of the average precipitation in the watershed area as input data.
S2, performing fitting on the input data by means of a Gamma distribution function.
Specifically, the Gamma distribution function is used to perform fitting on the forecast data and the observed values respectively to obtain marginal distributions of the raw forecast data and the observed values; the expression formula thereof is as follows:
$$F \sim G(\alpha_f, \beta_f), \qquad O \sim G(\alpha_o, \beta_o)$$
Wherein F denotes a set of K acquired forecast data [f1, f2, . . . , fK]; O denotes a set of K acquired observed values [o1, o2, . . . , oK]; G(⋅) denotes the Gamma distribution function; αf and βf denote Gamma distribution parameters of the forecast data obtained by means of fitting; and αo and βo denote Gamma distribution parameters of the observed values obtained by means of fitting.
Wherein the Gamma distribution parameters αf, βf, αo, and βo are respectively calculated with a maximum likelihood estimation method.
S3, calculating a cumulative distribution function value of each input data in a corresponding Gamma distribution.
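Specifically, a cumulative distribution function in the corresponding Gamma distribution is used to calculate the cumulative distribution function value of each forecast data fi and of each observed value oi in the corresponding Gamma distribution; the expression formula thereof is as follows:
$$P_{f_i} = \mathrm{CDF}_{G(\alpha_f,\beta_f)}(f_i), \qquad P_{o_i} = \mathrm{CDF}_{G(\alpha_o,\beta_o)}(o_i)$$
Wherein Pfi and Poi denote the cumulative distribution function values of the forecast data fi and of the observed value oi in the corresponding fitted Gamma distributions.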
S4, transforming the cumulative distribution function values into variables obeying a standard normal distribution.
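Specifically, the cumulative distribution function values are regarded as quantiles of the standard normal distribution; the cumulative distribution function values are transformed into variables obeying the standard normal distribution by means of an inverse function of the cumulative distribution function in the standard normal distribution; and the expression formula thereof is as follows:
$$\hat{f}_i = \mathrm{PPF}_{N(0,1^2)}(P_{f_i}), \qquad \hat{o}_i = \mathrm{PPF}_{N(0,1^2)}(P_{o_i})$$
Wherein PPF_{N(0,1²)}(⋅) denotes the inverse function of the cumulative distribution function in the standard normal distribution; Pfi and Poi denote the cumulative distribution function values of the forecast data fi and the observed value oi obtained in the step S3; and f̂i and ôi denote the transformed variables, such that:
$$\hat{F} \sim N(0, 1^2), \qquad \hat{O} \sim N(0, 1^2)$$
Wherein N(0, 1²) denotes a standard normal distribution.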
S5, constructing a joint normal distribution according to the variables obeying the standard normal distribution to characterize a correlation between the forecast data and the observed values in the input data.
Specifically, a joint normal distribution is constructed according to the variables F̂ and Ô obeying the standard normal distribution to characterize the correlation between the forecast data and the observed values in the input data; and the expression formula thereof is as follows:
$$\begin{pmatrix}\hat{F}\\ \hat{O}\end{pmatrix} \sim N\!\left(\begin{pmatrix}0\\ 0\end{pmatrix},\ \begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}\right)$$
Wherein ρ denotes the correlation between the variables F̂ and Ô.
S6, randomly sampling the observed values according to the correlation, and inversely transforming acquired samples to obtain a calibrated forecast result. The specific steps thereof are as follows:
S6.1, taking the transformed forecast data f̂ as a predictor, taking the transformed observed value ô corresponding to each forecast data as a predictand, and calculating a conditional probability distribution of the predictand, wherein the calculation formula is as follows:
$$\hat{o} \mid \hat{f} \sim N\!\left(\rho\hat{f},\ 1-\rho^{2}\right)$$
S6.2, randomly sampling from the conditional probability distribution of the transformed observed value ô, and inversely transforming the samples by means of the cumulative distribution function in the standard normal distribution and the inverse function of the cumulative distribution function in the Gamma distribution obtained by performing fitting on the observed values, so as to obtain a calibrated forecast result.
Further, the method further includes the following steps: calculating a bias value and forecast skill according to the calibrated forecast result as forecast verification metrics; and drawing a forecast diagnostic diagram according to the calibrated forecast result, the bias value, and the forecast skill.
Wherein in the forecast diagnostic diagram, a calibrated forecast median is used as the x axis; a precipitation forecast distribution interval and the observed values are used as the y axis; and the calculation results of the bias value and the forecast skill are inserted in the forecast diagnostic diagram for display.
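One possible layout of such a diagnostic diagram is sketched below with Matplotlib. The ensemble, the observations, and the two metric values are synthetic placeholders rather than results of the method, and the choice of [10, 90] and [25, 75] percentile intervals, the colours, and the label texts are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical data: 20 forecast cases, 1000 calibrated samples per case (mm)
scale = rng.uniform(20.0, 60.0, size=(20, 1))
ens = rng.gamma(2.5, scale, size=(20, 1000))
obs = rng.gamma(2.5, scale[:, 0])
bias, crpss = 1.2, 15.0  # placeholder metric values in percent, not real results

q10, q25, q50, q75, q90 = np.percentile(ens, [10, 25, 50, 75, 90], axis=1)

fig, ax = plt.subplots(figsize=(5, 5))
# x axis: calibrated forecast median; y axis: forecast intervals and observations
ax.vlines(q50, q10, q90, color="lightsteelblue", lw=2, label="[10, 90] interval")
ax.vlines(q50, q25, q75, color="steelblue", lw=4, label="[25, 75] interval")
ax.scatter(q50, obs, color="black", s=15, zorder=3, label="Observed")
ax.set_xlabel("Calibrated forecast median (mm)")
ax.set_ylabel("Forecast interval and observed values (mm)")
# Insert the verification metrics into the diagram
ax.text(0.05, 0.95, f"Bias = {bias:.1f}%\nCRPSS = {crpss:.1f}%",
        transform=ax.transAxes, va="top")
ax.legend(loc="lower right")
fig.canvas.draw()
```

In a well-calibrated forecast, roughly half of the observed points would be expected to fall inside the [25, 75] bars and about 80% inside the [10, 90] bars.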
The method for calibrating monthly precipitation forecast by using a Gamma-Gaussian distribution provided by the present embodiment can be implemented on an open source language platform Python.
In a specific implementation process, the function read_csv of the open source third party library Pandas of Python is used to read the precipitation forecast and the observed data from a pre-stored file, so as to obtain the input data to be acquired in the step S1. Then, the mathematical calculation processes in the steps S2-S6 are programmed in Python mainly by using the third party libraries Numpy and Scipy. The steps are encapsulated as classes and functions by means of the Python class and def statements, and these can then be called to calibrate the precipitation forecast.
After the calibrated forecast is obtained, the forecast verification metrics such as the bias and the forecast skill CRPSS are calculated by means of Numpy, and then a forecast diagnostic diagram is drawn by means of the third party library Matplotlib of Python, so as to compare and analyze the improvement effect of the calibrated forecast result in the present embodiment.
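The source names the bias and the CRPSS but does not give their formulas. The sketch below therefore assumes common definitions: a relative bias of the ensemble mean in percent, the ensemble CRPS estimator E|X − y| − 0.5 E|X − X'|, and a CRPSS measured against a climatological reference built from the observed sample. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def relative_bias(ens, obs):
    """Percentage bias of the ensemble mean relative to the observed mean (assumed definition)."""
    return 100.0 * (ens.mean() - obs.mean()) / obs.mean()

def crps_ensemble(ens, obs):
    """Mean CRPS over cases, with CRPS = E|X - y| - 0.5 * E|X - X'| per case."""
    m = ens.shape[1]
    term1 = np.abs(ens - obs[:, None]).mean(axis=1)
    # E|X - X'| from the sorted members: (2 / m^2) * sum_i (2i - m - 1) * x_(i)
    weights = 2 * np.arange(1, m + 1) - m - 1
    term2 = 2.0 * (np.sort(ens, axis=1) * weights).sum(axis=1) / m**2
    return (term1 - 0.5 * term2).mean()

def crpss(ens, obs):
    """Skill in percent relative to climatology (all observations used as the reference ensemble)."""
    clim = np.tile(obs, (obs.size, 1))
    return 100.0 * (1.0 - crps_ensemble(ens, obs) / crps_ensemble(clim, obs))

# Synthetic example: 24 cases, 200 members scattered around the observed value
rng = np.random.default_rng(6)
obs = rng.gamma(2.5, 35.0, size=24)
ens = obs[:, None] + rng.normal(0.0, 10.0, size=(24, 200))
bias = relative_bias(ens, obs)
skill = crpss(ens, obs)
```

Under these definitions a CRPSS of 100% corresponds to a perfect forecast, 0% to a forecast no better than climatology, and negative values to a forecast worse than climatology.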
In the present embodiment, the precipitation forecast and the observed data are transformed into normal distributions by means of the Gamma distribution, thereby avoiding the complex data normalization method; furthermore, a joint normal distribution is constructed according to the variables obeying the standard normal distribution to characterize the correlation between the forecast data and the observed values in the input data, and the observed values are randomly sampled according to the correlation, thereby effectively quantifying the random error, solving the problem that the system complexity and the random error affect the precipitation forecast skill, and effectively improving the forecast skill.
The present embodiment provides a specific implementation process. In such process, the method for calibrating monthly precipitation forecast by using a Gamma-Gaussian distribution provided by the embodiment 1 is used to calibrate, on the platform Python, the monthly precipitation forecast ECMWF-S2S of Beijiang in the Pearl River basin.
First, forecast data of monthly average precipitation in a watershed area and corresponding observed values of the average precipitation in the watershed area are acquired as input data; the input data of the present embodiment are stored in a csv file, as shown in table 1 and table 2.
Wherein the precipitation forecast data are the cumulative precipitation amounts over a thirty-day forecast period, forecast at the beginning of each month from January to December.
During implementation, the raw forecast data to be calibrated and the observed data are read by means of the function read_csv, and are respectively stored in the variables temp_x and temp_y.
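A minimal sketch of this reading step is given below. The layout of the csv file is not reproduced in the text, so the column names year, forecast, and observed, as well as the three rows of values, are made-up placeholders; an in-memory string stands in for the stored file.

```python
import io

import pandas as pd

# Placeholder content standing in for the stored csv file (hypothetical layout and values)
csv_text = """year,forecast,observed
2001,85.2,92.1
2002,110.4,101.7
2003,64.9,70.3
"""

data = pd.read_csv(io.StringIO(csv_text))
temp_x = data["forecast"].to_numpy()  # raw forecast data to be calibrated
temp_y = data["observed"].to_numpy()  # corresponding observed data
```

With a real file, the path to the csv file would be passed to read_csv in place of the StringIO object.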
A monthly precipitation forecast calibration model based on a Gamma-Gaussian distribution is constructed. The mathematical calculation processes in the steps S2-S6 in the embodiment 1 are executed mainly by means of the third party library Scipy, mainly including Gamma distribution fitting, joint normal distribution construction, and conditional probability distribution.
Specifically, the function stats.gamma.fit is used to perform Gamma distribution fitting on the mean value of the raw forecast data and on the observed values respectively; the Gamma distribution parameters are obtained with the maximum likelihood estimation method, and are stored in the variables para_x and para_y.
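The fitting call might look as follows. By default stats.gamma.fit also estimates a location parameter; fixing it at zero with floc=0, so that only the shape and scale are estimated, is an assumption made here to match the two-parameter Gamma distribution of the method. The input arrays are synthetic stand-ins.

```python
import numpy as np
from scipy import stats

# Synthetic stand-ins for the raw forecast mean (temp_x) and the observations (temp_y), in mm
rng = np.random.default_rng(0)
temp_x = rng.gamma(shape=2.0, scale=40.0, size=30)
temp_y = rng.gamma(shape=2.5, scale=35.0, size=30)

# Maximum likelihood fit; floc=0 pins the location at zero (assumption)
para_x = stats.gamma.fit(temp_x, floc=0)  # (alpha_f, loc, beta_f)
para_y = stats.gamma.fit(temp_y, floc=0)  # (alpha_o, loc, beta_o)
```

Each result is a tuple of shape, location, and scale, which can be unpacked directly into the other stats.gamma functions.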
A function stats.gamma.cdf is used to calculate the cumulative distribution function values of the raw forecast data and the observed data according to the fitted Gamma distribution parameters.
The function stats.norm.ppf is used to transform the calculated cumulative distribution function values into variables obeying a normal distribution, so as to normalize the raw forecast data and the observed values; the normalized data are respectively stored in the variables trans_x and trans_y, facilitating subsequent modeling. Furthermore, the module pyplot in Matplotlib and the function stats.probplot in Scipy are used to draw the quantile-quantile plot of the observed precipitation values before and after the transformation to verify the normality thereof, as shown in the accompanying drawings.
A joint normal distribution model is constructed; a correlation coefficient between the transformed forecast data variable trans_x and the observed value variable trans_y is calculated by means of the function stats.pearsonr, so as to characterize the correlation therebetween.
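The two transformation calls can be sketched as below for the observed series; the forecast series would be handled the same way with its own parameters. The clipping of the cumulative distribution function values away from 0 and 1 is an added safeguard assumed here so that stats.norm.ppf never returns infinite values, and the data are synthetic.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
temp_y = rng.gamma(shape=2.5, scale=35.0, size=30)  # synthetic observations (mm)
para_y = stats.gamma.fit(temp_y, floc=0)            # location fixed at zero (assumption)

# CDF value of each observation in the fitted Gamma distribution
cdf_y = stats.gamma.cdf(temp_y, *para_y)
cdf_y = np.clip(cdf_y, 1e-6, 1 - 1e-6)  # safeguard against 0 and 1 (assumption)

# Treat the CDF values as standard normal quantiles
trans_y = stats.norm.ppf(cdf_y)

# Normality check: correlation coefficient of the normal Q-Q fit before and after
_, (_, _, r_raw) = stats.probplot(temp_y, dist="norm")
_, (_, _, r_trans) = stats.probplot(trans_y, dist="norm")
```

A value of r_trans close to 1 indicates that the transformed values lie near a straight line on the normal quantile-quantile plot.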
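As an illustration, the correlation coefficient and the resulting parameters of the joint normal distribution can be obtained as follows. The function in current SciPy releases is stats.pearsonr; the transformed series here are synthetic, and the names mean and cov are chosen only for this sketch.

```python
import numpy as np
from scipy import stats

# Synthetic transformed series standing in for trans_x and trans_y
rng = np.random.default_rng(2)
trans_x = rng.standard_normal(30)
trans_y = 0.6 * trans_x + 0.8 * rng.standard_normal(30)

# Pearson correlation coefficient between the transformed forecast and observations
rho, p_value = stats.pearsonr(trans_x, trans_y)

# Parameters of the joint normal distribution: zero means, unit variances, correlation rho
mean = np.zeros(2)
cov = np.array([[1.0, rho], [rho, 1.0]])
```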
Conditional probability distribution parameters of the observed values, namely the mean and the standard deviation sigma, are calculated; the mean and the sigma are inputted into a function stats.norm.rvs as parameters to perform random sampling, so as to obtain 1000 samples.
The cumulative distribution function values of the 1000 randomly sampled samples in the standard normal distribution are calculated; and finally the samples are inversely transformed according to the Gamma distribution parameter para_y to obtain a calibrated forecast result.
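The sampling and inverse transformation can be sketched as follows. The correlation rho, the transformed forecast value, and the Gamma parameters in para_y are hypothetical numbers chosen for illustration, and the fixed random_state is used only to make the sketch reproducible.

```python
import numpy as np
from scipy import stats

rho = 0.6                  # hypothetical correlation from the joint normal model
trans_x_new = 0.5          # hypothetical transformed forecast to be calibrated
para_y = (2.5, 0.0, 35.0)  # hypothetical (alpha_o, loc, beta_o) of the observations

# Conditional distribution of the transformed observation given the forecast
mean = rho * trans_x_new
sigma = np.sqrt(1.0 - rho**2)

# Draw 1000 samples from N(mean, sigma^2)
samples = stats.norm.rvs(loc=mean, scale=sigma, size=1000, random_state=0)

# Back to probabilities, then back to precipitation through the fitted Gamma quantile function
cdf_samples = stats.norm.cdf(samples)
calibrated = stats.gamma.ppf(cdf_samples, *para_y)
```

The array calibrated then holds an ensemble of 1000 precipitation values whose spread reflects the random error that remains after conditioning on the forecast.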
For the forecast in each month, the raw forecast data is calibrated according to the above steps, and finally a group of calibrated forecast results of the month are obtained. The forecast data of 12 months are sequentially calibrated to obtain 12 groups of calibrated forecast results. And the 12 groups of raw forecast data and the calibrated forecast results are put together to perform forecast verification.
Specifically, the percentile function in Numpy is used to calculate the 10th, 25th, 50th, 75th, and 90th percentiles of the raw forecast and of the calibrated forecast respectively; and the function pyplot.plot in Matplotlib is used to draw precipitation forecast time series diagrams with the year on the x-axis and the precipitation amount on the y-axis, as shown in the accompanying drawings.
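The percentile calculation can be illustrated as below. The array layout, with one row per year and one column per sample, is an assumption, and the values are synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical calibrated forecast: 20 years, 1000 samples per year (mm)
calibrated = rng.gamma(shape=2.5, scale=35.0, size=(20, 1000))

levels = [10, 25, 50, 75, 90]
# One row per percentile level, one column per year
quantiles = np.percentile(calibrated, levels, axis=1)
median = quantiles[2]
```

Each row of quantiles can then be plotted against the year to form the bands of the time series diagram.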
The forecast skills CRPSS of the calibrated forecast from January to December are respectively −2.71%, 1.68%, −4.77%, 27.84%, 15.00%, 7.79%, 19.83%, 3.68%, −8.17%, 31.34%, −0.68%, and 33.41%.
A comparison of the above biases and forecast skills CRPSS shows that, after the method for calibrating monthly precipitation forecast by using a Gamma-Gaussian distribution is applied, the biases of the calibrated forecast result are clearly smaller than those of the raw forecast data; the biases of the calibrated forecast are basically within 1.5%, and the forecast skill CRPSS thereof is more stable.
In addition, the present embodiment can use the class and def statements in Python to encapsulate each step of the precipitation forecast calibration into functions, namely gamma_fit, trans_norm, back_trans, conditional_distribution, and gamma_gaussian; the functions are stored in a .py file; when in use, the precipitation forecast in a basin can be calibrated simply by calling the functions after an import statement.
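One way such an encapsulation could look is sketched below. Only the five names come from the text; the signatures, the use of plain functions rather than a class, the zero location in the Gamma fit, the clipping constant EPS, and the synthetic usage data are all assumptions.

```python
import numpy as np
from scipy import stats

EPS = 1e-6  # keeps CDF values away from 0 and 1 (assumption)

def gamma_fit(data):
    """Return (alpha, loc, beta) from a maximum likelihood fit with loc fixed at 0."""
    return stats.gamma.fit(data, floc=0)

def trans_norm(data, para):
    """Map data to standard normal space through the fitted Gamma CDF."""
    cdf = np.clip(stats.gamma.cdf(data, *para), EPS, 1 - EPS)
    return stats.norm.ppf(cdf)

def back_trans(z, para):
    """Map standard normal values back to precipitation space."""
    return stats.gamma.ppf(stats.norm.cdf(z), *para)

def conditional_distribution(z_fcst, rho):
    """Mean and standard deviation of the transformed observation given the forecast."""
    return rho * z_fcst, np.sqrt(1.0 - rho**2)

def gamma_gaussian(fcst, obs, new_fcst, n_samples=1000, seed=None):
    """Calibrate one new forecast from paired historical forecasts and observations."""
    para_x, para_y = gamma_fit(fcst), gamma_fit(obs)
    trans_x, trans_y = trans_norm(fcst, para_x), trans_norm(obs, para_y)
    rho, _ = stats.pearsonr(trans_x, trans_y)
    mean, sigma = conditional_distribution(trans_norm(new_fcst, para_x), rho)
    samples = stats.norm.rvs(loc=mean, scale=sigma, size=n_samples, random_state=seed)
    return back_trans(samples, para_y)

# Usage with synthetic data: a biased forecast that is correlated with the observations
rng = np.random.default_rng(5)
obs = rng.gamma(2.5, 35.0, size=30)
fcst = 0.7 * obs + rng.gamma(2.0, 20.0, size=30)
calibrated = gamma_gaussian(fcst, obs, new_fcst=120.0, seed=0)
```

If these definitions were saved in a .py file, the calibration would be available elsewhere through an ordinary import of gamma_gaussian.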
The same or similar reference signs correspond to the same or similar components.
The words for describing position relationships in the drawings are used for illustrative purpose only, but should not be considered as a limitation to the present patent.
It is obvious that the above embodiments of the present invention are only examples for clearly illustrating the present invention, and are not limitations on the embodiments of the present invention. A person skilled in the art may make various modifications or variations in other forms on the basis of the above description. It is unnecessary and impossible to exhaust all the embodiments herein. Any modifications, equivalent substitutions, improvements and the like made within the spirit and principles of the present invention are all intended to be included in the scope of protection of the claims of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
202011303631.2 | Nov 2020 | CN | national |
This application is a continuation of international application of PCT application serial no. PCT/CN2020/130457, filed on Nov. 20, 2020, which claims the priority benefit of China application no. 202011303631.2 filed on Nov. 19, 2020. The entirety of each of the above-mentioned patent applications is hereby incorporated by reference herein and made a part of this specification.
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/CN2020/130457 | Nov 2020 | US |
Child | 17948240 | | US |