SYSTEM AND METHOD FOR PROBABILISTIC FORECASTING USING MACHINE LEARNING WITH A REJECT OPTION

Information

  • Patent Application
  • Publication Number
    20220327408
  • Date Filed
    April 07, 2022
  • Date Published
    October 13, 2022
Abstract
A computer-implemented system and method for training a machine learning model are disclosed, the method includes: maintaining a data set representing a neural network having a plurality of weights; receiving input data comprising a plurality of time series data sets ending with timestamp t−1; generating, using the neural network and based on the input data, a probabilistic forecast distribution prediction at timestamp t and a selection value associated with the probabilistic forecast distribution prediction at timestamp t; computing a loss function based on the selection value; and updating at least one of the plurality of weights of the neural network based on the loss function.
Description
FIELD

Embodiments of this disclosure relate to implementing machine learning models to predict probability distribution for various practical applications, and in particular, some embodiments relate to implementing machine learning models for probabilistic forecasting.


BACKGROUND

In some applications of machine learning models, range predictions and probabilistic forecasting can be a useful alternative to simple point predictions. However, in some scenarios, inaccurate predictions and uncertainties can have harmful downstream consequences.


SUMMARY

According to an aspect of the present disclosure, there is provided a computer-implemented system for training a neural network for probabilistic forecasting, the system may include: at least one processor; memory in communication with the at least one processor; instructions stored in the memory, which when executed at the at least one processor causes the system to: maintain a data set representing a neural network having a plurality of weights; receive input data comprising a plurality of time series data sets ending with timestamp t−1; generate, using the neural network and based on the input data, a probabilistic forecast distribution prediction at timestamp t and a selection value associated with the probabilistic forecast distribution prediction at timestamp t; compute a loss function based on the selection value; and update at least one of the plurality of weights of the neural network based on the loss function.


In some embodiments, the probabilistic forecast distribution prediction at timestamp t may include a mean and a variance of the probabilistic forecast distribution prediction.


In some embodiments, the instructions when executed at the at least one processor causes the system to: when the selection value is higher than or equal to a threshold value, store the probabilistic forecast distribution prediction at timestamp t as a valid prediction.


In some embodiments, the instructions when executed at the at least one processor causes the system to: process the stored probabilistic forecast distribution prediction at timestamp t to generate a predicted electricity consumption report.


In some embodiments, the instructions when executed at the at least one processor causes the system to: process the stored probabilistic forecast distribution prediction at timestamp t to generate a future financial forecasting statement.


In some embodiments, the instructions when executed at the at least one processor causes the system to: when the selection value is lower than a threshold value, reject the probabilistic forecast distribution prediction at timestamp t.


In some embodiments, the instructions when executed at the at least one processor causes the system to: generate a signal for causing, at a display device, a display of a graphical user interface showing that the probabilistic forecast distribution prediction at timestamp t has been rejected.


In some embodiments, the instructions when executed at the at least one processor causes the system to: generate a second signal for causing, at the display device, a display of a graphical user interface showing the threshold value.


In some embodiments, the instructions when executed at the at least one processor causes the system to: generate a third signal for causing, at the display device, a display of a graphical user interface showing a graphical user element for modifying the threshold value.


In some embodiments, the neural network comprises a recurrent neural network (RNN) represented by Φ based on:






mi,t+1,vi,t+1,si,t+1,hi,t+1=Φ(mi,t,hi,t;θ), wherein:


mi,t+1 represents a mean value of the probabilistic forecast distribution prediction at timestamp t+1 for the ith sample;


vi,t+1 represents a variance vi,t+1 of the probabilistic forecast distribution prediction at timestamp t+1 for the ith sample;


si,t+1 represents the selection value associated with the probabilistic forecast distribution prediction at timestamp t+1 for the ith sample;


mi,t represents a mean value of a probabilistic forecast distribution prediction at timestamp t for the ith sample;


θ represents one or more learnable model parameters for the recurrent neural network;


hi,t represents a hidden state vector at timestamp t for the ith sample; and


hi,t+1 represents a hidden state vector at timestamp t+1 for the ith sample.


According to another aspect of the present disclosure, there is provided a computer-implemented method for training a neural network for probabilistic forecasting, the method may include: maintaining a data set representing a neural network having a plurality of weights; receiving input data comprising a plurality of time series data sets ending with timestamp t−1; generating, using the neural network and based on the input data, a probabilistic forecast distribution prediction at timestamp t and a selection value associated with the probabilistic forecast distribution prediction at timestamp t; computing a loss function based on the selection value; and updating at least one of the plurality of weights of the neural network based on the loss function.


In some embodiments, the probabilistic forecast distribution prediction at timestamp t may include a mean and a variance of the probabilistic forecast distribution prediction.


In some embodiments, the method may include: when the selection value is higher than or equal to a threshold value, storing the probabilistic forecast distribution prediction at timestamp t as a valid prediction.


In some embodiments, the method may include: processing the stored probabilistic forecast distribution prediction at timestamp t to generate a predicted electricity consumption report.


In some embodiments, the method may include: processing the stored probabilistic forecast distribution prediction at timestamp t to generate a future financial forecasting statement.


In some embodiments, the method may include: when the selection value is lower than a threshold value, rejecting the probabilistic forecast distribution prediction at timestamp t.


In some embodiments, the method may include: generating a signal for causing, at a display device, a display of a graphical user interface showing that the probabilistic forecast distribution prediction at timestamp t has been rejected.


In some embodiments, the method may include: generating a second signal for causing, at the display device, a display of a graphical user interface showing the threshold value and a graphical user element for modifying the threshold value.


In some embodiments, the neural network may be a recurrent neural network (RNN) represented by Φ based on:






mi,t+1,vi,t+1,si,t+1,hi,t+1=Φ(mi,t,hi,t;θ), wherein:


mi,t+1 represents a mean value of the probabilistic forecast distribution prediction at timestamp t+1 for the ith sample;


vi,t+1 represents a variance vi,t+1 of the probabilistic forecast distribution prediction at timestamp t+1 for the ith sample;


si,t+1 represents the selection value associated with the probabilistic forecast distribution prediction at timestamp t+1 for the ith sample;


mi,t represents a mean value of a probabilistic forecast distribution prediction at timestamp t for the ith sample;


θ represents one or more learnable model parameters for the recurrent neural network;


hi,t represents a hidden state vector at timestamp t for the ith sample; and


hi,t+1 represents a hidden state vector at timestamp t+1 for the ith sample.


According to yet another aspect of the present disclosure, there is provided a non-transitory computer readable memory having stored thereon a data set representing a neural network and instructions for training the neural network, the instructions, when executed at the at least one processor, causes a system having the at least one processor to: maintain the data set representing the neural network having a plurality of weights; receive input data comprising a plurality of time series data sets ending with timestamp t−1; generate, using the neural network and based on the input data, a probabilistic forecast distribution prediction at timestamp t and a selection value associated with the probabilistic forecast distribution prediction at timestamp t; compute a loss function based on the selection value; and update at least one of the plurality of weights of the neural network based on the loss function.


According to an aspect of the present disclosure, there is provided a computer-implemented method for a range-based machine learning architecture. The method includes: with input data including a plurality of series of data values, training a machine learning model to generate a forecast probability distribution and a select/reject value, wherein predictions with a select/reject value below a threshold are rejected; and storing a data set and/or instructions representing the trained machine learning model in a computer-readable memory.


According to another aspect, there is provided a device or system comprising: at least one memory for storing a trained machine learning model and at least one processor. The at least one processor is configured to: with input data including a plurality of series of data values, train a machine learning model to generate a forecast probability distribution and a select/reject value, wherein predictions with a select/reject value below a threshold are rejected; and store a data set and/or instructions representing the trained machine learning model in a computer-readable memory.


According to a further aspect, there is provided a non-transitory computer readable memory having stored thereon data and/or instructions representing a machine learning model trained to generate a forecast probability distribution and a select/reject value, wherein predictions with a select/reject value below a threshold are rejected. The machine learning model, when instantiated by at least one processor, is configured to receive input query data to generate a forecast prediction and an associated select/reject value.


Other features will become apparent from the drawings in conjunction with the following description.





BRIEF DESCRIPTION OF DRAWINGS

In the figures which illustrate example embodiments,



FIG. 1 is a schematic diagram of a computer-implemented system for training a neural network for probabilistic forecasting, in accordance with an embodiment;



FIG. 2A is a schematic diagram of a machine learning agent of the system of FIG. 1, in accordance with an embodiment;



FIG. 2B is a schematic diagram of an example neural network, in accordance with an embodiment;



FIG. 3 is an illustration of an example recurrent neural network model for probabilistic forecasting in auto-regressive (AR) style;



FIG. 4 is a block diagram showing aspects of example hardware components of a computing device for a range-based machine learning architecture;



FIG. 5 is a flowchart showing aspects of an example process for a range-based machine learning architecture; and



FIG. 6 is a flowchart showing aspects of an example process for training a neural network for probabilistic forecasting.





DETAILED DESCRIPTION

In some forecasting applications associated with a high degree of uncertainty, a range prediction can be desired from the forecasting model. A probabilistic forecasting neural network model can be configured to output a probability distribution, allowing the model to capture spread or multi-modal characteristics in the data that would be lost with a simple point prediction. However, when the model is extremely uncertain about its prediction, for example due to the uniqueness of the input with respect to previously observed training examples, it may be preferable for the model to abstain from making a prediction at all, instead of making a very poor prediction with potentially harmful downstream consequences.


When operating in highly uncertain environments, how to train a probabilistic forecasting model with a reject option can be a challenge.


One approach would be to threshold the spread of the predictive distribution as a post-processing rule. However, a post-processing approach can be sub-optimal because the reject option is not taken into consideration during training.


In accordance with aspects of the present application, an example end-to-end machine learning probabilistic forecasting model that integrates a selection/reject option during training is described herein. In some scenarios, by incorporating the selection/reject option during training, the proposed approach may avoid unnecessarily dedicating optimization resources to fitting highly uncertain parts of the input space, as the highly uncertain inputs would likely be rejected anyway.


Unlike a conventional regression model that outputs a single point estimate, a probabilistic forecasting model outputs a probability distribution. For example, probabilistic forecasting is a technique commonly used in weather forecasting to establish the probability of an event's occurrence or magnitude. This differs substantially from deterministic forecasting, which gives definite information on the occurrence or magnitude of the same event. In some applications, probabilistic forecasting enables a richer representation of the data for downstream applications (e.g. a distribution can capture “spread” or multi-modal characteristics in the data that would be lost with a point estimate).


In some embodiments, an example system (DeepAR) can be based on training an auto-regressive recurrent neural network (RNN) model on a large number of related time series. In another example, a DeepState model can combine a traditional state-space model with RNNs; this model parameterizes a per-time-series linear state-space model with a jointly-learned RNN. In another example, a Probabilistic TCN model uses probabilistic convolutional networks to estimate probability density under both parametric and non-parametric settings. In another example, a Stochastic TCN model combines the computational advantages of TCNs with the representational power of stochastic latent spaces.



FIG. 1 is a high-level schematic diagram of a computer-implemented system 100 for instantiating and training machine learning agents 200 having a machine learning neural network, in accordance with an embodiment. A machine learning agent 200 can be an automated agent 200 that leverages a neural network 110 to perform actions based on input data. The neural network 110 may be an RNN, such as a long short-term memory (LSTM) network or a gated recurrent unit (GRU) network.


In various embodiments, system 100 is adapted to perform certain specialized purposes. In some embodiments, system 100 is adapted to instantiate and train automated agents 200 for generating valid probability distribution data forecasts for downstream processing. Probability distribution data may include a range, that is, a probability value may be assigned for each of a plurality of outcomes. The plurality of outcomes may be associated with one or more timestamps.


As will be appreciated, system 100 is adaptable to instantiate and train automated agents 200 for a wide range of purposes and to complete a wide range of tasks. For example, automated agent 200 may generate probability distribution forecasts for predicting future electricity usage and consumption during a future time frame. In other embodiments, system 100 is adapted to instantiate and train automated agents 200 for predicting financial statements or financial spending amounts in a future time period for a particular user.


Once an automated agent 200 has been trained, it generates output data reflective of its decisions to take particular actions in response to particular input data. Input data include, for example, values of a plurality of state variables relating to an environment being explored by an automated agent 200 or a task being performed by an automated agent 200. In some embodiments, input data may include time series data.


Probability Distribution for Electricity Consumption

In some embodiments, input data may include time series data sets representing historical data on electricity consumption for a user or an area. The time series data sets may be used by a training engine 116 to train a neural network 110. The training engine 116 may receive a signal representing a selection value from a select/reject engine 114, which computes the selection value based on a number of parameters associated with the neural network 110. The training engine 116 may compute a loss function based on the selection value, and update weights of the neural network 110 based on the loss function.


Once properly trained, the neural network 110 may be used by agent 200 to generate a probability distribution prediction for electricity consumption for the same user or the same area at a particular future timestamp relative to the historical data. The probability distribution prediction at the future timestamp may be associated with a selection value, which may be used to evaluate the probability distribution prediction.


When the selection value is beneath a certain predetermined threshold, the probability distribution prediction may be rejected. Conversely, when the selection value is above or equal to the predetermined threshold, the probability distribution prediction may be selected and stored as a valid data set.


In some embodiments, if the probability distribution prediction is rejected, agent 200 may generate signals to display a graphical user interface (GUI), at a display device (e.g., at an interface application 130 at a user device), indicating that the generated probability distribution prediction for the particular future timestamp is rejected. In some embodiments, agent 200 may generate a further signal to display, at the display device (e.g., at the interface application 130 at the user device), a GUI showing the threshold used in rejecting the probability distribution prediction for the particular future timestamp, and may optionally generate another signal for displaying a graphical user element for modifying the threshold.


When the selection value is above or equal to the predetermined threshold, the probability distribution prediction may be selected and stored as a valid data set. In the case of an electricity consumption report at the future timestamp, the probability distribution prediction may be used to determine a future pricing for electricity for one or more users in an area.


Probability Distribution for Weather Forecasting

In some embodiments, input data may include time series data sets representing historical data on weather patterns, such as rain accumulation, for a certain area. The time series data sets may be used by a training engine 116 to train a neural network 110. The training engine 116 may receive a signal representing a selection value from a select/reject engine 114, which computes the selection value based on a number of parameters associated with the neural network 110. The training engine 116 may compute a loss function based on the selection value, and update weights of the neural network 110 based on the loss function.


Once properly trained, the neural network 110 may be used by agent 200 to generate a probability distribution prediction for rain accumulation for the same area at a particular future timestamp (e.g., “15+/−10 mm of rain is forecasted for New York next Wednesday”). The probability distribution prediction at the future timestamp may be associated with a selection value, which may be used to evaluate the probability distribution prediction.


When the selection value is beneath a certain predetermined threshold, the probability distribution prediction may be rejected. Conversely, when the selection value is above or equal to the predetermined threshold, the probability distribution prediction may be selected and stored as a valid data set.


In some embodiments, if the probability distribution prediction is rejected, agent 200 may generate signals to display a graphical user interface (GUI), at a display device (e.g., at an interface application 130 at a user device), indicating that the generated probability distribution prediction for the particular future timestamp is rejected. In some embodiments, agent 200 may generate a further signal to display, at the display device (e.g., at the interface application 130 at the user device), a GUI showing the threshold used in rejecting the probability distribution prediction for the particular future timestamp, and may optionally generate another signal for displaying a graphical user element for modifying the threshold.


When the selection value is above or equal to the predetermined threshold, the probability distribution prediction may be selected and stored as a valid data set. In the case of predicting rain accumulation for a certain area, the probability distribution prediction may be processed to generate a weather forecast for a future time period, which may include a prediction of rain accumulation and other weather patterns on one or more days in the future time period.


Probability Distribution for Financial Reporting

In some embodiments, input data may include time series data sets representing historical data on financial statements, such as utility bills, for a user. The time series data sets may be used by a training engine 116 to train a neural network 110. The training engine 116 may receive a signal representing a selection value from a select/reject engine 114, which computes the selection value based on a number of parameters associated with the neural network 110. The training engine 116 may compute a loss function based on the selection value, and update weights of the neural network 110 based on the loss function.


Once properly trained, the neural network 110 may be used by agent 200 to generate a probability distribution for predicting a financial statement or utility amount for the user in a future time frame (e.g., “there is a probability of 80% of your utility bill being in the range of 80-100 USD for next month.”). The probability distribution prediction at the future timestamp may be associated with a selection value, which may be used to evaluate the probability distribution prediction.


When the selection value is beneath a certain predetermined threshold, the probability distribution prediction may be rejected. Conversely, when the selection value is above or equal to the predetermined threshold, the probability distribution prediction may be selected and stored as a valid data set.


In some embodiments, if the probability distribution prediction is rejected, agent 200 may generate signals to display a graphical user interface (GUI), at a display device (e.g., at an interface application 130 at a user device), indicating that the generated probability distribution prediction for the particular future timestamp is rejected. In some embodiments, agent 200 may generate a further signal to display, at the display device (e.g., at the interface application 130 at the user device), a GUI showing the threshold used in rejecting the probability distribution prediction for the particular future timestamp, and may optionally generate another signal for displaying a graphical user element for modifying the threshold.


When the selection value is above or equal to the predetermined threshold, the probability distribution prediction may be selected and stored as a valid data set. For example, the probability distribution prediction may be processed to generate a predicted financial spending of a certain category, or an overall financial overview, of the user at a future time frame, and the user may be notified, via a generated signal at a user device, regarding his or her predicted financial bill (e.g., an estimated utility bill amount).


Referring back to FIG. 1, system 100 includes an I/O unit 102, a processor 104, a communication interface 106, and a data storage 120.


I/O unit 102 enables system 100 to interconnect with one or more input devices, such as a keyboard, mouse, camera, touch screen and a microphone, and/or with one or more output devices such as a display screen and a speaker.


Processor 104 executes instructions stored in memory 108 to implement aspects of processes described herein. For example, processor 104 may execute instructions in memory 108 to configure a data collection unit, interface unit (to provide control commands to interface application 130), machine learning network 110, feature extraction unit 112, select/reject engine 114, training engine 116 and other functions described herein. Processor 104 can be, for example, various types of general-purpose microprocessor or microcontroller, a digital signal processing (DSP) processor, an integrated circuit, a field programmable gate array (FPGA), a reconfigurable processor, or any combination thereof.


Communication interface 106 enables system 100 to communicate with other components, to exchange data with other components, to access and connect to network resources, to serve applications, and perform other computing applications by connecting to a network 140 (or multiple networks) capable of carrying data including the Internet, Ethernet, plain old telephone service (POTS) line, public switch telephone network (PSTN), integrated services digital network (ISDN), digital subscriber line (DSL), coaxial cable, fiber optics, satellite, mobile, wireless (e.g., Wi-Fi or WiMAX), SS7 signaling network, fixed line, local area network, wide area network, and others, including any combination of these.


Data storage 120 can include memory 108, databases 122, and persistent storage 124. Data storage 120 may be configured to store information associated with or created by the components in memory 108 and may also include machine executable instructions. Persistent storage 124 implements one or more of various types of storage technologies, such as solid state drives, hard disk drives, and flash memory; data may be stored in various formats, such as relational databases, non-relational databases, flat files, spreadsheets, extended markup files, etc.


Data storage 120 stores a model for a machine learning neural network. The neural network is used by system 100 to instantiate one or more automated agents 200 that each maintain a neural network 110 (which may also be referred to as a machine learning network 110 or a network 110 for convenience). Automated agents 200 may be referred to herein as machine learning agents, and each automated agent may be referred to herein as a machine learning agent.


Memory 108 may include a suitable combination of any type of computer memory that is located either internally or externally such as, for example, random-access memory (RAM), read-only memory (ROM), compact disc read-only memory (CDROM), electro-optical memory, magneto-optical memory, erasable programmable read-only memory (EPROM), and electrically-erasable programmable read-only memory (EEPROM), Ferroelectric RAM (FRAM) or the like.


System 100 may connect to an interface application 130 installed on a user device to receive input data. The interface unit 130 interacts with the system 100 to exchange data (including control commands) and generates visual elements for display at the user device. The visual elements can represent machine learning networks 110 and output generated by machine learning networks 110.


System 100 may be operable to register and authenticate users (using a login, unique identifier, and password for example) prior to providing access to applications, a local network, network resources, other networks and network security devices.


System 100 may connect to different data sources 160 and databases 170 to store and retrieve input data and output data.


Processor 104 is configured to execute machine executable instructions (which may be stored in memory 108) to instantiate an automated agent 200 that maintains a neural network 110, and to train neural network 110 of automated agent 200 using training engine 116. Training engine 116 may implement various machine learning algorithms, such as RNN, LSTM or GRU.


Processor 104 is configured to execute machine-executable instructions (which may be stored in memory 108) to train a neural network 110 using a loss function, as described in detail below. A trained neural network 110 may be provisioned to one or more automated agents 200.


In some embodiments, aspects of system 100 are further described with an example embodiment in which system 100 is configured to function as a trading platform. In such embodiments, automated agent 200 may generate requests to be performed in relation to securities, e.g., requests to trade, buy and/or sell securities.


Feature extraction unit 112 can be configured to process input data to compute a variety of features. The input data can represent a trade order. Example features include pricing features, volume features, time features, Volume Weighted Average Price features, and market spread features.


In some embodiments, system 100 may process trade orders using the machine learning network 110, which may be a reinforcement learning network 110, in response to requests from an automated agent 200.


Some embodiments can be configured to function as a trading platform. In such embodiments, an automated agent 200 may generate requests to be performed in relation to securities, e.g., requests to trade, buy and/or sell securities.


Example embodiments can provide users with visually rich, contextualized explanations of the behaviour of an automated agent 200, where such behaviour includes requests generated by automated agents 200, decisions made by automated agent 200, recommendations made by automated agent 200, or other actions taken by automated agent 200. Insights may be generated upon processing data reflective of, for example, market conditions, changes in policy of an automated agent 200, or data outputted by scorer 308 describing the relative importance of certain factors or certain state variables.


As depicted in FIG. 2A, automated agent 200 receives input data (via a data collection unit, not shown) and generates output data according to its machine learning network 110. Automated agents 200 may interact with system 100 to receive input data and provide output data.



FIG. 2B is a schematic diagram of an example neural network 110, in accordance with an embodiment. The example neural network 110 can include an input layer, a hidden layer, and an output layer. The neural network 110 processes input data using its layers based on machine learning, for example.



FIG. 5 is a flowchart showing aspects of an example process 500 for a range-based machine learning architecture, as performed by system 100 in some embodiments. The steps are provided for illustrative purposes. Variations of the steps, omission or substitution of various steps, or additional steps may be considered. It should be understood that one or more of the blocks may be performed in a different sequence or in an interleaved or iterative manner.


At 502, the processor 104 instantiates, loads, configures or otherwise generates a machine learning model. The machine learning model is configured to generate probability distribution data, and a select/reject value. The select/reject value may also be referred to as a selection value throughout this disclosure.


In some embodiments, the machine learning model 110 is a probabilistic forecasting model. In some embodiments, the machine learning model 110 generates a select/reject value associated with a prediction. In some configurations, a prediction with a select/reject value below a threshold can be rejected or otherwise not relied upon in downstream applications or as a usable output.


In some embodiments, the neural network model 110 can be configured and implemented based on the following.


Let 𝒟={Yi}i=1N denote a data set containing N different time series. The N different time series may contain data of the same type. For example, each of the N different time series may include, respectively, a time series data on historical electricity consumption data for a user or an area.


The temporal length of each sample in the N different time series is the same and denoted as T. Therefore, Yi∈ℝ1×T. By expressing Yi as





Yi=yi,1:T,  (1)


the sub-sequences of Yi can be easily located. For example, the data of the ith sample between t1 and t2 (t1<t2) can be represented as yi,t1:t2. The data point at a time step t is simplified as yi,t∈ℝ. Each sample Yi is further divided into two non-overlapping sub-series: the context series Yic∈ℝ1×Tc and the forecast sequence Yif∈ℝ1×Tf. Specifically,





Yi=[Yic,Yif],  (2)


where Yic=yi,1:Tc and Yif=yi,Tc+1:T with T=Tc+Tf.


Given the context Yic, time series forecasting aims to predict the values at future time steps and Yif serves as the ground-truth of such predictions. Instead of predicting the exact value at each time step of Yif, the neural network 110 may implement the probabilistic forecasting to estimate the likelihoods of the forecast values.
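As a non-limiting illustration of the notation above, the split of Eq. (2) can be expressed in a few lines of code. The following Python sketch is provided for illustration only; the sample values and the particular choices of Tc and Tf are assumptions.

    import numpy as np

    T_c, T_f = 168, 24                   # illustrative context and forecast lengths
    Y_i = np.random.randn(T_c + T_f)     # one sample Y_i = y_{i,1:T}, with T = T_c + T_f

    # Non-overlapping sub-series per Eq. (2):
    Y_i_context = Y_i[:T_c]              # Y_i^c = y_{i,1:T_c}
    Y_i_forecast = Y_i[T_c:]             # Y_i^f = y_{i,T_c+1:T}, the forecasting ground truth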


In some embodiments, the learning objective of probabilistic time series forecast can be expressed as a conditional likelihood,





p(Yif|Yic;Φ),  (3)


where Φ is a sequential model for neural network 110, e.g., a GRU, with learnable model parameters θ. p is the likelihood value function of a predefined distribution 𝒫, e.g., Gaussian or Laplacian. For the ith training sample, the distribution parameters, e.g., mean mi,t+1 and variance vi,t+1, of the prediction at t+1 are provided by Φ with the previous output mi,t as input.


The recurrent model Φ is thus implemented in an auto-regressive (AR) manner for prediction,






mi,t+1,vi,t+1,hi,t+1=Φ(mi,t,hi,t;θ),  (4)


t∈{1, . . . , Tf}. Initial mi,1 and hi,1 are computed by Φ with the context Yic as input.


As shown in FIG. 3, which illustrates an example recurrent neural network model 300 for probabilistic forecasting in AR, a selection value si,t may be associated with a respective probabilistic forecasting prediction at each time step or timestamp t. In some embodiments, the RNN model 300 may be an example of neural network 110.


Therefore, the conditional likelihood in Eq. (3) can be rewritten as,











(1/N) Σ_{i=1}^{N} Π_{t=1}^{T_f} 𝒫(y_{i,t}^f | m_{i,t}, v_{i,t}),  (5)







and the negative log likelihood (NLL) can be used as the loss function for probabilistic forecasting,










NLL(Y_i^f | Y_i^c; Φ) = (1/N) Σ_{i=1}^{N} [ Σ_{t=1}^{T_f} −log 𝒫(y_{i,t}^f | m_{i,t}, v_{i,t}) ] / T_f.  (6)







In some embodiments, the overall learning objective of a baseline model ℒbase is set as






ℒbase=NLL.  (7)


In some embodiments, neural network 110 may be updated by the select/reject engine 114, which can output a selection value si,t∈(0,1),






mi,t+1,vi,t+1,si,t+1,hi,t+1=Φ(mi,t,hi,t;θ),  (8)


as shown in FIG. 3.


hi,t represents a hidden state vector at timestamp t for the ith sample; and hi,t+1 represents a hidden state vector at timestamp t+1 for the ith sample.
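For illustration only, the recurrence of Eq. (8) might be realized with a GRU cell whose hidden state is projected to the distribution parameters and the selection value. The following Python sketch reflects assumed design choices (PyTorch, a softplus to keep the variance positive, a sigmoid to keep si,t in (0,1), and the previous mean fed back as input in auto-regressive style); it is not asserted to be the only or exact implementation of neural network 110.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SelectiveForecaster(nn.Module):
        """Sketch of Φ: maps (m_{i,t}, h_{i,t}) to (m_{i,t+1}, v_{i,t+1}, s_{i,t+1}, h_{i,t+1})."""

        def __init__(self, hidden_dim=64):
            super().__init__()
            self.cell = nn.GRUCell(input_size=1, hidden_size=hidden_dim)
            self.head = nn.Linear(hidden_dim, 3)  # raw mean, raw variance, raw selection

        def step(self, x_t, h_t):
            h_next = self.cell(x_t.unsqueeze(-1), h_t)
            raw_m, raw_v, raw_s = self.head(h_next).unbind(dim=-1)
            m_next = raw_m                        # mean of the forecast distribution
            v_next = F.softplus(raw_v) + 1e-6     # variance, kept strictly positive
            s_next = torch.sigmoid(raw_s)         # selection value in (0, 1)
            return m_next, v_next, s_next, h_next

        def forward(self, context, T_f):
            """Consume the context Y_i^c, then unroll auto-regressively for T_f steps."""
            h = torch.zeros(context.shape[0], self.cell.hidden_size)
            for t in range(context.shape[1]):     # encode the context to obtain m_{i,1}, h_{i,1}
                m, v, s, h = self.step(context[:, t], h)
            means, variances, selections = [], [], []
            for _ in range(T_f):                  # feed the previous mean back as the next input
                m, v, s, h = self.step(m, h)
                means.append(m)
                variances.append(v)
                selections.append(s)
            return torch.stack(means, 1), torch.stack(variances, 1), torch.stack(selections, 1)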


In some embodiments, a weight averaged NLL loss is defined as,










WNLL(Y_i^f | Y_i^c; Φ) = (1/N) Σ_{i=1}^{N} [ Σ_{t=1}^{T_f} −s_{i,t} log 𝒫(y_{i,t}^f | m_{i,t}, v_{i,t}) ] / [ Σ_{t=1}^{T_f} s_{i,t} ].  (9)







An adjustment can be made to reveal the relations between WNLL and NLL, where WNLL should be less than NLL because of the inclusion of the reject option implemented by the select/reject engine 114,





ADJ=max(0,WNLL(Yif|Yic;Φ)−NLL(Yif|Yic;Φ)+β),  (10)


which is a hinge loss with margin β≥0. Moreover, si,t is also bounded with the expected coverage rate τ,










CVG = | τ − (1/(N·T_f)) Σ_{i=1}^{N} Σ_{t=1}^{T_f} s_{i,t} |,  (11)







where CVG stands for coverage, and the coverage rate τ can be a user-specified value. For example, the coverage rate τ can be 90%.


In some embodiments, the overall objective of the neural network 110, denoted by ℒr, is given by






ℒr=NLL+α·ADJ+λ·CVG.  (12)
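As a concrete, non-limiting illustration, the loss terms of Eqs. (6) and (9) to (12) might be computed as in the following Python sketch. A Gaussian form of 𝒫 is assumed here for brevity (the disclosure also contemplates, e.g., a Laplacian), and the tensor names and default hyperparameters are illustrative.

    import torch
    from torch.distributions import Normal

    def selective_loss(y, m, v, s, tau=0.9, alpha=1.0, lam=1.0, beta=0.0):
        """y, m, v, s: tensors of shape (N, T_f) holding ground truths, means,
        variances and selection values over the forecast window."""
        log_p = Normal(m, v.sqrt()).log_prob(y)                  # log 𝒫(y_{i,t}^f | m_{i,t}, v_{i,t})
        nll = ((-log_p).sum(dim=1) / y.shape[1]).mean()          # Eq. (6)
        wnll = ((-s * log_p).sum(dim=1) / s.sum(dim=1)).mean()   # Eq. (9)
        adj = torch.clamp(wnll - nll + beta, min=0.0)            # Eq. (10), hinge with margin β
        cvg = (tau - s.mean()).abs()                             # Eq. (11), coverage penalty
        return nll + alpha * adj + lam * cvg                     # Eq. (12)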


In some embodiments, the neural network 110 can be configured to generate any prediction in which the output can be a range of values (e.g. probabilistic forecasting). In some embodiments, neural network 110 does not have to be specific to the time domain.


In some embodiments, neural network 110 can be applied to a regression problem. In some embodiments, the LSTM or GRU backbone can be substituted with another backbone model.


At 504, the processor 104 is configured to train the neural network 110 with input data as described below in connection with FIG. 6. The input data can include a plurality of time series data values.


At 506, the processor 104 is configured to store a data set including data and/or instructions representing the trained neural network 110 and/or any other information required to instantiate, load, configure or otherwise regenerate the trained neural network 110.


At 508, the processor 104 can input query data to the trained neural network 110 to generate a forecast prediction and an associated selection value. In some embodiments, this can include receiving a request for outputs for one or more inputs (e.g. times). In some embodiments, this can include automatically generating outputs for all possible or defined times.


At 510, the processor 104 is configured to generate signals for communicating an output based on the forecast prediction and the associated selection value. In some embodiments, the signals for communicating the output can cause the output to be displayed on a display. In some embodiments, the signals for communicating the output can send the output to and/or trigger a subsequent data process. In some embodiments, the signals for communicating the output can include a message to be sent to a recipient device or destination.


In some embodiments, at inference time, predictions with a selection value below a threshold (e.g., si,t<0.5) can be rejected. The threshold may be predetermined by a user, and modified by a user. In some embodiments, the threshold may be determined based on a sigmoid function.
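A minimal sketch of this reject rule at inference time is shown below; the numeric values and the 0.5 threshold are illustrative assumptions, with the outputs standing in for those produced by the trained neural network 110.

    import torch

    # Illustrative outputs; in practice these come from the trained network 110.
    m = torch.tensor([12.4, 13.1, 11.8])   # predicted means at three future timestamps
    v = torch.tensor([0.5, 0.7, 4.2])      # predicted variances
    s = torch.tensor([0.91, 0.62, 0.18])   # selection values in (0, 1)

    threshold = 0.5                        # illustrative; modifiable via the GUI element
    accepted = s >= threshold              # select/reject mask
    valid_predictions = [(m[i].item(), v[i].item()) for i in range(len(s)) if accepted[i]]
    rejected_steps = (~accepted).nonzero(as_tuple=True)[0]  # indices flagged as rejected in the GUI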


In some embodiments, both the prediction and the selection value can be communicated. In some embodiments, the prediction can be outputted only when the select/reject value indicates the prediction is selected (not rejected).


In some embodiments, the output can be NULL or another value which indicates that the select/reject value indicates the prediction is rejected.


Example process 500 may be implemented as software and/or hardware, for example, in a computing device 400 as illustrated in FIG. 4. Process 500, in particular one or more of blocks 502 to 510, may be performed by software and/or hardware of a computing device such as computing device 400.



FIG. 4 is a high-level block diagram of computing device 400. Computing device 400, under software control, may monitor a machine learning model.


As illustrated, computing device 400 includes one or more processor(s) 1010, memory 1020, a network controller 1030, and one or more I/O interfaces 1040 in communication over bus 1050.


Processor(s) 1010 may be one or more Intel x86, Intel x64, AMD x86-64, PowerPC, ARM processors or the like.


Memory 1020 may include random-access memory, read-only memory, or persistent storage such as a hard disk, a solid-state drive or the like. Read-only memory or persistent storage is a computer-readable medium. A computer-readable medium may be organized using a file system, controlled and administered by an operating system governing overall operation of the computing device.


Network controller 1030 serves as a communication device to interconnect the computing device with one or more computer networks such as, for example, a local area network (LAN) or the Internet.


One or more I/O interfaces 1040 may serve to interconnect the computing device with peripheral devices, such as for example, keyboards, mice, video displays, and the like. Such peripheral devices may include a display of device 120. Optionally, network controller 1030 may be accessed via the one or more I/O interfaces.


Software instructions are executed by processor(s) 1010 from a computer-readable medium. For example, software may be loaded into random-access memory from persistent storage of memory 1020 or from one or more devices via I/O interfaces 1040 for execution by one or more processors 1010. As another example, software may be loaded and executed by one or more processors 1010 directly from read-only memory.


Example software components and data stored within memory 1020 of computing device 400 may include software to instantiate, train and/or utilize a machine learning architecture, as disclosed herein, and operating system (OS) software allowing for basic communication and application operations related to computing device 120.



FIG. 6 is a flowchart showing aspects of an example process 600 for training a neural network 110 for probabilistic forecasting, which may be performed by system 100.


At step 602, the processor 104 stores or maintains a data set representing the neural network 110 having a plurality of weights. A weight is a parameter within a neural network that transforms input data within the network's hidden layers. Weights are learnable parameters inside the neural network 110. During an initialization stage, prior to training, the processor 104 may instantiate a neural network 110 and randomize its weight and bias values.


At step 604, the processor 104 receives input data comprising a plurality of time series data sets ending with timestamp t−1. For example, the plurality of time series data sets may include N time series data sets and be denoted by 𝒟={Yi}i=1N. The N different time series may contain data of the same type. For example, each of the N different time series may include, respectively, a time series data on historical electricity consumption data for a user or an area.


The temporal length of each sample in the N different time series is the same and denoted as T. Therefore, Yi∈ℝ1×T. By expressing Yi as





Yi=yi,1:T,


the sub-sequences of Yi can be easily located. For example, the data of the ith sample between t1 and t2 (t1<t2) can be represented as yi,t1:t2. The data point at a time step t is simplified as yi,t∈ℝ. Each sample Yi is further divided into two non-overlapping sub-series: the context series Yic∈ℝ1×Tc and the forecast sequence Yif∈ℝ1×Tf. Specifically,





Yi=[Yic,Yif],


where Yic=yi,1:Tc and Yif=yi,Tc+1:T with T=Tc+Tf.


Given the context Yic, time series forecasting aims to predict the values at future time steps and Yif serves as the ground-truth of such predictions. Instead of predicting the exact value at each time step of Yif, the neural network 110 may implement the probabilistic forecasting to estimate the likelihoods of the forecast values at a given time stamp, such as at time t, based on input data ending with timestamp t−1.


At step 606, the processor 104 generates, using the neural network 110 and based on the input data, a probabilistic forecast distribution prediction at timestamp t and a selection value associated with the probabilistic forecast distribution prediction at timestamp t, as described above.


In some embodiments, the probabilistic forecast distribution prediction at timestamp t may include a mean and a variance of the probabilistic forecast distribution prediction.


In some embodiments, the neural network 110 may be a recurrent neural network (RNN) represented by Φ based on:






mi,t+1,vi,t+1,si,t+1,hi,t+1=Φ(mi,t,hi,t;θ), where:


mi,t+1 represents a mean value of the probabilistic forecast distribution prediction at timestamp t+1 for the ith sample; vi,t+1 represents a variance vi,t+1 of the probabilistic forecast distribution prediction at timestamp t+1 for the ith sample; si,t+1 represents the selection value associated with the probabilistic forecast distribution prediction at timestamp t+1 for the ith sample; mi,t represents a mean value of a probabilistic forecast distribution prediction at timestamp t for the ith sample; θ represents one or more learnable model parameters (e.g., weights and/or bias) for the recurrent neural network; hi,t represents a hidden state vector at timestamp t for the ith sample; and hi,t+1 represents a hidden state vector at timestamp t+1 for the ith sample.


At step 608, the processor 104 computes a loss function based on the selection value. For example, the loss function may be denoted by ℒr=NLL+α·ADJ+λ·CVG, where:








NLL(Y_i^f | Y_i^c; Φ) = (1/N) Σ_{i=1}^{N} [ Σ_{t=1}^{T_f} −log 𝒫(y_{i,t}^f | m_{i,t}, v_{i,t}) ] / T_f,

WNLL(Y_i^f | Y_i^c; Φ) = (1/N) Σ_{i=1}^{N} [ Σ_{t=1}^{T_f} −s_{i,t} log 𝒫(y_{i,t}^f | m_{i,t}, v_{i,t}) ] / [ Σ_{t=1}^{T_f} s_{i,t} ],

ADJ = max(0, WNLL(Y_i^f | Y_i^c; Φ) − NLL(Y_i^f | Y_i^c; Φ) + β), and

CVG = | τ − (1/(N·T_f)) Σ_{i=1}^{N} Σ_{t=1}^{T_f} s_{i,t} |.






At step 610, the processor 104 updates at least one of the plurality of weights of the neural network 110 based on the loss function. In some embodiments, the loss value determined based on the loss function may be backpropagated through time in order to update the weights of the neural network 110. The neural network 110 is therefore trained by calculating errors from its output layer to its input layer.


After step 610, the processor 104 may proceed to step 606 again, using the most recent output from the neural network 110 from the previous training epoch as input for the neural network 110, as shown in FIG. 3. This may continue until a predetermined number of training epochs has finished, until the training input data has been exhausted, or until the loss value determined by the loss function has reached a certain threshold. In some embodiments, within one epoch, the loss for each time series in the training dataset is computed once, i.e., i is cycled from 1 to N. One possible training loop is sketched below.
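The following Python sketch covers blocks 602 to 610, reusing the SelectiveForecaster and selective_loss sketches given earlier. The data, batch shapes, optimizer choice and epoch count are illustrative assumptions (the optimizer and learning rate happen to match the example configuration described in the experiments below).

    import torch

    # Toy data standing in for the electricity time series (assumed shapes).
    contexts = torch.randn(32, 168)   # N = 32 samples, T_c = 168 context steps
    targets = torch.randn(32, 24)     # T_f = 24 forecast steps
    loader = [(contexts, targets)]    # a single illustrative batch

    model = SelectiveForecaster(hidden_dim=64)                    # step 602: weights initialized
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

    for epoch in range(10):
        for context, target in loader:                            # step 604: series ending at t-1
            m, v, s = model(context, T_f=target.shape[1])         # step 606: forecast + selection value
            loss = selective_loss(target, m, v, s)                # step 608: loss based on selection value
            optimizer.zero_grad()
            loss.backward()                                       # backpropagation through time
            optimizer.step()                                      # step 610: update the weights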


In some embodiments, when the selection value is higher than or equal to a threshold value, the processor 104 stores the probabilistic forecast distribution prediction at timestamp t as a valid prediction.


In some embodiments, the processor 104 is configured to process the stored probabilistic forecast distribution prediction at timestamp t to generate a predicted electricity consumption report that may be used to generate pricing prediction for electricity.


In some embodiments, the processor 104 is configured to process the stored probabilistic forecast distribution prediction at timestamp t to generate a future financial forecasting statement.


In some embodiments, when the selection value is lower than a threshold value, the processor 104 is configured to reject the probabilistic forecast distribution prediction at timestamp t.


In some embodiments, the processor 104 is configured to generate a signal for causing, at a display device, a display of a graphical user interface showing that the probabilistic forecast distribution prediction at timestamp t has been rejected.


In some embodiments, the processor 104 is configured to generate a second signal for causing, at the display device, a display of a graphical user interface showing the threshold value and optionally a graphical user element for modifying the threshold value. The graphical user element may be, for example, a user field configured to receive a user input as an updated value for the threshold.


Practical Application and Experimental Data

An electricity dataset (https://github.com/laiguokun/multivariate-time-series-data) contains the hourly-based electricity consumption of 321 different users over three consecutive years (from 2012 to 2014). In one test, the time series data is split into three portions according to the years and a three-fold cross-validation is applied. Two years' data is used to train and validate, and the remaining year is held out for testing. 25% of the training samples are held out as a validation set. The time series data are standardized with the train-split statistics of the corresponding users. Each data sample is a time series of a user and consists of data points spanning 8 consecutive days. It is further divided into two parts, as in Eq. (2). The sub-sequence of the first 7 days (168 hours/data points) is the context sequence, while the remaining 1 day (24 hours/data points) is the prediction sequence. The prediction sequence is the ground truth for time series forecasting.
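The per-user standardization and windowing described above might be realized as in the following sketch; the random array stands in for a user's hourly consumption, and the non-overlapping windowing is a simplifying assumption.

    import numpy as np

    hourly = np.random.rand(3 * 365 * 24)               # stand-in for one user's hourly consumption
    train, test = hourly[:2 * 365 * 24], hourly[2 * 365 * 24:]

    mean, std = train.mean(), train.std()               # train-split statistics for this user
    train, test = (train - mean) / std, (test - mean) / std

    window = 8 * 24                                     # each sample spans 8 consecutive days
    samples = [test[i:i + window] for i in range(0, len(test) - window + 1, window)]
    # In each sample, the first 168 points form the context and the last 24 the prediction sequence.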


In some embodiments, an example baseline model is trained with the NLL loss only. Selection may be performed as a post-processing rule by applying a threshold on the predicted variance vi,t. Given an expected coverage rate τ∈(0,1], the global threshold on variance is chosen so that the largest 1−τ fraction of all predicted variances vi,t in the validation set fall above it. As an alternative, user-specific thresholds may be determined based on the validation set.
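One reading of this post-processing rule is sketched below: the global threshold is the τ-quantile of the validation-set variances, so that the largest 1−τ fraction of predictions exceed it and are rejected. The quantile interpretation and the random stand-in values are assumptions.

    import numpy as np

    tau = 0.9
    val_variances = np.random.rand(1000)             # stand-in for predicted v_{i,t} on the validation set
    threshold = np.quantile(val_variances, tau)      # the largest 1 - tau of variances fall above this value

    test_variances = np.random.rand(200)             # stand-in for predicted variances at test time
    selected = test_variances <= threshold           # predictions kept by the post-processing rule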


Moreover, the baseline model is tested under an optimistic “oracle” selection rule. In some example embodiments, the τ-confidence interval is computed based on the distribution parameters mi,t, vi,t. The prediction is then selected when the ground truth falls within the confidence interval, and rejected otherwise. This provides an upper bound on performance for the baseline model, as it assumes perfect or near-perfect knowledge of the ground truth.
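A sketch of this oracle rule under a Laplacian 𝒫 is shown below. Converting the predicted variance to the Laplace scale via b = sqrt(v/2) is an assumption about the parameterization, and the numeric values are illustrative.

    import numpy as np
    from scipy.stats import laplace

    tau = 0.9
    m, v, y_true = 10.0, 2.0, 11.3                      # illustrative mean, variance and ground truth
    b = np.sqrt(v / 2.0)                                # Laplace scale from variance (assumed parameterization)
    low, high = laplace.interval(tau, loc=m, scale=b)   # central tau-confidence interval
    selected = low <= y_true <= high                    # oracle: select only if the ground truth is covered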


In some embodiments, a gated recurrent unit (GRU) network is used as the sequential model with one hidden layer. Its hidden dimension is set to 64. The pre-defined distribution 𝒫 is Laplacian. During training, for the proposed loss ℒr (Eq. (12)), α=λ=1.0. The expected coverage rate τ=0.9 for the CVG loss. The learning rate is set to 0.001 with the ADAM optimizer, used for 10 epochs of training. During evaluation and testing, the confidence rate is also set to 0.9 for the learned baseline model.


Results

The testing results on electricity dataset are in Table 1.









TABLE 1

Testing results on electricity dataset.

Model                            NLL     R-NLL    Coverage
ℒbase with oracle selection      0.25    0.06     0.93
ℒbase with user thresholds       0.25    0.17     0.91
ℒbase with global threshold      0.25    0.13     0.93
ℒr                               0.19    0.08     0.89










NLL means the negative log likelihood value on all testing predictions and ground truths. R-NLL is the NLL loss on the remaining data after the reject process. Coverage is the coverage rate of non-rejected forecast predictions.


In the above example scenario, the model trained with the ℒr objective described herein performed better than the ℒbase approaches, as a lower NLL indicates better performance.


The methods and models described herein can be used for other probabilistic forecasting applications. For example, example embodiments could be applied to financial data sets to generate a range of predicted amounts of money that will be spent by a customer in a given month (with the associated selection values). In some applications, embodiments can be used to predict cash flows (e.g., incoming, outgoing), amounts of individual bills/transactions, financial summaries, micro-level transactions/cash flows, and/or other budgeting or financially-related predictions.


In some situations, in contrast to post-processing approaches in which previous models did not know they had a reject option, aspects of the present application can provide a neural network model which can reject (or learn to reject during training). In some situations, this may improve training as time and resources may not have to be spent trying to fit hard examples.


It should be understood that steps of one or more of the blocks depicted in FIG. 6 may be performed in a different sequence or in an interleaved or iterative manner. Further, variations of the steps, omission or substitution of various steps, or additional steps may be considered.


The foregoing discussion provides many example embodiments of the inventive subject matter. Although each embodiment represents a single combination of inventive elements, the inventive subject matter is considered to include all possible combinations of the disclosed elements. Thus if one embodiment comprises elements A, B, and C, and a second embodiment comprises elements B and D, then the inventive subject matter is also considered to include other remaining combinations of A, B, C, or D, even if not explicitly disclosed.


The embodiments of the devices, systems and methods described herein may be implemented in a combination of both hardware and software. These embodiments may be implemented on programmable computers, each computer including at least one processor, a data storage system (including volatile memory or non-volatile memory or other data storage elements or a combination thereof), and at least one communication interface.


Program code is applied to input data to perform the functions described herein and to generate output information. The output information is applied to one or more output devices. In some embodiments, the communication interface may be a network communication interface. In embodiments in which elements may be combined, the communication interface may be a software communication interface, such as those for inter-process communication. In still other embodiments, there may be a combination of communication interfaces implemented as hardware, software, and combination thereof.


Throughout the foregoing discussion, numerous references will be made regarding servers, services, interfaces, portals, platforms, or other systems formed from computing devices. It should be appreciated that the use of such terms is deemed to represent one or more computing devices having at least one processor configured to execute software instructions stored on a computer readable tangible, non-transitory medium. For example, a server can include one or more computers operating as a web server, database server, or other type of computer server in a manner to fulfill described roles, responsibilities, or functions.


The technical solution of embodiments may be in the form of a software product. The software product may be stored in a non-volatile or non-transitory storage medium, which can be a compact disk read-only memory (CD-ROM), a USB flash disk, or a removable hard disk. The software product includes a number of instructions that enable a computer device (personal computer, server, or network device) to execute the methods provided by the embodiments.


The embodiments described herein are implemented by physical computer hardware, including computing devices, servers, receivers, transmitters, processors, memory, displays, and networks. The embodiments described herein provide useful physical machines and particularly configured computer hardware arrangements.


The embodiments and examples described herein are illustrative and non-limiting. Practical implementation of the features may incorporate a combination of some or all of the aspects, and features described herein should not be taken as indications of future or existing product plans. Applicant partakes in both foundational and applied research, and in some cases, the features described are developed on an exploratory basis.


Of course, the above-described embodiments are intended to be illustrative only and in no way limiting. The described embodiments are susceptible to many modifications of form, arrangement of parts, details, and order of operation. The disclosure is intended to encompass all such modifications within its scope, as defined by the claims.

Claims
  • 1. A computer-implemented system for training a neural network for probabilistic forecasting, the system comprising: at least one processor; memory in communication with the at least one processor; instructions stored in the memory, which when executed at the at least one processor causes the system to: maintain a data set representing a neural network having a plurality of weights; receive input data comprising a plurality of time series data sets ending with timestamp t−1; generate, using the neural network and based on the input data, a probabilistic forecast distribution prediction at timestamp t and a selection value associated with the probabilistic forecast distribution prediction at timestamp t; compute a loss function based on the selection value; and update at least one of the plurality of weights of the neural network based on the loss function.
  • 2. The system of claim 1, wherein the probabilistic forecast distribution prediction at timestamp t comprises a mean and a variance of the probabilistic forecast distribution prediction.
  • 3. The system of claim 1, wherein the instructions when executed at the at least one processor causes the system to: when the selection value is higher than or equal to a threshold value, store the probabilistic forecast distribution prediction at timestamp t as a valid prediction.
  • 4. The system of claim 3, wherein the instructions when executed at the at least one processor causes the system to: process the stored probabilistic forecast distribution prediction at timestamp t to generate a predicted electricity consumption report.
  • 5. The system of claim 3, wherein the instructions when executed at the at least one processor causes the system to: process the stored probabilistic forecast distribution prediction at timestamp t to generate a future financial forecasting statement.
  • 6. The system of claim 1, wherein the instructions when executed at the at least one processor causes the system to: when the selection value is lower than a threshold value, reject the probabilistic forecast distribution prediction at timestamp t.
  • 7. The system of claim 6, wherein the instructions when executed at the at least one processor causes the system to: generate a signal for causing, at a display device, a display of a graphical user interface showing that the probabilistic forecast distribution prediction at timestamp t has been rejected.
  • 8. The system of claim 7, wherein the instructions when executed at the at least one processor causes the system to: generate a second signal for causing, at the display device, a display of a graphical user interface showing the threshold value.
  • 9. The system of claim 8, wherein the instructions when executed at the at least one processor causes the system to: generate a third signal for causing, at the display device, a display of a graphical user interface showing a graphical user element for modifying the threshold value.
  • 10. The system of claim 1, wherein the neural network comprises a recurrent neural network (RNN) represented by Φ based on: m_{i,t+1}, v_{i,t+1}, s_{i,t+1}, h_{i,t+1} = Φ(m_{i,t}, h_{i,t}; θ), wherein: m_{i,t+1} represents a mean value of the probabilistic forecast distribution prediction at timestamp t+1 for the ith sample; v_{i,t+1} represents a variance of the probabilistic forecast distribution prediction at timestamp t+1 for the ith sample; s_{i,t+1} represents the selection value associated with the probabilistic forecast distribution prediction at timestamp t+1 for the ith sample; m_{i,t} represents a mean value of a probabilistic forecast distribution prediction at timestamp t for the ith sample; θ represents one or more learnable model parameters for the recurrent neural network; h_{i,t} represents a hidden state vector at timestamp t for the ith sample; and h_{i,t+1} represents a hidden state vector at timestamp t+1 for the ith sample.
  • 11. A computer-implemented method for training a neural network for probabilistic forecasting, the method comprising: maintaining a data set representing a neural network having a plurality of weights; receiving input data comprising a plurality of time series data sets ending with timestamp t−1; generating, using the neural network and based on the input data, a probabilistic forecast distribution prediction at timestamp t and a selection value associated with the probabilistic forecast distribution prediction at timestamp t; computing a loss function based on the selection value; and updating at least one of the plurality of weights of the neural network based on the loss function.
  • 12. The method of claim 11, wherein the probabilistic forecast distribution prediction at timestamp t comprises a mean and a variance of the probabilistic forecast distribution prediction.
  • 13. The method of claim 11, further comprising: when the selection value is higher than or equal to a threshold value, storing the probabilistic forecast distribution prediction at timestamp t as a valid prediction.
  • 14. The method of claim 13, further comprising: processing the stored probabilistic forecast distribution prediction at timestamp t to generate a predicted electricity consumption report.
  • 15. The method of claim 13, further comprising: processing the stored probabilistic forecast distribution prediction at timestamp t to generate a future financial forecasting statement.
  • 16. The method of claim 11, further comprising: when the selection value is lower than a threshold value, rejecting the probabilistic forecast distribution prediction at timestamp t.
  • 17. The method of claim 16, further comprising: generating a signal for causing, at a display device, a display of a graphical user interface showing that the probabilistic forecast distribution prediction at timestamp t has been rejected.
  • 18. The method of claim 17, further comprising: generating a second signal for causing, at the display device, a display of a graphical user interface showing the threshold value and a graphical user element for modifying the threshold value.
  • 19. The method of claim 11, wherein the neural network comprises a recurrent neural network (RNN) represented by Φ based on: m_{i,t+1}, v_{i,t+1}, s_{i,t+1}, h_{i,t+1} = Φ(m_{i,t}, h_{i,t}; θ), wherein: m_{i,t+1} represents a mean value of the probabilistic forecast distribution prediction at timestamp t+1 for the ith sample; v_{i,t+1} represents a variance of the probabilistic forecast distribution prediction at timestamp t+1 for the ith sample; s_{i,t+1} represents the selection value associated with the probabilistic forecast distribution prediction at timestamp t+1 for the ith sample; m_{i,t} represents a mean value of a probabilistic forecast distribution prediction at timestamp t for the ith sample; θ represents one or more learnable model parameters for the recurrent neural network; h_{i,t} represents a hidden state vector at timestamp t for the ith sample; and h_{i,t+1} represents a hidden state vector at timestamp t+1 for the ith sample.
  • 20. A non-transitory computer readable memory having stored thereon a data set representing a neural network and instructions for training the neural network, the instructions, when executed at the at least one processor, causes a system having the at least one processor to: maintain the data set representing the neural network having a plurality of weights; receive input data comprising a plurality of time series data sets ending with timestamp t−1; generate, using the neural network and based on the input data, a probabilistic forecast distribution prediction at timestamp t and a selection value associated with the probabilistic forecast distribution prediction at timestamp t; compute a loss function based on the selection value; and update at least one of the plurality of weights of the neural network based on the loss function.
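For illustration only, the following is a minimal, non-limiting sketch of how a recurrent network of the general form recited in claims 10 and 19 might be trained with a selection-dependent loss as recited in claims 1, 11, and 20. The sketch assumes PyTorch, a GRU-based recurrence, a Gaussian forecast distribution parameterized by the predicted mean and variance, and a SelectiveNet-style loss in which the Gaussian negative log-likelihood is weighted by the selection value and regularized toward a target coverage. The names ForecastRNN and training_step, the choice of loss, the coverage target, and the penalty weight are illustrative assumptions and are not prescribed by the disclosure.

```python
# Illustrative sketch only; not the claimed implementation.
# Assumptions: PyTorch, a GRU backbone, a Gaussian forecast distribution, and a
# SelectiveNet-style selection-weighted loss with a coverage penalty (the claims
# only recite "a loss function based on the selection value").
import torch
import torch.nn as nn


class ForecastRNN(nn.Module):
    """Recurrent model Phi: (m_t, h_t) -> (m_{t+1}, v_{t+1}, s_{t+1}, h_{t+1})."""

    def __init__(self, hidden_size: int = 64):
        super().__init__()
        self.cell = nn.GRUCell(input_size=1, hidden_size=hidden_size)
        self.mean_head = nn.Linear(hidden_size, 1)     # mean m_{i,t+1}
        self.var_head = nn.Linear(hidden_size, 1)      # variance v_{i,t+1} (softplus)
        self.select_head = nn.Linear(hidden_size, 1)   # selection value s_{i,t+1} (sigmoid)

    def forward(self, m_t: torch.Tensor, h_t: torch.Tensor):
        h_next = self.cell(m_t.unsqueeze(-1), h_t)
        m_next = self.mean_head(h_next).squeeze(-1)
        v_next = nn.functional.softplus(self.var_head(h_next)).squeeze(-1) + 1e-6
        s_next = torch.sigmoid(self.select_head(h_next)).squeeze(-1)
        return m_next, v_next, s_next, h_next


def training_step(model, optimizer, series, target_coverage=0.9, penalty=32.0):
    """One weight update on a batch of series (shape: batch x T); the last
    column is treated as the value to forecast at timestamp t."""
    history, target = series[:, :-1], series[:, -1]
    h = torch.zeros(series.shape[0], model.cell.hidden_size)

    # Roll the recurrence over the observed history, then predict one step ahead.
    for step in range(1, history.shape[1]):
        _, _, _, h = model(history[:, step - 1], h)
    m_pred, v_pred, s_pred, _ = model(history[:, -1], h)

    # Selection-weighted Gaussian negative log-likelihood (selective risk), plus a
    # quadratic penalty that keeps the average selection value near the target
    # coverage so the model cannot trivially reject every prediction.
    nll = 0.5 * (torch.log(v_pred) + (target - m_pred) ** 2 / v_pred)
    selective_risk = (s_pred * nll).sum() / s_pred.sum().clamp(min=1e-6)
    coverage_penalty = penalty * torch.relu(target_coverage - s_pred.mean()) ** 2
    loss = selective_risk + coverage_penalty

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Example usage (hypothetical data): one gradient update on 8 random series of length 24.
# model = ForecastRNN()
# opt = torch.optim.Adam(model.parameters(), lr=1e-3)
# training_step(model, opt, torch.randn(8, 24))
```

At inference time, a forecast whose selection value falls below a chosen threshold would be rejected rather than stored as a valid prediction, consistent with claims 3, 6, 13, and 16.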
CROSS REFERENCE TO RELATED APPLICATION

This patent application claims priority to and benefit of U.S. provisional patent application No. 63/171,862 filed on Apr. 7, 2021, the entire content of which is herein incorporated by reference.
