The disclosure relates to a method of load forecasting and an apparatus for the same, and more particularly, to a method of short-term load forecasting via active deep multi-task learning and an apparatus for the same.
Electric load forecasting is essential for the secure and economic operation of the power grid. Depending on the forecasting horizon, electric load forecasting ranges from short-term (hours or minutes ahead) to long-term (years ahead). Short-term load forecasting (STLF) is mainly used to assist real-time energy dispatching, while long-term load forecasting is mainly applied to power grid infrastructure planning. Accurate short-term electric load forecasting can facilitate efficient residential energy management and power grid operation. Because electricity is difficult to store in large quantities, and in view of the safety requirements of power systems, it is of critical importance to keep power generation as close to the actual power demand as possible. There is also a significant financial incentive for accurate power demand estimation. It is estimated that even a 1% increase in forecasting error could increase the operating cost of the UK power grid by more than £10 million.
The modern power grid is facing fundamental changes and, as a result, is evolving into an increasingly sustainable system. The use of renewable energy generation, including wind and solar power generation, has increased exponentially over the last 10 years. The output level of renewable energy sources can be quite intermittent and is highly influenced by weather conditions. Besides uncertainties in power generation, there are increasing uncertainties on the demand side caused by electric vehicles (EVs) and the use of other high-demand electric appliances. The adoption of EVs has been growing very quickly over the last few years; annual EV sales increased by 79% in Canada and 81% in the US in 2018. EV charging demand is highly affected by the driving behaviors of individuals. Due to these factors, accurate short-term residential load forecasting is becoming more and more challenging.
According to an aspect of the disclosure, a method of load forecasting using multi-task deep learning may include inputting, into a first cluster model, present environmental data and present calendar data corresponding to first commodity consuming objects of a first cluster, among a plurality of commodity consuming objects corresponding to a plurality of clusters; and predicting, based on an output of the first cluster model, a future commodity consumption for each of the first commodity consuming objects of the first cluster. The first cluster model may be trained based on first reference commodity consumption data, first reference environmental data, and first reference calendar data, with regard to the first cluster corresponding to the first commodity consuming objects. The plurality of commodity consuming objects may be clustered into the plurality of clusters based on reference commodity consumption data, reference environmental data, and reference calendar data that are obtained over a period of time. The first cluster model may include a first multi-task learning process having a first joint loss function, and each input of the first multi-task learning process may correspond to a respective commodity consuming object of the first cluster.
According to another aspect of the disclosure, an apparatus for forecasting load using multi-task deep learning may include at least one memory storing instructions; and at least one processor configured to execute the instructions to: input, into a first cluster model, present environmental data and present calendar data corresponding to first commodity consuming objects of a first cluster, among a plurality of commodity consuming objects corresponding to a plurality of clusters; and predict, based on an output of the first cluster model, a future commodity consumption for each of the first commodity consuming objects of the first cluster. The first cluster model may be trained based on first reference commodity consumption data, first reference environmental data, and first reference calendar data, with regard to the first cluster corresponding to the first commodity consuming objects. The plurality of commodity consuming objects may be clustered into the plurality of clusters based on reference commodity consumption data, reference environmental data, and reference calendar data that are obtained over a period of time. The first cluster model may include a first multi-task learning process having a first joint loss function, and each input of the first multi-task learning process may correspond to a respective commodity consuming object of the first cluster.
According to another aspect of the disclosure, a non-transitory computer-readable medium may store instructions including one or more instructions that, when executed by one or more processors, cause the one or more processors to: input, into a first cluster model, present environmental data and present calendar data corresponding to first commodity consuming objects of a first cluster, among a plurality of commodity consuming objects corresponding to a plurality of clusters; and predict, based on an output of the first cluster model, a future commodity consumption for each of the first commodity consuming objects of the first cluster. The first cluster model may be trained based on first reference commodity consumption data, first reference environmental data, and first reference calendar data, with regard to the first cluster corresponding to the first commodity consuming objects. The plurality of commodity consuming objects may be clustered into the plurality of clusters based on reference commodity consumption data, reference environmental data, and reference calendar data that are obtained over a period of time. The first cluster model may include a first multi-task learning process having a first joint loss function, and each input of the first multi-task learning process may correspond to a respective commodity consuming object of the first cluster.
According to another aspect of the disclosure, a method of load forecasting using multi-task deep learning may include obtaining reference commodity consumption data, reference environmental data, and reference calendar data for a plurality of commodity consuming objects over a period of time; clustering the plurality of commodity consuming objects into a plurality of clusters based on the obtained reference commodity consumption data, the plurality of clusters comprising a first cluster and a second cluster; obtaining a first cluster model based on: first reference commodity consumption data, among the obtained reference commodity consumption data, corresponding to first commodity consuming objects of the first cluster; first reference environmental data, among the obtained reference environmental data, corresponding to the first commodity consuming objects of the first cluster; and first reference calendar data, among the obtained reference calendar data, corresponding to the first commodity consuming objects of the first cluster; inputting, into the first cluster model, present environmental data and present calendar data corresponding to the first commodity consuming objects of the first cluster; and predicting, based on an output of the first cluster model, a future commodity consumption for each of the first commodity consuming objects of the first cluster. The first cluster model may include a first multi-task learning process having a first joint loss function, and each input of the first multi-task learning process may correspond to a respective commodity consuming object of the first cluster.
Additional aspects will be set forth in part in the description that follows and, in part, will be apparent from the description, or may be learned by practice of the presented embodiments of the disclosure.
The above and other aspects, features, and advantages of embodiments of the disclosure will be more apparent from the following description taken in conjunction with the accompanying drawings, in which:
The following detailed description of example embodiments refers to the accompanying drawings. The same reference numbers in different drawings may identify the same or similar elements.
The foregoing disclosure provides illustration and description, but is not intended to be exhaustive or to limit the implementations to the precise form disclosed. Modifications and variations are possible in light of the above disclosure or may be acquired from practice of the implementations.
As used herein, the term “component” is intended to be broadly construed as hardware, firmware, or a combination of hardware and software.
It will be apparent that systems and/or methods, described herein, may be implemented in different forms of hardware, firmware, or a combination of hardware and software. The actual specialized control hardware or software code used to implement these systems and/or methods is not limiting of the implementations. Thus, the operation and behavior of the systems and/or methods were described herein without reference to specific software code—it being understood that software and hardware may be designed to implement the systems and/or methods based on the description herein.
Even though particular combinations of features are recited in the claims and/or disclosed in the specification, these combinations are not intended to limit the disclosure of possible implementations. In fact, many of these features may be combined in ways not specifically recited in the claims and/or disclosed in the specification. Although each dependent claim listed below may directly depend on only one claim, the disclosure of possible implementations includes each dependent claim in combination with every other claim in the claim set.
No element, act, or instruction used herein should be construed as critical or essential unless explicitly described as such. Also, as used herein, the articles “a” and “an” are intended to include one or more items, and may be used interchangeably with “one or more.” Furthermore, as used herein, the term “set” is intended to include one or more items (e.g., related items, unrelated items, a combination of related and unrelated items, etc.), and may be used interchangeably with “one or more.” Where only one item is intended, the term “one” or similar language is used. Also, as used herein, the terms “has,” “have,” “having,” or the like are intended to be open-ended terms. Further, the phrase “based on” is intended to mean “based, at least in part, on” unless explicitly stated otherwise.
As shown in
Once the clustering is complete, the houses 1-N are grouped into clusters C1-Cm.
The cluster-specific forecasting model may be trained using historical time sequence data corresponding to house 1 through house P (houses within the cluster). For example, the historical time sequence data (reference data) may include electric load consumption data, temperature data, weather data, and the day of the week (e.g., weekday or weekend) corresponding to the houses. The reference data is not limited to the above examples, and may include other types of data that may be indicative of future electric load.
As shown in
By clustering the commodity consuming objects (houses) and forecasting the electric loads of the houses based on cluster-specific MTL-LSTM models, the accuracy of the forecast may be increased.
To perform short-term load forecasting for the houses in the cluster corresponding to a forecasting model, predictive information may be input into the input layer of the MTL-LSTM model. For example, present electric load consumption data, present temperature data, present weather data, present time, and the present day of the week may be input into the input layer.
The input data may then be fed into the LSTM blocks. Depending on the nature of the data, the LSTM blocks may be composed of different numbers of LSTM layers.
The output of the LSTM blocks may then be fed into a fully connected NN output layer. Multi-task learning may be provided by jointly predicting multiple outputs, with each of the outputs of the output layer corresponding to one of the single learning tasks. Depending on details of the forecasting tasks, there may be different inputs and outputs.
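By way of illustration only, a cluster-specific MTL-LSTM model of this kind might be sketched as follows, assuming a PyTorch implementation in which a shared LSTM block feeds a fully connected layer having one output per house in the cluster; the framework choice, class name, and layer sizes are illustrative assumptions rather than requirements of the disclosure.

import torch
import torch.nn as nn

class MTLLSTMForecaster(nn.Module):
    """Shared LSTM block with a fully connected output layer producing one forecast per task (house)."""

    def __init__(self, num_features, num_houses, hidden_size=64, num_layers=2):
        super().__init__()
        # Shared LSTM block over the input time sequence (load, temperature, calendar features, etc.).
        self.lstm = nn.LSTM(input_size=num_features, hidden_size=hidden_size,
                            num_layers=num_layers, batch_first=True)
        # Fully connected output layer; each output corresponds to one individual learning task (one house).
        self.output_layer = nn.Linear(hidden_size, num_houses)

    def forward(self, x):
        # x has shape (batch, time_steps, num_features).
        lstm_out, _ = self.lstm(x)
        # Use the hidden state of the last time step to jointly predict all tasks.
        return self.output_layer(lstm_out[:, -1, :])

# Example: a window of 4 time steps with 10 features, jointly forecasting 5 houses.
model = MTLLSTMForecaster(num_features=10, num_houses=5)
next_hour_loads = model(torch.randn(8, 4, 10))  # shape: (8, 5)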
According to an embodiment, a method may be used to forecast a load for a single commodity consuming object, as opposed to the aggregate load forecasting of method 100. For example, in single house load forecasting, the houses 1-N in a group may be clustered similarly to the clustering performed by method 100. However, the cluster-specific forecasting model for single home load forecasting may be different from the cluster-specific forecasting model for aggregate load forecasting.
As shown in
In the method for forecasting a load for a single commodity consuming object, the forecast may not consider information for clusters other than the cluster including the single house being forecast. That is, the forecasting model 400 may be trained based on historical time sequence data from only houses of the cluster including the target house being forecast, may input current time sequence data from only houses of the cluster including the target house, and may determine the load forecast for the target house based on only the output of the model 400.
The forecasting method may be performed by electronic device 500 of
As shown in
Bus 510 may include a circuit for connecting the components 520, 530, 540, and 550 with one another. Bus 510 may function as a communication system for transferring data between the components, or between electronic devices.
Processor 520 may include one or more of a central processing unit (CPU), a graphics processing unit (GPU), an accelerated processing unit (APU), a many integrated core (MIC) processor, a field-programmable gate array (FPGA), or a digital signal processor (DSP). Processor 520 may control at least one of the other components of electronic device 500, and/or perform an operation or data processing relating to communication. Processor 520 may execute one or more programs stored in memory 530.
Memory 530 may include a volatile and/or a non-volatile memory. Memory 530 may store information, such as one or more commands, data, programs (one or more instructions), or applications, etc., that is related to at least one other component of the electronic device 500 and for driving and controlling electronic device 500. For example, commands or data may formulate an operating system (OS). Information stored in memory 530 may be executed by processor 520.
The application may include one or more embodiments as discussed above. These functions can be performed by a single application or by multiple applications that each carry out one or more of these functions.
Display 550 may include, for example, a liquid crystal display (LCD), a light emitting diode (LED) display, an organic light emitting diode (OLED) display, a quantum-dot light emitting diode (QLED) display, a microelectromechanical systems (MEMS) display, or an electronic paper display. Display 550 can also be a depth-aware display, such as a multi-focal display. Display 550 is able to present, for example, various contents (such as text, images, videos, icons, or symbols).
Interface 540 may include input/output (I/O) interface 541, communication interface 542, and/or one or more sensors 543. I/O interface 541 serves as an interface that can, for example, transfer commands or data between a user or other external devices and other component(s) of electronic device 500.
Sensor(s) 543 may meter a physical quantity or detect an activation state of electronic device 500 and may convert metered or detected information into an electrical signal. For example, sensor(s) 543 may include one or more cameras or other imaging sensors for capturing images of scenes. The sensor(s) 543 may also include a microphone, a keyboard, a mouse, one or more buttons for touch input, a gyroscope or gyro sensor, an air pressure sensor, a magnetic sensor or magnetometer, an acceleration sensor or accelerometer, a grip sensor, a proximity sensor, a color sensor (such as a red green blue (RGB) sensor), a bio-physical sensor, a temperature sensor, a humidity sensor, an illumination sensor, an ultraviolet (UV) sensor, an electromyography (EMG) sensor, an electroencephalogram (EEG) sensor, an electrocardiogram (ECG) sensor, an infrared (IR) sensor, an ultrasound sensor, an iris sensor, or a fingerprint sensor. The sensor(s) 543 can further include an inertial measurement unit. In addition, sensor(s) 543 can include a control circuit for controlling at least one of the sensors included here. Any of these sensor(s) 543 can be located within or coupled to electronic device 500. Sensor(s) 543 may be used to detect touch input, gesture input, hovering input using an electronic pen or a body portion of a user, etc.
Communication interface 542, for example, may be able to set up communication between electronic device 500 and an external electronic device (such as a first electronic device 502, a second electronic device 504, or a server 506 as shown in
The first and second external electronic devices 502 and 504 and server 506 may each be a device of a same or a different type than electronic device 500. According to some embodiments, server 506 may include a group of one or more servers. Also, according to some embodiments, all or some of the operations executed on electronic device 500 may be executed on another or multiple other electronic devices (such as electronic devices 502 and 504 or server 506). Further, according to some embodiments, when electronic device 500 should perform some function or service automatically or at a request, electronic device 500, instead of executing the function or service on its own or additionally, can request another device (such as electronic devices 502 and 504 or server 506) to perform at least some functions associated therewith. The other electronic device (such as electronic devices 502 and 504 or server 506) may be able to execute the requested functions or additional functions and transfer a result of the execution to electronic device 500. Electronic device 500 can provide a requested function or service by processing the received result as it is or additionally. To that end, a cloud computing, distributed computing, or client-server computing technique may be used, for example. While
Server 506 may include the same or similar components 510, 520, 530, 540, and 550 as electronic device 500 (or a suitable subset thereof). Server 506 may support driving electronic device 500 by performing at least one of operations (or functions) implemented on electronic device 500. For example, server 506 can include a processing module or processor that may support processor 520 of electronic device 500.
The wireless communication may be able to use at least one of, for example, long term evolution (LTE), long term evolution-advanced (LTE-A), 5th generation wireless system (5G), millimeter-wave or 60 GHz wireless communication, Wireless USB, code division multiple access (CDMA), wideband code division multiple access (WCDMA), universal mobile telecommunication system (UMTS), wireless broadband (WiBro), or global system for mobile communication (GSM), as a cellular communication protocol. The wired connection may include, for example, at least one of a universal serial bus (USB), high definition multimedia interface (HDMI), recommended standard 232 (RS-232), or plain old telephone service (POTS). The network 610 or 612 includes at least one communication network, such as a computer network (like a local area network (LAN) or wide area network (WAN)), Internet, or a telephone network.
Although
The forecasting method may be written as computer-executable programs or instructions that may be stored in a medium.
The medium may continuously store the computer-executable programs or instructions, or temporarily store the computer-executable programs or instructions for execution or downloading. Also, the medium may be any one of various recording media or storage media in which a single piece or plurality of pieces of hardware are combined, and the medium is not limited to a medium directly connected to electronic device 500, but may be distributed on a network. Examples of the medium include magnetic media, such as a hard disk, a floppy disk, and a magnetic tape, optical recording media, such as CD-ROM and DVD, magneto-optical media such as a floptical disk, and ROM, RAM, and a flash memory, which are configured to store program instructions. Other examples of the medium include recording media and storage media managed by application stores distributing applications or by websites, servers, and the like supplying or distributing other various types of software.
The forecasting method may be provided in a form of downloadable software. A computer program product may include a product (for example, a downloadable application) in a form of a software program electronically distributed through a manufacturer or an electronic market. For electronic distribution, at least a part of the software program may be stored in a storage medium or may be temporarily generated. In this case, the storage medium may be a server or a storage medium of server 506.
In operation 710, reference time series commodity consumption data and corresponding support data may be obtained for a plurality of commodity consuming objects (i.e., reference data is obtained). For example, for electric load forecasting, historical electric load data (consumption data) for each of a plurality of houses may be acquired over a past month, along with temperature, weather, time, and day-of-the-week data (support data) acquired in conjunction with the electric load data over the same month.
In an embodiment that forecasts communication system traffic, traffic load data (consumption data) acquired during a past year may be obtained along with time, day of the week, day of the month, and day of the year data (support data) that corresponds to the acquired historical traffic load data.
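Purely as an illustrative sketch, and assuming hypothetical column names and an hourly resolution (neither of which is required by the disclosure), the reference consumption data and its corresponding support data for the electric load example might be assembled as follows.

import pandas as pd

# Hypothetical hourly records with columns: timestamp, house_id, load_kwh, temperature.
df = pd.read_csv("hourly_load.csv", parse_dates=["timestamp"])

# Derive calendar (support) features from the timestamp.
df["hour"] = df["timestamp"].dt.hour
df["day_of_week"] = df["timestamp"].dt.dayofweek
df["is_weekend"] = (df["day_of_week"] >= 5).astype(int)

# Reference data: consumption data plus its corresponding support data, per house and per hour.
reference_data = df[["timestamp", "house_id", "load_kwh", "temperature",
                     "hour", "day_of_week", "is_weekend"]]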
In operation 720, the plurality of commodity consuming objects may be clustered into clusters based on their historical consumption data.
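As a non-limiting example, one possible realization of this clustering operation is sketched below, assuming k-means applied to normalized average daily load profiles; the disclosure is not limited to this particular clustering technique, and the function and variable names are illustrative.

import numpy as np
from sklearn.cluster import KMeans

def cluster_houses(daily_profiles, num_clusters):
    """daily_profiles: (num_houses, 24) array of each house's average hourly consumption."""
    # Normalize each profile so that clustering reflects the shape of consumption, not its magnitude.
    norms = np.linalg.norm(daily_profiles, axis=1, keepdims=True)
    normalized = daily_profiles / np.maximum(norms, 1e-8)
    # labels[i] is the cluster index assigned to house i.
    return KMeans(n_clusters=num_clusters, n_init=10, random_state=0).fit_predict(normalized)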
As shown in
In operation 730, cluster models may be obtained for each cluster. According to an embodiment, obtaining a cluster model may include training an MTL-LSTM forecasting model with reference data (e.g. consumption data and corresponding support data).
As shown in
ℒ_MTL = (ℒ_1(ŷ_i, y_i), . . . , ℒ_k(ŷ_i, y_i))   [1]
In Equation 1, ℒ_k(ŷ_i, y_i) is the individual loss function for the k-th task, and the joint loss ℒ_MTL combines the individual losses ℒ_1 through ℒ_k. Each individual loss function may correspond to a commodity consuming object in the cluster corresponding to the cluster model. According to an embodiment, for aggregate load forecasting, all of the individual learning tasks may be treated equally.
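For illustration, if the joint loss of Equation 1 is realized as an equally weighted sum of per-house mean squared errors (one possible choice consistent with treating all tasks equally, and an assumption rather than a requirement), it might be sketched as follows.

import torch
import torch.nn.functional as F

def aggregate_joint_loss(predictions, targets):
    """predictions, targets: tensors of shape (batch, num_tasks), one column per house in the cluster."""
    # Individual loss L_k for each task k, here the mean squared error of house k's forecast.
    per_task_losses = F.mse_loss(predictions, targets, reduction="none").mean(dim=0)
    # All individual learning tasks treated equally: an equally weighted sum of the individual losses.
    return per_task_losses.sum()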
In operation 740, future commodity consumption for each commodity consuming object may be forecasted by inputting present data into the obtained cluster models. As such, the future commodity consumption may be determined for each cluster using the corresponding cluster model. That is, each cluster model may receive, as input, present data corresponding to the commodity consuming objects in its cluster and may output a commodity consumption forecast for each object in the cluster.
For example, when forecasting electric loads of houses, each cluster of houses may be forecasted independently using its respective cluster model. For each cluster, the consumption data of each house in the cluster for the previous three hours and the current hour, the temperature data for each house for the previous three hours and the current hour, and the day of the week (e.g., weekday versus weekend) may be input into the trained cluster model to predict the future electricity consumption (e.g., for the next hour) of each house in the cluster.
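As an illustrative sketch only, the present data for one cluster might be arranged into a model input as shown below; the window length, feature ordering, and variable names are assumptions rather than requirements.

import numpy as np
import torch

def build_cluster_input(load_history, temp_history, is_weekend):
    """load_history and temp_history: (num_houses, 4) arrays for the previous three hours and the current hour."""
    window = load_history.shape[1]
    # Per time step, concatenate every house's load, every house's temperature, and the day-of-week flag.
    features = np.concatenate(
        [load_history.T, temp_history.T, np.full((window, 1), is_weekend)], axis=1)
    # Shape (1, 4, 2 * num_houses + 1): one window to feed the trained cluster model.
    return torch.tensor(features, dtype=torch.float32).unsqueeze(0)

# next_hour_loads = trained_cluster_model(build_cluster_input(load_history, temp_history, is_weekend))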
In operation 750, a final load forecast may be obtained by combining the forecasted future commodity consumption for each of the clusters. A per-cluster forecast may be obtained using a fully connected neural network layer that receives, as input, each forecast for the cluster. The per-cluster forecasts may then be combined using a fully connected neural network output layer to form a final forecast of the aggregated electric load of the houses being forecasted.
Algorithm 1 below shows an embodiment of an aggregate load forecasting algorithm that is consistent with method 700.
In operation 810, reference time series data may be obtained for each commodity consuming object in a group.
In operation 820, the plurality of commodity consuming objects may be clustered into a plurality of clusters based on historical consumption data.
In operation 830, a cluster model for a cluster including a target commodity consuming object (“target object”) being forecasted may be obtained. According to an embodiment, the cluster model may be an MTL-LSTM forecasting model. The MTL-LSTM model may be designed to optimize a loss function that focuses on improving the forecasting accuracy of the target object. The MTL-LSTM model may include a main task (target object) and auxiliary tasks (objects in the cluster other than the target object).
According to an embodiment, the joint loss function may be defined by the following Equation 2.
ℒ(Y, I) = λ_m ℒ_m(ŷ_m, y_m) + λ_1 ℒ_1(ŷ_1, y_1) + . . . + λ_T ℒ_T(ŷ_T, y_T)   [2]
In Equation 2, ℒ_m denotes the loss for the main task (target object), ℒ_t denotes the loss for the t-th auxiliary task, and λ_m and λ_t are the associated weights. Y = (y_m, y_1, y_2, . . . , y_T) is a vector composed of the real values of the different tasks, and I is the input vector for all tasks.
In operation 840, the future commodity consumption for the target object may be forecasted using the MTL-LSTM model obtained in operation 830. As discussed above, the MTL-LSTM model may optimize forecasting accuracy for the target home.
The future commodity consumption for the target object may be forecasted by inputting present data corresponding to the commodity consuming objects of the cluster into the MTL-LSTM model obtained in operation 830. An output of the MTL-LSTM model may then be used in forecasting the future commodity consumption of the target object.
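By way of illustration only, a weighted joint loss consistent with Equation 2 might be sketched as follows, assuming mean squared error per task and, for brevity, a single shared weight for all auxiliary tasks; these choices and the names used are assumptions rather than requirements of the disclosure.

import torch
import torch.nn.functional as F

def single_object_joint_loss(predictions, targets, main_index, lambda_main, lambda_aux):
    """predictions, targets: (batch, num_tasks); column main_index corresponds to the target object."""
    # Per-task loss, here mean squared error for each task (house) in the cluster.
    per_task = F.mse_loss(predictions, targets, reduction="none").mean(dim=0)
    aux_mask = torch.ones_like(per_task)
    aux_mask[main_index] = 0.0
    # lambda_m * L_m for the main task plus a weighted sum of the auxiliary task losses (cf. Equation 2).
    return lambda_main * per_task[main_index] + lambda_aux * (per_task * aux_mask).sum()

# The weights lambda_main and lambda_aux are hyperparameters and may be tuned on a validation set.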
Algorithm 1 above may be adapted to an algorithm (“Algorithm 2”) for single-object load forecasting. Algorithm 2 may have the same general structure as Algorithm 1, but with the following modifications. In the clustering stage, the target house may be assigned to one of the clusters. The other houses in the same cluster as the target house may be used as auxiliary houses to assist the target house's load forecasting. In the multi-task learning stage, the multi-task learning may be implemented based on the overall loss composed of the loss function defined in Equation 2. The weights for the different tasks may be hyperparameters and can be determined by checking performance on a validation set.
This application is based on and claims priority under 35 U.S.C. § 119 to U.S. Provisional Patent Application No. 63/133,078, filed on Dec. 31, 2020, in the U.S. Patent & Trademark Office, the disclosure of which is incorporated by reference herein in its entirety.