The present invention relates to a method for controlling an electrical microgrid. The invention further relates to an associated computer program product.
One of the challenges of our century is to reduce greenhouse gas emissions. To meet such challenge, many investments relate to the development of renewable energies and distributed energy resources (DER). As renewable energy sources, such as solar and wind energy sources, have stochastic character, electricity grid infrastructures need to be adapted in order to maintain the reliability and the stability of the electricity grid.
To this end, microgrids, used for the integration of renewable energy sources into electricity grids, have been developed. A microgrid is a power grid which includes renewable energy sources (wind turbines or photovoltaic panels), traditional fossil energy sources (diesel generator), energy storage devices (batteries), energy-consuming loads and an energy management system. A microgrid operates either connected to the main grid or disconnected from the main grid, in isolated mode. A microgrid is also suitable for being completely disconnected from the main grid (off-network).
One of the elements making the operation of a microgrid possible is the energy management system of the microgrid.
In particular, energy management systems are known which are based on a module predicting, over the next hours, the power produced by the renewable energy sources (photovoltaic panels) and the consumption of the loads. The different units of the grid are then managed according to an optimization method using the predictions of the module.
However, such a prediction module is not suitable for coping with changing and unforeseen conditions. Furthermore, it is complex to implement.
Other energy management systems based on machine learning (trained models) have also been developed. Such systems are used for the control of the microgrids for which they have been trained.
However, training such models is time-consuming and resource-intensive, making such a solution complex to deploy on a large scale.
Still other means of managing microgrids are presented in the documents US 2017/194814 A and CN 112 117 760 A. The article M. Rawa et al., “An Efficient Scheme for Determining the Power Loss in Wind-PV Based on Deep Learning,” in IEEE Access, vol. 9, pp. 9481-9492, 2021, doi: 10.1109/ACCESS.2020.3046687 describes a method using deep learning for determining power losses in wind and solar energy systems.
There is thus a need for a tool which would facilitate the control of different microgrids, while eliminating the need for a prediction module.
To this end, the subject matter of the present description is a method for controlling at least one electrical microgrid, each electrical microgrid comprising at least one electrical energy consumption element, at least one electrical energy production element and at least one electrical energy storage element, each microgrid being suitable for assuming a plurality of energy states, each energy state being defined by a quantity of electrical energy to be exchanged between elements of the microgrid and by a quantity of electrical energy stored on the at least one electrical energy storage element, each microgrid being apt to switch from one state to another by the implementation of an action on the microgrid among a set of predefined actions, the method comprising the phases of:
According to other particular embodiments, the method comprises one or more of the following features, taken individually or according to all technically possible combinations:
The present description also relates to a computer program product comprising a readable storage medium on which is stored a computer program comprising program instructions, the computer program being loadable on a data processing unit and suitable for leading to the implementation of the method such as described hereinabove when the computer program is run on the data processing unit.
The present description further relates to a readable information medium on which is stored a computer program product such as described hereinabove.
Other features and advantages of the invention will appear upon reading the following description of the embodiments of the invention, given only as an example, and making reference to the following drawings:
An example of microgrid 10 is illustrated in
The microgrid 10 is apt to assume a plurality of energy states St. Each energy state St is defined by a quantity of electrical energy to be exchanged PNet between elements of the microgrid 10 and by a quantity of electrical energy stored EBcap on the at least one electrical energy storage element 19.
For example, the quantity of electric energy to be exchanged PNet is the difference between the quantity of electric energy produced PPV by the at least one renewable energy production element 18 and the quantity of electric energy demanded PC by the at least one electric energy consumption element 14. The quantity of electrical energy to be exchanged PNet is, in such case, a quantity of electrical energy to be exchanged between the elements of the microgrid 10 with the exception of the at least one renewable energy production element 18.
The microgrid 10 is apt to switch from one state St to another by implementing an action At on the microgrid 10 from a set EA of predefined actions.
For example, set EA of predefined actions includes at least one of the following actions:
The microgrid 10 is suitable for operating in a given environment, among a set of predefined environments. The environment is e.g. a given geographical area.
The environment influences in particular the quantity of electrical energy to be exchanged PNet. E.g. the environment influences at least one of the quantity of electric energy produced PPV by the at least one renewable energy production element 18 and the quantity of electric energy demanded PC by the at least one electric energy consumption element 14.
A predefined environment refers e.g. to a set of environments having similar profiles in terms of the quantity of electrical energy PPV produced by the at least one renewable energy production element 18 and the quantity of electrical energy PC demanded by the at least one electrical energy consumption element 14.
For example, for two microgrids 10 operating in distinct environments:
The microgrid 10 is suitable for operating according to a given operating mode, among a set of predefined operating modes. The operating mode advantageously relates to whether or not the microgrid 10 is connected to an electrical power distribution grid (main electrical grid). The operating mode of a microgrid 10 defines in particular the actions At suitable for being implemented on the microgrid 10 among the set EA of predefined actions.
Advantageously, the predefined operating modes comprise at least one of the following operating modes, preferentially the following three operating modes:
In particular, for the isolated mode or the intermediate mode operating in isolation, the actions A4, A5 and A6 are not possible because the microgrid 10 is not connected to an electrical power distribution grid. On the other hand, for the connected mode or the intermediate mode operating in connected mode, all the actions A1 to A7 are possible.
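As a minimal sketch of how the operating mode restricts the set EA of predefined actions, the filter below can be considered. Only the labels A1 to A7 and the exclusion of A4, A5 and A6 in isolation come from the description; the names `OperatingMode` and `allowed_actions` are hypothetical, and the intermediate mode is assumed to be handled by passing whichever of the two situations currently applies.

```python
from enum import Enum

# Labels of the predefined actions; their meaning is not modelled here.
ALL_ACTIONS = ("A1", "A2", "A3", "A4", "A5", "A6", "A7")
# Actions that require a connection to the electrical power distribution grid.
GRID_ONLY_ACTIONS = frozenset({"A4", "A5", "A6"})


class OperatingMode(Enum):
    ISOLATED = "isolated"
    CONNECTED = "connected"


def allowed_actions(mode: OperatingMode) -> list:
    """Return the actions that can be implemented in the given operating mode."""
    if mode is OperatingMode.CONNECTED:
        return list(ALL_ACTIONS)
    # In isolation, the grid-dependent actions are removed.
    return [a for a in ALL_ACTIONS if a not in GRID_ONLY_ACTIONS]
```

For instance, `allowed_actions(OperatingMode.ISOLATED)` keeps only A1, A2, A3 and A7.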
The electric power transmission grid 12 is configured for receiving the electric power produced or stored by the elements connected to said electric power transmission grid 12 and for distributing the received electric power to the elements connected to said electric power transmission grid 12.
The connection between each element and the electrical energy transmission grid 12 is e.g. established by a “machine to machine” protocol.
Each element of the microgrid 10 is suitable for being connected to or disconnected from the electrical power transmission grid 12.
An electrical energy consumption element 14 is an element apt to consume electrical energy. An electrical energy consumption element 14 is e.g. an electrical lighting or heating network of a commercial or residential building, an electric vehicle, or further operational equipment.
A fossil energy production element 16 is an element apt to produce fossil energy. Fossil energy is produced from the sedimentary decomposition of organic matter, i.e. matter composed mainly of carbon. A fossil energy production element 16 uses primary resources such as oil, natural gas or coal. A fossil energy production element 16 is e.g. a coal power plant, a fuel oil power plant, a gas power plant or a diesel generator.
A renewable energy production element 18 is an element apt to produce renewable energy. Renewable energy is energy coming from cyclic or constant natural phenomena induced e.g. by celestial bodies: mainly the Sun, for the heat and light it generates, but also the attraction of the Moon (tides) and the heat generated by the Earth (geothermal energy). A renewable energy production element 18 is e.g. a hydroelectric dam, a hydroelectric power plant, a set of wind turbines or a set of solar panels.
An electrical energy storage element 19 is an element apt to store electrical energy. An electrical energy storage element 19 is e.g. an electrical energy accumulator such as a battery. An electrical energy storage element 19 works as a generator of electrical energy when discharging, and as a consumer of electrical energy when charging.
The tool 13 is configured for controlling the quantities of electrical energy exchanged between the elements of the microgrid 10.
In the example illustrated in
The calculator 20 is preferentially a computer.
More generally, the calculator 20 is an electronic calculator suitable for manipulating and/or transforming data represented as electronic or physical quantities in registers of the calculator 20 and/or memories into other similar data corresponding to physical data in memories, registers or other types of display, transmission or storage.
The calculator 20 interacts with the computer program product 22.
As shown in
The computer program product 22 includes a storage medium 36.
The storage medium 36 is a medium readable by the calculator 20, usually by the data processing unit 26. The readable storage medium 36 is a medium suitable for storing electronic instructions and apt to be coupled to a bus of a computer system.
As an example, the storage medium 36 is a USB key, a diskette or a floppy disk, an optical disk, a CD-ROM, a magneto-optical disk, a ROM, a RAM, an EPROM, an EEPROM, a magnetic card or an optical card.
The computer program 22 containing program instructions is stored on the storage medium 36.
The computer program 22 can be loaded into the data processing unit 26 and is suitable for the implementation of a method for controlling the microgrid 10 when the computer program 22 is implemented on the processing unit 26 of the calculator 20. Such a control method will be described hereinafter in the description.
The operation of the control tool 13, i.e. of the calculator 20 in interaction with the computer program product 22 will now be described with reference to
The control method comprises a phase 100 of providing a model, called source model MS, trained on a source domain DS for learning a source set of tasks TS aimed at controlling a given microgrid, called source microgrid 10S. The term “domain” refers to a space of input characteristics and a marginal probability distribution. The term “task set” refers to an output feature space and an objective prediction function.
More particularly, the source model MS was trained for determining an action, among the set EA of predefined actions (e.g. described hereinabove), for controlling the source microgrid 10S, depending on the state St of the source microgrid 10S.
The source microgrid 10S is suitable for operating in a given environment, called source environment ES, and according to a given operating mode, called source operating mode FS. The source environment ES delimits the source domain DS. The source operating mode FS delimits the source set of tasks TS.
The source model MS comprises in particular, parameters w the values of which are optimized for the source domain DS and the source set of tasks TS. In one example, the source model MS is a neural network comprising an input neural layer CE, an output neural layer CS and intermediate neural layers Cint. The parameters w of the source model MS then define the synaptic weights P between the neurons of consecutive layers. Examples of neural networks are illustrated in
In particular,
The control method comprises a phase 110 of providing a model, called target model MC, suitable for being trained on a target domain DC for learning a target set of tasks TC, aimed at controlling a given microgrid, called target microgrid 10C.
More particularly, the target model MC is intended to be trained for determining an action At, from among the set EA of predefined actions (e.g. described hereinabove), for controlling the target microgrid 10C, depending on the state St of the target microgrid 10C.
The target microgrid 10C is suitable for operating in a given environment, called target environment EC, and according to a given operating mode, called target operating mode FC. The target environment EC delimits the target domain DC. The target operating mode FC delimits the target set of tasks TC.
The target microgrid 10C differs from the source microgrid 10S in that:
The target model MC comprises parameters w suitable for being optimized for the target domain DC and the target set of tasks TC. When the source model MS is a neural network (see example above), the target model MC is also a neural network comprising an input neural layer CE, a layer of output neurons CS and intermediate layers of neurons Cint. The parameters w of the target model MC then define the synaptic weights P between the neurons of consecutive layers.
The control method comprises a phase 120 of extraction of parameter values w from the source model MS. The extraction phase 120 is implemented by the calculator 20 in interaction with the computer program product 22, i.e. is implemented by computer.
In one embodiment, the parameter values w extracted from the source model MS, define at least the synaptic weights P between the neurons of the input layer CE and of the intermediate layer Cint of neurons consecutive to the input layer CE, so-called first intermediate layer. Preferentially, the parameter values w extracted from the source model MS also define the synaptic weights P between the neurons of a plurality of intermediate layers Cint of neurons, consecutive to the first intermediate layer. In the example illustrated in
The control method comprises a phase 130 of initialization of parameters w of the target model MC with the parameter values w extracted from the source model MS, so as to obtain an initialized target model MC. The initialization phase 130 is implemented by the calculator 20 in interaction with the computer program product 22, i.e. is implemented by computer.
Thereby, when the models MS, MC are neural networks, the synaptic weights P between the neurons of the layers of the target model MC are initialized with the values of the synaptic weights P between the neurons of the corresponding layers in the source model MS. In the example illustrated in
In one embodiment, at least one parameter w of the target model MC which was initialized with an extracted value, is frozen. In one variant, the above applies to all parameters w of the target model MC which were initialized with extracted values. In other words, the above means that the values of the parameters w cannot be subsequently modified, in particular during the optimization phase described hereinafter.
In a variant, all the parameters w of the target model MC, even the initialized parameters, can be modified during the optimization phase 140.
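The extraction phase 120, the initialization phase 130 and the optional freezing can be sketched as follows. This is only an illustration under stated assumptions: the networks are represented as plain lists of NumPy weight matrices, the layer sizes (2 inputs for PNet and EBcap, two intermediate layers of 16 neurons, 7 outputs for A1 to A7) are chosen arbitrarily, and the helper names `init_mlp` and `transfer_weights` are hypothetical.

```python
import numpy as np


def init_mlp(layer_sizes, rng):
    """Create one randomly initialized weight matrix per pair of consecutive layers."""
    return [rng.normal(0.0, 0.1, size=(n_in, n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]


def transfer_weights(source_weights, target_weights, n_transferred, freeze=True):
    """Copy the first `n_transferred` weight matrices of the source model into
    the target model and return the new weights plus a per-layer frozen flag."""
    new_weights = [w.copy() for w in target_weights]
    frozen = [False] * len(target_weights)
    for i in range(n_transferred):
        if source_weights[i].shape != target_weights[i].shape:
            raise ValueError(f"layer {i}: incompatible shapes")
        new_weights[i] = source_weights[i].copy()  # extraction + initialization
        frozen[i] = freeze                         # frozen layers are not optimized later
    return new_weights, frozen


rng = np.random.default_rng(0)
sizes = [2, 16, 16, 7]            # assumed architecture, for illustration only
source = init_mlp(sizes, rng)     # stands in for the trained source model MS
target = init_mlp(sizes, rng)     # stands in for the untrained target model MC
target, frozen = transfer_weights(source, target, n_transferred=2)
```

Calling `transfer_weights(..., freeze=False)` corresponds to the variant in which the initialized parameters remain modifiable during optimization.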
The control method comprises a phase 140 of optimization, depending on the target domain DC and on the target set of tasks TC, of the parameters w of the initialized target model MC, so as to obtain a target model MC trained for the control of the target microgrid 10C. The optimization phase 140 is implemented by the calculator 20 in interaction with the computer program product 22, i.e. is implemented by computer.
In one example, the optimization phase 140 comprises steps 140A of generating training data and steps 140B of training the target model MC based on the generated training data. The generation 140A and training 140B steps are repeated in successive iterations.
During the generation steps 140A, a model to be trained (agent) interacts with an environment according to the principle of Deep Reinforcement Learning. The model to be trained is e.g. a neural network.
In particular, as illustrated in
The replay memory MR is typically initialized at startup, i.e. at the start of the very first generation step 140A. Once the maximum capacity of the replay memory MR is reached, the replay memory MR works e.g. according to the “First-In First-Out (FIFO)” model.
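A replay memory with FIFO eviction can be sketched with a bounded double-ended queue. The class name `ReplayMemory`, its methods and the capacity value are illustrative assumptions, not elements of the description.

```python
import random
from collections import deque


class ReplayMemory:
    """Bounded buffer: once full, the oldest entry is dropped first (FIFO)."""

    def __init__(self, capacity):
        self._buffer = deque(maxlen=capacity)

    def push(self, transition):
        # When the buffer is full, deque silently discards the oldest element.
        self._buffer.append(transition)

    def sample(self, batch_size, rng=random):
        # Draw without replacement, never more than what is stored.
        k = min(batch_size, len(self._buffer))
        return rng.sample(list(self._buffer), k)

    def __len__(self):
        return len(self._buffer)


memory = ReplayMemory(capacity=3)
for item in range(5):
    memory.push(item)  # items 0 and 1 are evicted once capacity is exceeded
```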
In the present case, the environment E was configured for simulating the operation of a target microgrid 10C. E.g. the simulation was carried out according to the principle of a Markov decision process. The successive interactions between the target model MC to be trained and the environment E are used for obtaining a target model MC trained for the control of a target microgrid 10C.
An example of the implementation of the different steps of the generation phase is given hereinafter.
The step 140A aims to generate a set of training data depending on the target domain DC and on the target set of tasks TC.
The generation step 140A comprises a sub-step 140A-1 for the reception of initial data or of data coming from a preceding iteration. Such data are specific to the target domain DC.
In an example of implementation, the data received, whether initial or coming from a preceding iteration, comprise a set of predetermined values of quantities of electrical energy to be exchanged PNet and a set of possible initial values of quantity of electrical energy EBcap stored on the at least one electrical energy storage element 19.
The values of quantities of electrical energy to be exchanged PNet were predetermined e.g. for each time step of a predefined period of time. The predefined period of time is e.g. one year and the time steps are one hour.
Each value of the quantity of electrical energy to be exchanged PNet for a time step is e.g. the difference between the value of the electric energy PPV produced by the at least one renewable energy production element 18 for said time step and the value of the electric energy demanded PL by the at least one electric energy consumption element 14 for said time step. Thereby, one has:
$$P_{Net}(t) = P_{PV}(t) - P_{L}(t) \qquad (1)$$
The values of electric energy PPV produced by the at least one renewable energy production element 18 and of electric energy PL demanded by the at least one electric energy consumption element 14, were predetermined e.g. for each time step of the predefined period of time. Such values are e.g. derived from measurements carried out by sensors on existing installations or were randomly generated beforehand.
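Equation (1) can be applied to predetermined series as in the following sketch. The function name `net_power` and the sample values are assumptions made for illustration; a one-year period with hourly time steps is taken as 365 × 24 values.

```python
HOURS_PER_YEAR = 365 * 24  # number of hourly time steps in a non-leap year


def net_power(p_pv, p_load):
    """Return PNet(t) = PPV(t) - PL(t) for every time step."""
    if len(p_pv) != len(p_load):
        raise ValueError("production and demand series must have the same length")
    return [pv - load for pv, load in zip(p_pv, p_load)]


# Three illustrative hourly values (arbitrary numbers).
p_pv = [3.0, 0.0, 1.5]
p_load = [1.0, 2.0, 1.5]
p_net = net_power(p_pv, p_load)
```

A positive value indicates that production exceeds demand for the time step, a negative value the opposite.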
The possible initial values of the quantity of electrical energy initially stored EBcap on the at least one electrical energy storage element 19, are predefined values. The possible values are e.g. 0 kilowatt hours (kWh), 5 kWh and 10 kWh.
In the same example of implementation, when the received data come from a preceding iteration, the received data comprise at least one of the following:
The generation step 140A comprises a sub-step 140A-2 of obtaining, from the received data, a current model suitable for determining an action At for controlling a microgrid 10C, among a set EA of predefined actions, depending on a state St of the microgrid 10C.
In an example of implementation, the current model is the initialized target model received when the data are initial data and is, otherwise, the optimized model during the last iteration.
The set EA of predefined actions is e.g. as defined above. The possible actions At are in particular set by the target set of tasks TC.
The generation step 140A comprises a sub-step 140A-3 for determining, from the received data, a current time step Δt.
In an example of implementation, the current time step Δt is:
The generation step 140A comprises a sub-step 140A-4 for obtaining, from the received data, a current state St of a microgrid 10.
In one embodiment, the current state St is either an initial state S0 when the current time step Δt is an initialized time step Δt0, or a following state St+1 obtained during the last iteration.
When the current state St is an initial state S0, the initial state S0 is defined by the predetermined value of the quantity of electrical energy to be exchanged PNet corresponding to the current time step (first time step Δt0) and by a stored electrical energy quantity value EBcap chosen randomly from the set of possible values of quantity of electrical energy stored.
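The construction of the initial state S0 can be sketched as below. The `State` tuple and the function `initial_state` are hypothetical names; the candidate values 0, 5 and 10 kWh are the example values given above.

```python
import random
from typing import NamedTuple


class State(NamedTuple):
    p_net: float   # quantity of electrical energy to be exchanged, PNet
    e_bcap: float  # quantity of electrical energy stored, EBcap (kWh)


POSSIBLE_INITIAL_E_BCAP = (0.0, 5.0, 10.0)


def initial_state(p_net_series, rng):
    """Build S0 from the first predetermined PNet value and a stored energy
    value drawn at random from the set of possible initial values."""
    return State(p_net=p_net_series[0],
                 e_bcap=rng.choice(POSSIBLE_INITIAL_E_BCAP))


s0 = initial_state([2.0, -2.0, 0.0], random.Random(0))
```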
The generation step 140A comprises a sub-step 140A-5 of determination, by the current model, of an action At for controlling the microgrid 10 depending on the current state St according to a learning technique. The learning technique is e.g. a Q-Learning or a Double-Q-Learning technique, used e.g. with an action selection strategy such as the epsilon-greedy technique.
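An epsilon-greedy selection can be sketched as follows, assuming that the current model outputs one estimated value (Q-value) per predefined action. The function name and the way the Q-values are supplied are assumptions of this sketch.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action index with probability epsilon (exploration),
    otherwise the index of the highest estimated value (exploitation)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


rng = random.Random(0)
greedy_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0, rng=rng)
random_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=1.0, rng=rng)
```

In practice epsilon is often decreased over the iterations so that exploration dominates early and exploitation later; the description does not specify such a schedule.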
The generation step 140A comprises a sub-step 140A-6 of verification that predetermined constraints are satisfied by the action At determined depending on the current state St.
In one implementation mode, the predetermined constraints comprise at least one constraint selected from the following set of constraints:
$$P_{B}(t) + P_{G}(t) + P_{C}(t) = P_{Net}(t) \qquad (2)$$
$$E_{Bcap}(t) = E_{Bcap}(t-1) - P_{B}(t) \cdot \Delta t \qquad (3)$$
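Equations (2) and (3) can be checked numerically with the sketch below. The function names, the tolerance `tol` and the numeric values are assumptions; only the two relations themselves are taken from the text.

```python
def power_balance_ok(p_b, p_g, p_c, p_net, tol=1e-9):
    """Equation (2): PB(t) + PG(t) + PC(t) must equal PNet(t)."""
    return abs(p_b + p_g + p_c - p_net) <= tol


def next_stored_energy(e_prev, p_b, dt_hours=1.0):
    """Equation (3): EBcap(t) = EBcap(t-1) - PB(t) * dt."""
    return e_prev - p_b * dt_hours


balanced = power_balance_ok(p_b=1.0, p_g=0.5, p_c=0.0, p_net=1.5)
e_after = next_stored_energy(e_prev=5.0, p_b=2.0)  # one-hour time step
```

As written in equation (3), a positive PB(t) lowers the stored energy and a negative PB(t) raises it.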
The generation step 140A comprises a sub-step 140A-7 of determination of a reward Rt representative of the operational cost induced following the execution of the action At, and of an indicator indicating whether the next state St+1 obtained following the execution of the action At is a final state.
The reward Rt is representative of the operational cost induced following the execution of the action At.
In one embodiment, the reward Rt determined for each training datum is equal to the quantity of electrical energy to be exchanged PNet multiplied by a multiplicative coefficient selected from a set of multiplicative coefficients m, q, c depending on the action At determined. The multiplicative coefficients m, q, c represent the operational costs of at least one electrical energy storage element 19, of at least one fossil energy production element 16, and of the load curtailment, respectively.
For example, the reward Rt is equal to:
In an example of implementation, the reward Rt is calculated depending on a cost function that is sought to be minimized. The goal is to obtain a trained model minimizing the operational costs of the target microgrid 10C while satisfying predetermined constraints over the period of time T. In one example, the costs induced by the at least one renewable energy production element 18 are not included in the cost function and a fixed cost is assumed for the at least one fossil energy production element 16 and the at least one electrical energy storage element 19. In the present example, the objective function is thereby, the sum of the cumulative costs for operating the at least one fossil energy production element 16 and the at least one electrical energy storage element 19 over the period of time T with a set time step (e.g. 1 hour). To simplify, it is assumed e.g. that the electrical power at time t is the power during the interval [t, t+Δt]. The cost function is then formulated as follows:
$$J_{obj} = \sum_{t=0}^{T} \left( |P_{B}(t)| \cdot m + |P_{G}(t)| \cdot q + |P_{C}(t)| \cdot c \right) \qquad (4)$$
Where:
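As a minimal sketch, the cost function of equation (4) can be evaluated over series of PB, PG and PC values, with m, q and c being the operational cost coefficients of the storage element, of the fossil energy production element and of the load curtailment. The function name and all numeric values are illustrative assumptions.

```python
def operational_cost(p_b, p_g, p_c, m, q, c):
    """Equation (4): sum over the time steps of |PB|*m + |PG|*q + |PC|*c."""
    return sum(abs(b) * m + abs(g) * q + abs(x) * c
               for b, g, x in zip(p_b, p_g, p_c))


# Two time steps with arbitrary values and coefficients.
j_obj = operational_cost(p_b=[1.0, -2.0], p_g=[0.0, 3.0], p_c=[0.5, 0.0],
                         m=0.1, q=0.5, c=2.0)
```

With these values, the first time step contributes 0.1 + 0 + 1.0 and the second 0.2 + 1.5 + 0.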
In an example of implementation, the indicator indicating whether the next state St+1 obtained following the execution of the action At is a final state, is determined depending on the current time step Δt and on the verification carried out in the preceding step. Thereby, the final state is e.g. reached:
When the current time step Δt is not equal to the predetermined time step and when the constraints are verified, the following state St+1 obtained is not a final state.
A learning datum comprising at least the current state St, the following state St+1, the determined action At and the reward Rt, and advantageously a Boolean variable indicating whether or not the following state obtained is a final state, is then stored in the replay memory MR.
The generation step 140A comprises the repetition of the preceding sub-steps (140A-1 to 140A-7 of the generation step 140A) as long as the indicator indicates that the next obtained state St+1 is different from a final state. The set of learning data stored until a final state is reached forms a learning set.
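The repetition of the sub-steps until a final state is reached can be sketched as a loop producing one learning set. The sketch assumes, by contraposition of the condition stated above, that a state is final when the predetermined (last) time step is reached or when the constraints are not verified. The names `Transition`, `is_final`, `run_episode`, and the `policy` and `step` callables standing in for the current model and the simulated environment E are all hypothetical.

```python
from typing import NamedTuple


class Transition(NamedTuple):
    state: object
    action: int
    reward: float
    next_state: object
    done: bool  # Boolean variable: is the next state a final state?


def is_final(t, last_t, constraints_ok):
    """Final when the last time step is reached or a constraint is violated."""
    return t == last_t or not constraints_ok


def run_episode(initial_state, policy, step, last_t):
    """Collect learning data until a final state is obtained."""
    learning_set = []
    state, t = initial_state, 0
    while True:
        action = policy(state)
        next_state, reward, constraints_ok = step(state, action, t)
        done = is_final(t, last_t, constraints_ok)
        learning_set.append(Transition(state, action, reward, next_state, done))
        if done:
            return learning_set
        state, t = next_state, t + 1


# Toy stand-ins: the state is a counter and every action costs 1.
episode = run_episode(0, policy=lambda s: 0,
                      step=lambda s, a, t: (s + 1, -1.0, True), last_t=2)
```

Each returned transition would then be stored in the replay memory MR.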
Once the final state is obtained, the training step 140B is started.
The training step 140B is a training phase of the current model wherein at least one parameter w of the current model is optimized based on at least one training set stored in the replay memory MR, for obtaining an optimized model. The training technique used is e.g. based on a deep learning algorithm.
In one mode of implementation, only the non-frozen parameters w of the current model are optimized during the training step 140B.
Advantageously, the at least one parameter w of the model is optimized on the basis of a plurality of learning sets stored in the replay memory MR.
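One possible shape of the training step 140B is sketched below: Q-Learning targets are computed for a batch of learning data, and a gradient step is applied only to the non-frozen parameters. The discount factor `gamma`, the learning rate `lr`, plain gradient descent and both function names are assumptions of this sketch; the description only states that a deep learning algorithm is used.

```python
import numpy as np


def q_learning_targets(rewards, next_q_values, dones, gamma=0.99):
    """Target = reward + gamma * max_a Q(next_state, a), without the
    bootstrap term when the next state is a final state."""
    rewards = np.asarray(rewards, dtype=float)
    next_q = np.asarray(next_q_values, dtype=float)
    dones = np.asarray(dones, dtype=bool)
    return rewards + gamma * next_q.max(axis=1) * (~dones)


def apply_update(weights, grads, frozen, lr=1e-3):
    """Gradient step that leaves the frozen weight matrices untouched."""
    return [w if is_frozen else w - lr * g
            for w, g, is_frozen in zip(weights, grads, frozen)]


targets = q_learning_targets(rewards=[1.0, 2.0],
                             next_q_values=[[0.0, 1.0], [3.0, 2.0]],
                             dones=[False, True], gamma=0.5)
updated = apply_update([np.ones(2), np.ones(2)],
                       [np.ones(2), np.ones(2)],
                       frozen=[True, False], lr=0.1)
```

The gradients themselves would come from the loss between these targets and the values predicted by the current model, which is omitted here.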
The control method then comprises the repetition of the generation 140A and the training 140B steps until a convergence criterion is met, the model optimized during the last iteration being a model trained for the control of a target electrical microgrid 10C, also called control model.
For example, the convergence criterion is reached when, during a predetermined number of successive iterations, each time a final state is obtained, the current time step Δt which allowed the final state to be obtained corresponds to a predetermined time step (e.g. the last time step of the predetermined values of quantities of electrical energy to be exchanged PNet), and the sum of the rewards Rt obtained for each training datum of the corresponding training set is greater than or equal to a predetermined threshold. Thereby, when the convergence criterion is reached, it is considered that the cost function is minimized.
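This example criterion can be sketched as a check over the most recent iterations. Each iteration is summarized here by a pair (time step at which the final state was obtained, sum of the rewards of the learning set); this representation, the function name and the numeric values are assumptions.

```python
def has_converged(episodes, last_t, reward_threshold, n_required):
    """True when the last `n_required` episodes all ended at the predetermined
    time step with a cumulative reward at or above the threshold."""
    if n_required < 1:
        raise ValueError("n_required must be at least 1")
    if len(episodes) < n_required:
        return False
    recent = episodes[-n_required:]
    return all(final_t == last_t and total_reward >= reward_threshold
               for final_t, total_reward in recent)


# (final time step, sum of rewards) for four successive iterations.
history = [(3, -9.0), (8759, -4.0), (8759, -3.5), (8759, -3.0)]
converged = has_converged(history, last_t=8759, reward_threshold=-5.0, n_required=3)
```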
The control method comprises a phase 150 of use of the control model comprising the determination of a control action At of the target microgrid 10C following the reception, by the control model, of the current state St of the target microgrid 10C.
A person skilled in the art will understand that the control model was first validated in a conventional manner on test data different from the data of the training set, before being used for the effective control of a target microgrid 10C. The validation consists e.g. in the implementation of the generation step 140A with different input data.
The control method comprises a phase 160 of carrying out the action At determined by sending commands to the elements of the target microgrid 10C. The commands are e.g. commands for connecting the elements of the target microgrid 10C to, or disconnecting them from, the electrical energy transmission grid 12 and/or commands for charging, discharging or producing electrical energy. Depending on the case, an action At can also be the absence of commands (corresponding to the action of not doing anything).
Thereby, the control model obtained following the implementation of the present method minimizes the operational costs of the microgrid. Such a model also dispenses with a prediction module. It can thus be easily adapted to all types of microgrid.
Furthermore, such a control model is obtained more quickly since data resulting from the learning of another model are reused. The present method thereby offers the possibility of leveraging the learning carried out by other models for microgrids having different environments and/or operating modes.
The present method is thereby perfectly suited for being implemented in a large number of microgrids since the time required for obtaining an optimized model is significantly reduced.
A person skilled in the art will understand that the embodiments and variants described above can be combined so as to form new embodiments provided that they are technically compatible.
Number | Date | Country | Kind |
---|---|---|---|
FR2014141 | Dec 2020 | FR | national |
The present application is a U.S. National Phase application under 35 U.S.C. § 371 of International Patent Application No. PCT/EP2021/087590 filed Dec. 23, 2021, which claims priority of French Patent Application No. 2014141 filed Dec. 24, 2020, the entire contents of which are hereby incorporated by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2021/087590 | 12/23/2021 | WO |