The invention relates to a method for charging electric vehicles, in particular for at least two electrically driven vehicles of a vehicle fleet entering a charging area, as well as to a computer program, a computer-readable medium, a control unit and a battery charging system.
The invention can be applied in heavy-duty vehicles, such as trucks, buses and construction equipment. Although the invention will be described with respect to a heavy-duty vehicle, the invention is not restricted to this particular vehicle, but may also be used in other vehicles such as a car.
For electrically driven vehicles, for example after a mission, it has to be decided whether or not they are to be charged when they approach a charging area. For example, U.S. 2018290546 A1 describes systems and methods for charging a fleet of electric vehicles, including a control system that may modify the charging scheme for a bus, for instance if the state of charge (SOC) of the bus and/or the distance to the next charging station is such that a charging event cannot be postponed. However, further improvements are needed, especially with respect to a vehicle fleet. In particular, rules have to be established to decide on the charging of more than one vehicle.
An object of the invention is to overcome problems relating to charge scheduling for electric vehicles of a vehicle fleet, in particular problems when there are more vehicles to charge than charging nodes.
According to a first aspect of the invention, the object is achieved by a method according to claim 1. In particular, a method for controlling the charging of at least one electrically driven vehicle of a vehicle fleet comprising a plurality of vehicles is characterized by the steps of:
The invention is based on the recognition that problems may occur if one vehicle blocks a following vehicle from charging. This includes the recognition that this scenario is particularly critical if the distance between the vehicles is small and the second vehicle starts to run out of battery energy. The aim of this invention is therefore to propose how a vehicle entering a charging area shall take a decision that balances its own interests against those of other vehicles. Thus the invention allows a balance to be established between the interests of several vehicles of a vehicle fleet by taking into account not only information on the state of charge of more than one vehicle but additionally information on the distance between two vehicles.
According to one embodiment, the decision not to charge the first vehicle comprises charging the second vehicle. Thus the second vehicle is charged instead of the first vehicle. In further cases, after performing the method with the second and a third vehicle, the decision could be to charge the third vehicle instead of the first or the second vehicle.
Preferably the state of charge information comprises information to
The state of charge information thus, in some embodiments, provides not only information on the state of charge but also on a necessary minimum, which should in any case not be undershot. Thus it is possible to prioritize a vehicle which is on the verge of falling below the minimum or which is already below it.
The received state of charge information influences the decision, for example, as described in the following:
A small difference in space or time between the first and the second vehicle may decrease the likelihood of a charge decision, because the lead vehicle may block the following vehicle from charging. A low state of charge of the first vehicle shall increase the likelihood of a charge decision. If the difference between the states of charge of the first and the second vehicle is positive, this decreases the likelihood of a charge decision, because the following vehicle may be in higher need of charging. A negative value may increase the likelihood of a charge decision, because state of charge balancing is also desired. Balancing here means similar state of charge values between vehicles.
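By way of illustration only, these influences can be combined into a simple heuristic score. This is a hypothetical sketch, not the claimed method; the function name and the weights w_gap, w_soc and w_diff are illustrative assumptions.

```python
def charge_likelihood(soc_a, soc_b, gap, w_gap=0.5, w_soc=1.0, w_diff=0.5):
    """Illustrative score in favour of charging the first vehicle.

    soc_a, soc_b: states of charge of the first/second vehicle in [0, 1].
    gap: normalized gap (in space or time) to the following vehicle in [0, 1].
    """
    score = 0.0
    score += w_gap * gap               # small gap -> lower score (risk of blocking)
    score += w_soc * (1.0 - soc_a)     # low own state of charge -> higher score
    score -= w_diff * (soc_a - soc_b)  # needier follower -> lower score
    return score

# A vehicle close behind with a lower state of charge reduces the score;
# a large gap and a needy lead vehicle increase it.
print(charge_likelihood(soc_a=0.6, soc_b=0.2, gap=0.1))
print(charge_likelihood(soc_a=0.2, soc_b=0.8, gap=0.9))
```

A threshold on such a score would then yield the binary charge/not-charge decision.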
A further influence on the decision can be whether a charging node or a not-charge node is available in the charging area. In particular, if there is no free charging node, this may decrease the likelihood of a charge decision because it would trigger unnecessary waiting. If there is no free not-charge node, this may decrease the likelihood of a not-charge decision because it would trigger unnecessary waiting.
Preferably, the further step of outputting a signal characterizing the charging decision is included, such that the decision can be output to a control unit of a vehicle or of the charging area.
In a preferred embodiment, an iterative self-learning method of improvement for making said charging decision is included. The idea of this embodiment is to formulate an artificial agent that, from a set of states, takes a decision on whether to charge or not. Examples of relevant states are the state of charge information or the distance between the first and the second vehicle. The policy for taking an action is thus based on learning in a virtual or real environment. This learning can be formulated in such a way that, for example, the operating costs are minimized. In addition, it is preferred that the charging decision policy shall be punished if the result is that some vehicle violates a battery charge constraint.
Such a charge constraint can mean that all vehicles shall have a battery charge level above a predefined limit.
Preferably the self-learning method of improvement is defined by a self-learning charging decision algorithm, comprising the steps of
In this embodiment, at each time step, a charging decision is made based on information on state of charge and distance. One time step later, in part as a consequence of the previous decision, a numerical reward is given. Depending on the reward value, the algorithm is then adapted to further improve upcoming decisions.
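The decide–reward–adapt cycle described above can be sketched, purely for illustration, as a tabular Q-learning agent. The discretization, the learning rate alpha, the discount gamma and the epsilon-greedy exploration are illustrative assumptions and not part of the claimed algorithm.

```python
import random

ACTIONS = ("charge", "not_charge")

def discretize(soc_a, delta_soc, gap):
    # Coarse bins for the continuous state (illustrative choice).
    return (round(soc_a, 1), round(delta_soc, 1), round(gap, 1))

class ChargingAgent:
    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = {}  # (state, action) -> estimated value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def decide(self, state):
        # Epsilon-greedy: mostly the best-known action, sometimes explore.
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q.get((state, a), 0.0))

    def adapt(self, state, action, reward, next_state):
        # One time step later the reward arrives; update the value estimate.
        best_next = max(self.q.get((next_state, a), 0.0) for a in ACTIONS)
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old)

agent = ChargingAgent(epsilon=0.0)
s = discretize(0.3, -0.2, 0.5)
a = agent.decide(s)
agent.adapt(s, a, reward=-1.0, next_state=s)
```

After many such decision–reward cycles the stored values steer the agent toward decisions that avoid penalties.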
The self-learning charging algorithm can be trained either in the real world or off-line, especially in a simulation. It is further preferred that the algorithm is first trained in an off-line simulation and subsequently refined in the real world during operation of the fleet, in order to have an initially trained algorithm when starting operation of the algorithm in the vehicles of the fleet.
Preferably, the reward function comprises at least one penalty function for a constraint violation. With the penalty function, constraint violation can be punished, preferably in a way that a constraint violation overrules other parts of the reward function. Thus the self-learning charging decision algorithm is pushed to decisions, which do not result in constraint violation.
In a further embodiment, the penalty function depends on a number of constraint violations of the first vehicle and/or the second vehicle and/or each vehicle of the vehicle fleet. Thus, the more constraint violations are caused by a decision, the higher the penalty can be.
The constraint is preferably a minimum state of charge for the first vehicle and/or the second vehicle and/or each vehicle of the fleet, and/or the constraint is that no charging command should be given if no free charging node is available at the charging area. Thus, for example, the following situations can be regarded as a failure of the fleet operation: at least one of the vehicles has a (too) low state of charge; too low may, for example, be lower than 20%, as the power capability starts to decrease rapidly. Or a vehicle is commanded to charge although the charge node is occupied. The minimum state of charge may be a global minimum for every vehicle of the fleet or a specific minimum, for example, for different types of vehicles or different vehicles.
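A minimal sketch of such a penalty term is given below, assuming a fleet-wide 20% minimum and a penalty weight P chosen large enough to dominate the other reward parts; both values and the function name are illustrative assumptions.

```python
def penalty(socs, free_charge_nodes, action, soc_min=0.2, P=100.0):
    """Illustrative penalty for constraint violations.

    socs: states of charge of all fleet vehicles, each in [0, 1].
    free_charge_nodes: number of unoccupied charging nodes.
    action: the "charge" or "not_charge" command that was issued.
    """
    pen = 0.0
    # Every vehicle below the minimum state of charge counts as one violation.
    pen += P * sum(1 for soc in socs if soc < soc_min)
    # A charging command although no charging node is free is also punished.
    if action == "charge" and free_charge_nodes == 0:
        pen += P
    return pen

# Two vehicles below 20% plus a charge command without a free node:
print(penalty([0.5, 0.15, 0.1], free_charge_nodes=0, action="charge"))  # → 300.0
```

The penalty grows with the number of violations, matching the embodiment above.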
In a further embodiment, the reward function considers an amount of charging energy required for charging the first and/or second vehicle and/or each vehicle of the vehicle fleet, and/or a mission time of the first vehicle and/or the second vehicle and/or each vehicle of the vehicle fleet. This has the advantage that charging energy, and thus charging time, as well as mission time are considered and thus influence further decisions.
Preferably the reward function depends on operating costs, in particular operating costs of the whole fleet. More preferably the operating costs are determined considering charging costs, battery degradation costs, hardware value depreciation costs, societal costs and/or salary costs for the first vehicle and/or charging costs, battery degradation costs, hardware value depreciation costs, societal costs and/or salary costs for the second vehicle. Thus the costs of the vehicle fleet can be optimized by using the self-learning charging decision algorithm.
In further embodiments also a revenue for the first and second vehicle is taken into account in the reward function. The revenue is preferably determined considering a number of moved goods of the first vehicle and/or the second vehicle and/or a value of the moved goods.
It is further preferred that the reward value at a time stamp is defined as the gap in operating costs between that time stamp and a time at or prior to the previous charging decision, minus a penalty.
According to a second aspect of the invention, the object is achieved by a computer program according to claim 14. The computer program comprises program code means for performing the steps of any of the embodiments of the method according to the first aspect of the invention when said program is run on a computer. Furthermore, the object is achieved by the provision of a computer-readable medium carrying a computer program comprising program code means for performing the steps of any of the embodiments of the method according to the first aspect when said program product is run on a computer.
According to a third aspect of the invention, the object of the invention is achieved by a charging control unit according to claim 16. The charging control unit for controlling the charging of at least one electrically driven vehicle of a vehicle fleet comprising a plurality of vehicles is configured to perform the steps of the method according to the first aspect of the invention.
In a further embodiment the charging control unit is a centralized control unit for all vehicles of the vehicle fleet.
According to a fourth aspect, the invention relates to a battery charging system according to claim 18, which comprises:
Preferably the charging control unit and/or the charging area control unit and/or the vehicle control unit are connected with one another for communication.
In an embodiment of the battery charging system the charging control unit is integrated in the charging area control unit or in the vehicle control unit.
Further advantages and advantageous features of the invention are disclosed in the following description and in the dependent claims.
With reference to the appended drawings, below follows a more detailed description of embodiments of the invention cited as examples.
In the drawings:
The reward function executed in step S4 in order to determine the reward value comprises, in this embodiment, two penalty functions for constraint violations. The first penalty function is a penalty function for a first constraint, namely a minimum state of charge for each vehicle of the fleet. For example, the minimum state of charge may be 20%. The second penalty function is a penalty function for a second constraint, namely that no charging command should be given if no free charging node is available at the charging area. Another possibility is a penalty function for a constraint that no not-charge command should be given if no free not-charge node is available at the charging area.
The reward function in this embodiment further depends on operating costs. The operating costs are determined considering charging costs, battery degradation costs, hardware value depreciation costs, societal costs and/or salary costs for the first vehicle, and/or charging costs, battery degradation costs, hardware value depreciation costs, societal costs and/or salary costs for the second vehicle. Lower costs are preferred. The reward function thus punishes constraint violations and rewards low costs.
In the following, a further example of a method including a self-learning charging decision algorithm is described schematically.
In the following the learner and decision-maker is called the agent. The thing it interacts with, comprising everything outside the agent, is called the environment.
More specifically, the agent and environment interact at each of a sequence of discrete time steps, t = 0, 1, 2, 3, …. At each time step t, the agent receives some representation of the environment's state, St, and on that basis selects an action, At ∈ A(St), where A(St) is the set of actions available in state St. One time step later, in part as a consequence of its action, the agent receives a numerical reward, Rt+1 ∈ ℝ, and finds itself in a new state, St+1. According to the invention, self-learning is used with the ambition of avoiding operation failure, e.g. some vehicle having a too low state of charge (SoC).
The state space used is

s = (SoCa, ΔSoC, rg)   (1)

where

ΔSoC = SoCa − SoCb   (2)

SoCa is the state of charge of the first vehicle, and SoCb is the state of charge of the second vehicle. rg denotes the relative gap (in time or space) between the first and the second vehicle.
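The state of equations (1) and (2) can be assembled directly; the following is a minimal sketch, and the function name is an illustrative assumption.

```python
def make_state(soc_a, soc_b, rel_gap):
    # Equation (2): difference between the two states of charge.
    delta_soc = soc_a - soc_b
    # Equation (1): state vector s = (SoCa, dSoC, rg).
    return (soc_a, delta_soc, rel_gap)

# First vehicle at 75%, second at 50%, gap of 12 (time or space units):
print(make_state(0.75, 0.5, 12.0))  # → (0.75, 0.25, 12.0)
```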
There are two actions, which can be applied after the decision:
During an episode, actions will be requested or taken at specific time instants. These are denoted ti. These time events also correspond to when the learning agent will receive reward feedback. The reward for an action at time ti will be returned at time ti+1. The reward is defined as
Ri+1 = −peni+1   (4)

where

peni+1 = P · socfail   (5)

socfail = (min(…
In this embodiment the reward is solely based on the penalty function for violating a constraint for a minimum state of charge. If any vehicle of the fleet infringes the constraint, the reward is negative.
Another embodiment considers also the costs besides the penalty function:
Ri+1 = −Δcoper − peni+1   (7)

where

Δcoper = coper(ti+1) − coper(ti)   (8)

This is the change of the total cost of running the fleet. If no vehicle is charged between ti and ti+1, Δcoper is zero.
coper = celec · Echarge + cchrent · tcharge   (9)

This is the total cost of running the fleet, taking into account the charging costs and the rental cost for a charging slot. It increases as soon as some vehicle is charged; for example, coper is increased only after a charging event has finished. Initially, the charging slot rental cost cchrent can be set to zero. Furthermore, a term for battery degradation can be added.
peni+1 = P · socfail   (10)

socfail = (min(…
With this embodiment also costs are taken into account in order to improve the charging decision.
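The cost-based reward of equations (7) to (9) can be sketched as follows; the electricity price and the zero slot-rental cost are illustrative assumptions, as are the function names.

```python
def operating_cost(e_charge, t_charge, c_elec=0.25, c_chrent=0.0):
    # Equation (9): electricity cost plus charging-slot rental cost.
    # c_chrent is initially set to zero, as suggested in the text.
    return c_elec * e_charge + c_chrent * t_charge

def reward(cost_prev, cost_now, pen):
    # Equation (7): negative cost increase minus the penalty.
    return -(cost_now - cost_prev) - pen

c0 = operating_cost(e_charge=0.0, t_charge=0.0)
c1 = operating_cost(e_charge=40.0, t_charge=0.5)  # 40 kWh charged
print(reward(c0, c1, pen=0.0))  # → -10.0
```

Charging thus lowers the reward through its cost, while a constraint violation lowers it further through the penalty term of equation (10).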
Training of the agent, in other words training of the algorithm, preferably starts with training from an off-line simulation in order to have an initially trained algorithm when starting operation of the algorithm in the vehicles of the fleet. The training of the algorithm is then continued in the real world during operation.
The charging control unit 300 in the shown embodiment is furthermore able to perform a self-learning method of improvement, which is defined by a self-learning charging decision algorithm comprising the steps of: making a charging decision; determining a reward value at a time stamp after making the charging decision, based on the charging decision, by executing a reward function; adapting the self-learning charging decision algorithm depending on the determined reward value; and subsequently applying the adapted self-learning charging decision algorithm in a subsequent making of the charging decision, related to the same vehicles or to different vehicles. The reward function preferably considers the total costs of operating the fleet as well as a penalty function for a constraint violation if one vehicle of the fleet has a state of charge lower than a defined minimum state of charge for all vehicles of the fleet.
It is to be understood that the present invention is not limited to the embodiments described above and illustrated in the drawings; rather, the skilled person will recognize that many changes and modifications may be made within the scope of the appended claims.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2019/062989 | 5/20/2019 | WO |