DYNAMIC MULTIPLE BI-DIRECTIONAL SUPPLY AND DEMAND MATCHING FOR EV CHARGING

Information

  • Patent Application
  • 20240059170
  • Publication Number
    20240059170
  • Date Filed
    August 16, 2022
  • Date Published
    February 22, 2024
Abstract
A method for dynamically matching energy demand of a population of electric vehicles (EVs) with energy supply of a population of charging stations within a geofenced perimeter includes receiving, via a cloud-based server, EV information from each respective EV, and receiving charging station information from each respective charging station. The method includes generating an SoC map and a charging station power map from the EV information and the charging station information, respectively, and predicting the energy supply and demand using the maps. The server dynamically matches the EVs to at least one of the charging stations or vice versa using a reward function, the predicted energy supply, and the predicted energy demand, including generating a rank-ordered listing for each of the EVs and/or each of the charging stations in a manner that maximizes an expected discounted future reward.
Description
INTRODUCTION

Electrochemical battery cells and battery packs constructed from such battery cells are used as direct current (DC) power supplies in a myriad of high-power battery electric systems. An electric vehicle is an exemplary type of battery electric system using a high-voltage propulsion battery pack constructed from an application-suitable number of cylindrical, prismatic, or pouch-style battery cells. The battery pack, which is connected to a DC voltage bus, ultimately powers one or more electric propulsion motors and associated power electronic components during battery discharging modes. During battery charging modes, a charging current is provided to the constituent battery cells of the battery pack from an offboard charging station. Thus, the charging process, when performed away from a user's home charging station, often entails locating an available charging station and scheduling a charging time.


While charging infrastructure continues to grow and evolve, drivers of electric vehicles face potential uncertainty regarding actual availability of a given charging station at a particular desired location or charging time. In order to reduce range anxiety, a driver may use an application (“app”) to help locate charging stations along a planned travel route, and to schedule a charging session at intervals along the way. However, existing “one-to-one” approaches for scheduling charging sessions, i.e., one electric vehicle seeking an open charging slot at a given charging station, may be suboptimal in terms of balancing energy supply and demand over a wider geographical area.


SUMMARY

Disclosed herein are cloud-based methods and systems for performing dynamic optimal matching of electric vehicle (“EV”) charging demand (“EV demand”) with electric vehicle supply equipment (EVSE) charging supply levels (“EVSE supply”) within a geofenced area of interest (“AOI”). In particular, the computer-implemented solutions described in detail below provide for optimized bidirectional matching of multiple EV demands with available EVSE supplies within the geofenced AOI. Using the disclosed cloud modeling strategies, for instance, a driver of an EV may benefit from dynamic supply prediction over many different charging stations, and dynamic demand prediction over many different EVs. Dynamic bidirectional matching of multiple charging stations to multiple demands (“many-to-many”), as opposed to the aforementioned “one-to-one” matching, is likewise enabled using a reinforcement learning approach, e.g., using a state-of-charge (SoC) map and a charging station power map in a possible embodiment. The present teachings may enable a recommendation engine and automatic enabler for charging spot bookings, pre-conditioning activation, adaptive routing, charging station scheduling, load balancing, power sharing, and dynamic pricing, among other possible attendant benefits.


In a possible implementation, a method for dynamically matching an energy demand of a population of EVs with an energy supply of a population of charging stations within a geofenced perimeter includes receiving, via a cloud-based server, a set of EV information from each respective one of the EVs, and receiving, via the cloud-based server, a set of charging station information from each respective one of the charging stations. The method further includes generating a state of charge (SoC) map and a charging station power map from the set of EV information and the set of charging station information, respectively, via the cloud-based server. In this embodiment, the method also includes predicting the energy supply and the energy demand as a predicted energy supply and a predicted energy demand, respectively, via the cloud-based server using the SoC map and the charging station power map. The EVs are then dynamically matched to at least one of the charging stations using a reward function, the predicted energy supply, and the predicted energy demand, including generating a rank-ordered listing for each respective one of the EVs and/or each respective one of the charging stations in a manner that maximizes an expected discounted future reward.


The geofenced perimeter may be a dynamically-adjustable polygon, with the method including using a pointer network to define the geofenced perimeter as a convex hull or minimum convex polygon.


Dynamically matching the EVs to at least one of the charging stations may include generating a table of ranked bidirectional matches, the table having a plurality of rewards including an estimated time of arrival (ETA) and an estimated time to charge (ETC) at a respective one of the charging stations. The plurality of rewards may include one or more of an estimated cost of charge (ECC), an estimated expected charge (EEC), or an estimated charging station profit (SCP).


In some aspects of the disclosure, dynamically matching the EVs to at least one of the charging stations includes calculating a combined reward value as a weighted function of the rewards, and maximizing the combined reward value across an area defined by the geofenced perimeter, e.g., using Pareto optimization. Dynamically matching the EVs to at least one of the charging stations may alternatively include performing a temporal difference (TD) model-free on-policy learning algorithm, e.g., asynchronous Advantage Actor-Critic (A3C).


An aspect of the disclosure includes receiving, via the cloud-based server, contextual information characterizing or describing a vehicle type, a battery temperature, a battery age, a charging type, and local weather conditions for each respective one of the EVs, and a charging type and local weather conditions for each respective one of the charging stations.


Also disclosed herein is a cloud-based server having a processor and a computer-readable storage medium on which are recorded instructions executable by the processor. Execution of the instructions causes the processor to dynamically match an energy demand of a population of EVs with an energy supply of a population of charging stations within a geofenced perimeter by receiving a set of EV information from each respective one of the EVs, receiving a set of charging station information from each respective one of the charging stations, and generating a state of charge (SoC) map and a charging station power map from the set of EV information and the set of charging station information, respectively.


Additionally, execution of the instructions causes the processor to predict the energy supply and the energy demand as a predicted energy supply and a predicted energy demand, respectively, via the cloud-based server using the SoC map and the charging station power map, and to dynamically match the EVs to at least one of the charging stations using a reward function, the predicted energy supply, and the predicted energy demand. This includes generating a rank-ordered listing for each respective one of the EVs and/or each respective one of the charging stations in a manner that maximizes an expected discounted future reward.


Another aspect of the disclosure includes a method for dynamically matching the energy demand of the population of EVs with an energy supply of the population of charging stations within a geofenced perimeter. This embodiment of the method includes receiving, via a cloud-based server, a set of EV information from each respective one of the EVs, and receiving, via the cloud-based server, a set of charging station information from each respective one of the charging stations. The method further includes using a pointer network to define the geofenced perimeter as a dynamically-adjustable polygon, e.g., as a convex hull or minimum convex polygon, generating an SoC map and a charging station power map from the set of EV information and the set of charging station information, respectively, via the cloud-based server, and predicting the energy supply and the energy demand as a predicted energy supply and a predicted energy demand, respectively, via the cloud-based server using the SoC map and the charging station power map.


Additionally, the method in this embodiment includes dynamically matching the EVs to at least one of the charging stations using a reward function, the predicted energy supply, and the predicted energy demand, including generating a rank-ordered listing for each respective one of the EVs and/or each respective one of the charging stations as a table of ranked bidirectional matches in a manner that maximizes an expected discounted future reward, the table having a plurality of rewards including an estimated time of arrival (ETA) and an estimated time to charge (ETC) at a respective one of the charging stations and one or more of an estimated cost of charge (ECC), an estimated expected charge (EEC), or an estimated charging station profit (SCP).


The above features and advantages, and other features and attendant advantages of this disclosure, will be readily apparent from the following detailed description of illustrative examples and modes for carrying out the present disclosure when taken in connection with the accompanying drawings and the appended claims. Moreover, this disclosure expressly includes combinations and sub-combinations of the elements and features presented above and below.





BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated into and constitute a part of this specification, illustrate implementations of the disclosure which, taken together with the description, serve to explain the principles of the disclosure.



FIG. 1 illustrates a geofenced area of interest including a plurality of electric vehicles (EVs) and electric vehicle supply equipment (EVSE) charging stations, and a cloud-based server in communication with the EVs and EVSE charging stations in accordance with the disclosure.



FIG. 2 is a flow chart describing a computer-implementable method for dynamic matching of energy supply and demand in the geofenced area of interest of FIG. 1.



FIGS. 3 and 4 are representative matching tables describing outputs of the method of FIG. 2.



FIG. 5 depicts a representative dynamic assignment process that is usable within the context of the present method.





The appended drawings are not necessarily to scale, and may present a somewhat simplified representation of various preferred features of the present disclosure as disclosed herein, including, for example, specific dimensions, orientations, locations, and shapes. Details associated with such features will be determined in part by the particular intended application and use environment.


DETAILED DESCRIPTION

The present disclosure is susceptible of embodiment in many different forms. Representative examples of the disclosure are shown in the drawings and described herein in detail as non-limiting examples of the disclosed principles. To that end, elements and limitations described in the Abstract, Introduction, Summary, and Detailed Description sections, but not explicitly set forth in the claims, should not be incorporated into the claims, singly or collectively, by implication, inference, or otherwise.


For purposes of the present description, unless specifically disclaimed, use of the singular includes the plural and vice versa, the terms “and” and “or” shall be both conjunctive and disjunctive, and the words “including,” “containing,” “comprising,” “having,” and the like shall mean “including without limitation.” Moreover, words of approximation such as “about,” “almost,” “substantially,” “generally,” “approximately,” etc., may be used herein in the sense of “at, near, or nearly at,” or “within 0-5% of,” or “within acceptable manufacturing tolerances,” or logical combinations thereof. As used herein, a component that is “configured to” perform a specified function is capable of performing the specified function without alteration, rather than merely having potential to perform the specified function after further modification. In other words, the described hardware, when expressly configured to perform the specified function, is specifically selected, created, implemented, utilized, programmed, and/or designed for the purpose of performing the specified function.


Referring to the drawings, wherein like reference numbers refer to like features throughout the several views, FIG. 1 depicts a plurality of electric vehicles (EVs) 10, collectively indicated as EVs (A), and a plurality of offboard electric vehicle supply equipment (EVSE) charging stations 12, collectively indicated as charging stations (B). Each respective EV 10 is in remote communication with a cloud-based server (CS) 50 (see FIG. 2), e.g., via an onboard telematic system 11, and is thus configured to participate in a cloud-based energy supply and demand matching methodology in accordance with the disclosure. Such matching occurs within a geofenced perimeter, which in turn may be (i) a fixed static geofenced perimeter 14 or (ii) an elastic and dynamically-adjustable geofenced perimeter 140. Use of the cloud-based server 50 of FIG. 2, e.g., through a software application (“app”) or a web-based tool, allows various users to optimally match a charging demand of a plurality of the EVs 10 with available power supplies of a plurality of charging stations 12 located within the geofenced perimeter 14, 140.


As appreciated in the art, energy supply and demand are not static parameters in the context of day-to-day vehicular operations. Rather, supply and demand vary in real-time based on factors such as time of day and the particular number of EVs 10 requiring use of a given charging station 12. The present strategy thus relies on two-way communication between the cloud-based server 50 and the various EVs 10 and charging stations 12 to optimally match supply and demand in real time. Applicable boundary constraints may be applied to effectively limit consideration and computing resources to evaluation of an area defined by the geofenced perimeter 14, 140.
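A dynamically adjustable perimeter such as the geofenced perimeter 140 is characterized elsewhere in this disclosure as a convex hull or minimum convex polygon. As a minimal illustrative sketch, and not the pointer-network approach of the disclosure, the following Python example computes such a hull with the classical Andrew monotone chain algorithm. The planar treatment of coordinates, the function names, and the sample positions are assumptions made for illustration only.

```python
from typing import List, Tuple

# Assumption: positions are treated as planar (x, y) pairs, which is only
# reasonable for a small geofenced area.
Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Return hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Drop the last vertex while it would create a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


# Hypothetical positions of reporting EVs and charging stations.
positions = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (2.0, 1.0), (1.0, 2.0)]
perimeter = convex_hull(positions)
print(perimeter)  # [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
```

Recomputing the hull whenever the set of reporting EVs and charging stations changes would give one simple form of an elastic perimeter; interior points such as (2.0, 1.0) do not affect the boundary.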


As part of the present methodology, the various EVs 10 located within the geofenced perimeter 14, 140 provide signal inputs to the cloud-based server 50 of FIG. 2 in the form of telematics data, including for example the present location of the EV 10 and a present state of charge (SoC) of a resident battery pack 13. Other inputs to the cloud-based server 50 include information from infrastructure partners within the geofenced perimeter 14, 140, principally the availability and locations of the various charging stations 12, charging capacity of each respective one of the charging stations 12 in terms of available charging power, charging connector types such as SAE J1772, CCS, CHAdeMO, etc., and scheduling information indicative of current and future charging spot availability.


In the performance of the present method, the cloud-based server 50 shown in FIG. 2 includes one or more electronic control units equipped with one or more processors (P), e.g., logic circuits, combinational logic circuit(s), application specific integrated circuit(s), electronic circuit(s), central processing unit(s), semiconductor IC devices, etc., as well as input/output circuit(s), appropriate signal conditioning and buffer circuitry, and other components such as a high-speed clock to provide the described functionality. The cloud-based server 50 also includes an associated tangible computer-readable storage medium, collectively referred to as memory (M) for simplicity, inclusive of read only, programmable read only, random access, a hard drive, etc., whether local, remote or a combination of both. Likewise, to facilitate communications between the cloud-based server 50 and the individual EVs 10 and charging stations 12, the various nodes collectively used to establish the cloud-based server 50 and the various EVs 10 are equipped with or in communication with radio transceivers operable for remotely transmitting and receiving information when performing the present method.


Referring to FIG. 2, the method 100 for dynamically matching an energy demand of a population or fleet of the EVs 10 with an energy supply of a population of charging stations 12 within the geofenced perimeter 14, 140 of FIG. 1 commences at block B102 with receiving, via the cloud-based server 50, a set of EV information from each respective one of the EVs 10, and then generating an EV SoC map (“EV SoC-M”) using the received EV information. That is, each individual EV 10 shown in FIG. 1 uploads its current SoC to the cloud-based server 50 of FIG. 2, e.g., continuously or at regularly scheduled intervals. The uploaded information includes the current GPS location of the EV 10 and its present SoC, with such information being time-stamped to maintain an accurate data record.


As part of block B102, the cloud-based server 50 receives the various SoCs and constructs the noted SoC map. As contemplated herein, the SoC map may be a data file of current locations and SoCs of the EVs 10 located within or traveling through the geofenced perimeter 14, 140 of FIG. 1. In other words, the cloud-based server 50 of FIG. 2 is made aware of the current SoCs of the various EVs 10, and thus the potential total energy demand within an area demarcated by the geofenced perimeter 14, 140 of FIG. 1. The method 100 proceeds to block B104 after constructing the EV SoC map.
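One possible in-memory form of such an SoC map is sketched below in Python, assuming each upload is a time-stamped record and that only the most recent record per EV is retained. The record fields, the battery capacity value, and the 80% target SoC used to estimate demand are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class EVReport:
    ev_id: str           # hypothetical identifier
    timestamp: float     # time stamp of the upload, in seconds
    lat: float           # GPS latitude
    lon: float           # GPS longitude
    soc: float           # state of charge as a fraction in [0, 1]
    capacity_kwh: float  # assumed field, used only to convert SoC to energy


def build_soc_map(reports: Iterable[EVReport]) -> Dict[str, EVReport]:
    """Keep only the most recent time-stamped report for each EV."""
    latest: Dict[str, EVReport] = {}
    for report in reports:
        previous = latest.get(report.ev_id)
        if previous is None or report.timestamp > previous.timestamp:
            latest[report.ev_id] = report
    return latest


def potential_demand_kwh(soc_map: Dict[str, EVReport], target_soc: float = 0.8) -> float:
    """Energy needed to bring every EV up to target_soc (an assumed target)."""
    return sum(
        max(0.0, target_soc - r.soc) * r.capacity_kwh for r in soc_map.values()
    )


uploads = [
    EVReport("EV-1", 100.0, 42.33, -83.05, 0.5, 60.0),
    EVReport("EV-1", 200.0, 42.34, -83.04, 0.4, 60.0),  # newer report wins
    EVReport("EV-2", 150.0, 42.30, -83.10, 0.9, 80.0),  # already above target
]
soc_map = build_soc_map(uploads)
demand = potential_demand_kwh(soc_map)
```

With these sample uploads, only EV-1 contributes demand, since EV-2 already exceeds the assumed target.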


Block B103 includes receiving, via the cloud-based server 50, a set of charging station information from each respective one of the charging stations 12 and generating an EVSE charging station power map (“CS-M”) using the charging station information, as a counterpart to the above-described EV SoC map constructed in block B102. That is, the various charging stations 12 of FIG. 1 within the geofenced perimeter 14, 140 report their respective GPS location, current load (specific chargers in use), relative station use level (e.g., not busy, normally busy, very busy), predicted time slot availability, and estimated time to charge. The information in this charging station power map is communicated to a dynamic matching node (“DMM”) of the cloud-based server 50 for further processing at block B106.
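As a counterpart sketch, the charging station power map could be held as a mapping from station to currently available charging power. The Python example below assumes each station reports a rated power and an in-use flag per charger; these fields, the station names, and the power values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class StationReport:
    station_id: str          # hypothetical identifier
    lat: float               # GPS latitude
    lon: float               # GPS longitude
    charger_kw: List[float]  # rated power of each charger (assumed field)
    in_use: List[bool]       # True where the corresponding charger is occupied


def available_power_kw(report: StationReport) -> float:
    """Sum the rated power of the chargers that are not currently in use."""
    return sum(
        kw for kw, busy in zip(report.charger_kw, report.in_use) if not busy
    )


def build_power_map(reports: List[StationReport]) -> Dict[str, float]:
    """Map each station to its currently available charging power in kW."""
    return {r.station_id: available_power_kw(r) for r in reports}


station_reports = [
    StationReport("CS-1", 42.33, -83.05, [150.0, 150.0, 50.0], [True, False, False]),
    StationReport("CS-2", 42.31, -83.08, [50.0, 50.0], [True, True]),
]
power_map = build_power_map(station_reports)
print(power_map)  # {'CS-1': 200.0, 'CS-2': 0.0}
```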


At block B104, a demand prediction model (“DPM”) of the cloud-based server 50 may receive contextual information from one or more available sources 52. As contemplated herein, “contextual information” may include weather conditions, vehicle type (manufacturer, model, year, etc.), battery temperature and age, and/or charging system types and ages. Such data may be provided by the individual EVs 10 of FIG. 1 as part of the uploaded data noted above. Additionally, block B104 includes receiving the above-described charging station power map from block B103.


The method 100 as contemplated herein includes predicting the energy supply and the energy demand within the geofenced area 14, 140 via the cloud-based server 50 using the SoC map and the charging station power map. Demand prediction occurs for the area defined by the geofenced perimeter 14, 140, e.g., using machine learning as exemplified below. Outputs from block B104 may include a predicted number of charging stations 12 needed at a future time point, i.e., Ct: t+T, and a predicted charging demand at the future time point, i.e., Dt: t+T, as described in further detail below.


Block B106 of FIG. 2 includes dynamically matching a given EV 10 of FIG. 1 to a list of charging stations 12 based on the predicted values from block B104. This action is performed using the predicted energy demand from block B104 and the predicted energy supply from block B103, with possible implementations for dynamic matching set forth below with reference to FIG. 4. Outputs from block B106 may include ranked recommended bi-directional matches for a given EV 10, which may occur in a customized manner based on a personalized user profile in one or more embodiments. Multi-objective optimization such as utility maximization, e.g., using a weighting scheme or preference information, and/or a tradeoff Pareto optimization, may also be used to perform the functions of block B106 as set forth below.


As shown in FIG. 3, an exemplary output of block B106 is a matched results table 20 (“EV-2-CS”) in which the EVs 10 of FIG. 1, nominally EV-1, EV-2, . . . , EV-n, are matched to one or more of the available charging stations 12, nominally CS-1, CS-2, . . . , CS-6 in this particular non-limiting example. Also included in the matched results table 20 is an estimated time of arrival (ETA) at the particular charging station 12 stated in minutes (min), an estimated time to charge (ETC) likewise stated in minutes, and an estimated cost to charge (ECC) stated in U.S. dollars ($). Thus, a driver of an EV 10 nominally identified as “EV-1” might value arrival time and charging time over charging cost, and thus may prefer use of the charging station 12 identified as CS-1, while a cost-conscious operator unconcerned with charging time would likely opt for charging station CS-5. Moreover, an app-based approach could automatically decide for the user based on previously entered or machine-learned behavior, with the displayed order in FIG. 3 being adjusted based on such a personalized cloud profile. For example, a user might routinely select lowest cost, while another user, perhaps of the same EV 10, might prefer shortest charging duration, and thus the displayed order could change based on the identity of the user.



FIG. 4 depicts another matched results table 200 (“CS-2-EV”) in which a given charging station 12 is matched to a given EV 10, with the charging station 12 being the recipient of the matched results table 200 in this instance. As FIG. 4 is from the perspective of the charging station 12, i.e., the energy supply side within the geofenced perimeter 14, 140, ETA and ETC are retained from FIG. 3, but ECC is replaced with estimated expected charge (EEC) in kilowatt hours (kWh) and expected charging station profit (SCP), again stated in U.S. dollars ($) in this non-limiting example. As with FIG. 3, the matched results table 200 of FIG. 4 provides a rank-ordered recommendation to the various charging stations as to how to prioritize prospective EV charging events based on a multi-objective optimization.


Turning back to FIG. 2, block B106 in particular performs the above bi-directional matching function in the course of performing the present method 100. The method 100 thus includes dynamically assigning or matching the EVs 10 to at least one of the charging stations 12 using a reward function, the predicted energy supply, and the predicted energy demand. An action is then executed by the cloud-based server 50, possibly including generating a rank-ordered listing for each respective one of the EVs 10 and/or each respective one of the charging stations 12 in a manner that maximizes an expected discounted future reward.


In general, block B106 of FIG. 2 receives a set of observations, denoted hereinbelow as EV states (vt) and charging station states (ct) for simplicity. Block B106 then performs an action (at), in this case by ranking recommended bi-directional matches, with two examples shown in FIGS. 3 and 4. Block B106 likewise outputs a reward (rt), e.g., in terms of ETA, ETC, ECC, EEC, and/or SCP, as likewise depicted in FIGS. 3 and 4. The problem solved by the cloud-based server 50 in the course of executing the method 100 is thus one of maximizing an optimization function having multiple objectives, which may occur in some embodiments via reinforcement learning.


Data-driven Optimal Demand-Supply Matching for EV Charging: a possible approach for performing block B106 of FIG. 2 defines state variables as follows:

    • vt={vt,1, vt,2, . . . , vt,N}: the EV status at time t, where vt,i is a vector which consists of the current GPS location (latitude and longitude) of a given EV 10, i.e., EV(i), and its SoC;
    • ct={ct,1, ct,2, . . . , ct,M}: the charging station 12 status at time (t), where ct,k is a vector which consists of the GPS location (latitude and longitude) of the charging station (k), current load (chargers in use), crowd level (not busy, busy, too busy), predicted time slot availability, and estimated time to charge. Given a set of charging requests at time (t), the cloud-based server 50 can predict the number of charging stations Ct: t+T that will be needed in each geofenced area for T time slots ahead.
    • Dt: t+T={dt, . . . , dt+T}: the predicted future charging demand from time t to t+T in each geofenced perimeter 14, 140.
    • st={Vt, Ct: t+T, Dt: t+T}: the environment state at time (t).


An objective of the cloud-based server 50 is to efficiently assign an EV 10 of FIG. 1 to a charging station 12 within a certain geofenced AOI. To do this, the cloud-based server 50 considers ETA, ETC, ECC, EEC, and SCP as competing objectives, with the constraint of “satisfy all charging demands” of the various EVs 10 located within the geofenced area. Thus, a combined reward value (rt) across an area defined by the geofenced perimeter 14, 140 may be expressed mathematically as a weighted function of the rewards:






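$$ r_t = \omega_{scp} \times \mathrm{SCP} - \omega_{eta} \times \mathrm{ETA} - \omega_{etc} \times \mathrm{ETC} - \omega_{ecc} \times \mathrm{ECC} $$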


with the various weighting terms ωscp, ωeta, ωetc, and ωecc reflecting a level of importance of each criterion, summing to 1 without a loss of generality.
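The weighted combination can be illustrated with a short Python sketch that scores candidate charging stations for one EV and ranks them by combined reward. The candidate values and the two weight profiles are hypothetical, and the raw mixing of minutes and dollars is a simplification; a practical implementation would presumably normalize each term first.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Candidate:
    station: str
    eta_min: float  # estimated time of arrival (ETA), minutes
    etc_min: float  # estimated time to charge (ETC), minutes
    ecc_usd: float  # estimated cost to charge (ECC), U.S. dollars
    scp_usd: float  # estimated charging station profit (SCP), U.S. dollars


def combined_reward(c: Candidate, w: Dict[str, float]) -> float:
    """r_t = w_scp*SCP - w_eta*ETA - w_etc*ETC - w_ecc*ECC."""
    return (
        w["scp"] * c.scp_usd
        - w["eta"] * c.eta_min
        - w["etc"] * c.etc_min
        - w["ecc"] * c.ecc_usd
    )


def rank(candidates: List[Candidate], w: Dict[str, float]) -> List[str]:
    """Return station names ordered from highest to lowest combined reward."""
    if abs(sum(w.values()) - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    ordered = sorted(candidates, key=lambda c: combined_reward(c, w), reverse=True)
    return [c.station for c in ordered]


# Hypothetical candidates and weight profiles, for illustration only.
candidates = [
    Candidate("CS-1", eta_min=5.0, etc_min=20.0, ecc_usd=15.0, scp_usd=4.0),
    Candidate("CS-5", eta_min=15.0, etc_min=45.0, ecc_usd=8.0, scp_usd=2.0),
]
time_focused = {"scp": 0.1, "eta": 0.4, "etc": 0.4, "ecc": 0.1}
cost_focused = {"scp": 0.1, "eta": 0.05, "etc": 0.05, "ecc": 0.8}

print(rank(candidates, time_focused))  # ['CS-1', 'CS-5']
print(rank(candidates, cost_focused))  # ['CS-5', 'CS-1']
```

Changing only the weights reverses the order, which is how a personalized profile could alter the displayed ranking for the same set of stations.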


Due to the conflicting nature between objective functions, duality is used to convert minimization (ETA, ETC, and ECC) into maximization to form the overall reward. At each time step (t), that is, the dynamic matching function performed at block B106 obtains a representation of the environment, st, and a reward, rt. Based on this information, an action at is taken to direct EVs 10 to charging stations 12 such that the expected discounted future reward is maximized, i.e.:






$$ \max_{a_t} \sum_{j=1} \gamma^{j-1} \, r_j(a_t, s_t) $$







where 0<γ<1 is a time discounting factor providing a penalty to uncertainty of future rewards. The cloud-based server 50 may then maximize the expected discounted future reward. This preference-based multi-objective optimization procedure involves having a preference vector or a weighting scheme that is highly subjective. Pareto optimization as appreciated in the art may also be used to find multiple tradeoff solutions and choose one using higher-level information from the user.
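As a brief illustration of the Pareto alternative, the following Python sketch filters a set of candidate matches down to the non-dominated ones, treating ETA, ETC, and ECC as quantities to be minimized. The candidate tuples are hypothetical, and the brute-force pairwise comparison is chosen for clarity rather than efficiency.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly better in one.

    All objectives are assumed to be minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(options: List[Sequence[float]]) -> List[int]:
    """Return indices of the options that no other option dominates."""
    return [
        i
        for i, a in enumerate(options)
        if not any(dominates(b, a) for j, b in enumerate(options) if j != i)
    ]


# Hypothetical (ETA min, ETC min, ECC $) tuples for four candidate stations.
options = [
    (5.0, 20.0, 15.0),   # fast but expensive
    (15.0, 45.0, 8.0),   # slow but cheap
    (10.0, 30.0, 16.0),  # dominated by the first option
    (6.0, 25.0, 15.0),   # dominated by the first option
]
print(pareto_front(options))  # [0, 1]
```

The surviving options are the tradeoff solutions among which higher-level user information, such as a stated preference for low cost, would then select.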


Referring briefly to FIG. 5, the method 100 of the present disclosure may use a temporal difference (TD) model-free on-policy learning algorithm such as asynchronous Advantage Actor-Critic (A3C) to learn an optimal action (At) to take given a current state (Vt) in such a way that the reward is maximized. Dynamic assignment (DYN ASMT) occurs at block B206 to produce the action At, which is analogous to block B106 as described above. As appreciated in the art, TD learning refers to a class of model-free reinforcement learning in which a machine “learns” by bootstrapping from a current estimate of a given value function. Such a technique may include a global actor-critic neural network 30 along with many local actor-critic networks, with an actor network 32 and a critic network 34 shown in FIG. 5 for simplicity. In such a network 30, the “actor” nodes 320 decide on a particular action to take, while the “critic” nodes 340 inform the actor nodes 320 how optimal or suitable the action was, and how to adjust. In this framework, the “critic” nodes 340 learn the value function, i.e., the environmental state (st) at time (t), while multiple “actor” nodes 320 are trained in parallel, in this case by observing the environmental state (st), and are synchronized with global parameters from time to time.


Pointer Networks (Ptr-Net)-based Convex Hull Dynamic Geofence: the present teachings may be applied to the fixed static geofenced perimeter 14 of FIG. 1 or the elastic dynamic geofenced perimeter 140. For instance, the method 100 described herein may include using a pointer network to define the geofenced perimeter 140 as a convex hull or minimum convex polygon as shown. Some implementations may use a neural network to include all charging stations 12 and EVs 10 within the polygon, keeping reported EVs 10 and charging stations 12 within the geofence perimeter 140. As appreciated by those skilled in the art, such a pointer network may be defined as follows:


Input: a sequence of vectors x=(x1, . . . , xn)


Ptr-Net Outputs: a sequence of integer indices, c=(c1, . . . , cm) and 1≤ci≤n.


Encoder hidden states: (h1, . . . , hn)


Decoder hidden states: (s1, . . . , sm) where si is the output gate after cell activation in the decoder.


The Ptr-Net applies additive attention between states and then normalizes by the softmax function to model the output conditional probability:






$$ y_i = p(c_i \mid c_1, \ldots, c_{i-1}, x) = \mathrm{softmax}(\mathrm{score}(s_t, h_i)) = \mathrm{softmax}\left(v_a^{T} \tanh(W_a [s_t; h_i])\right) $$
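The additive attention step can be sketched in plain Python as follows. The sketch computes only the pointer distribution over encoder states for one decoder state; the encoder and decoder networks themselves are omitted, and the dimensions and numeric values of W_a, v_a, s_t, and the h_i vectors are hypothetical placeholders rather than trained parameters.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(W: Matrix, x: Vector) -> Vector:
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def pointer_distribution(s_t: Vector, H: List[Vector], W_a: Matrix, v_a: Vector) -> List[float]:
    """softmax over i of v_a^T tanh(W_a [s_t; h_i]) for encoder states h_i in H."""
    scores = []
    for h_i in H:
        hidden = [math.tanh(z) for z in matvec(W_a, s_t + h_i)]  # s_t + h_i concatenates
        scores.append(sum(v * z for v, z in zip(v_a, hidden)))
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(u - top) for u in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical values: decoder state of size 2, three encoder states of size 2.
s_t = [0.5, -0.2]
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_a = [[0.3, -0.1, 0.8, 0.2], [0.1, 0.4, -0.5, 0.9]]
v_a = [1.0, -1.0]

probs = pointer_distribution(s_t, H, W_a, v_a)
chosen_index = max(range(len(probs)), key=probs.__getitem__)
```

The index with the highest probability is the input element the network "points" to at this decoding step, for example the next vertex of the perimeter.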


Asynchronous Advantage Actor-Critic (A3C): as understood in the art, the following symbols are commonly used in the above-noted A3C algorithm:













SYMBOL      | MEANING
s ∈ S       | States st = {Vt, Ct: t+T, Dt: t+T}
a ∈ A       | Actions (e.g., serving EV1 and EV3 using CS-2; EV2 and EV4 using CS-1)
r ∈ R       | Rewards
st, at, rt  | State, action, and reward at time step (t) of one iteration
γ           | Discount factor; penalty to uncertainty of future rewards; 0 < γ < 1
πθ(a|s)     | Stochastic policy (dynamic assignment strategy); probability of action (a) in state (s) given parameters (θ)
θ           | Policy parameter vector
ω           | Function approximation parameter
Qθ(a|s)     | Value of (state, action) pair when following policy π parameterized by θ
Vω(s)       | Value of state s when following policy π parameterized by ω
Aθ(a|s)     | Advantage function Aθ(a|s) = Qθ(a|s) − Vω(s)









Exemplary A3C pseudocode appears as follows:

    • Randomly initialize the global parameter vectors θ and ω;
    • Initialize the counter T with value 1;
    • While T < Tmax do
      • Reset the gradients: dθ ← 0, dω ← 0;
      • Synchronize the local parameters θ′ and ω′ with the global parameters θ and ω: θ′ ← θ and ω′ ← ω;
      • While t < tmax and s′ is not terminal do
        • Pick the action at ~ πθ′(a|s) and find the corresponding new state s′ = st+1 and reward rt;
        • If s′ is terminal then
          • The variable R is equal to rt;
        • Else
          • The variable R is updated according to R ← rt + γR;
        • End
        • Accumulate the policy gradient with respect to θ′:

          $$d\theta \leftarrow d\theta + \nabla_{\theta'} \log \pi_{\theta'}(a \mid s)\,\bigl(R - V_{\omega'}(s)\bigr)$$

        • Accumulate the value-function gradient with respect to ω′:

          $$d\omega \leftarrow d\omega + 2\,\bigl(R - V_{\omega'}(s)\bigr)\,\nabla_{\omega'}\bigl(R - V_{\omega'}(s)\bigr)$$

        • Make the state s equal to the new state st+1 and increment t by 1;
      • End
      • Asynchronously update the global θ and ω using dθ and dω;
      • Increment the variable T by 1;
    • End
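The update steps in the pseudocode can be illustrated with a deliberately simplified Python sketch. It is not the disclosed implementation: it assumes a single worker (so there is no real asynchrony), a single state, one-step episodes that always end in a terminal state (so R = rt), a softmax policy with one logit per charging station, and a scalar value estimate. The function name train, the learning rates, and the per-station rewards are hypothetical.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def train(station_rewards, iterations=3000, lr_policy=0.1, lr_value=0.05, seed=0):
    rng = random.Random(seed)
    n = len(station_rewards)
    theta = [0.0] * n  # global policy parameters, one logit per station
    omega = 0.0        # global value parameter: V(s) for the single state
    for _ in range(iterations):
        d_theta = [0.0] * n  # reset gradients
        d_omega = 0.0
        theta_l, omega_l = list(theta), omega  # synchronize local copies
        # One-step episode: sample an action and observe its reward.
        probs = softmax(theta_l)
        a = rng.choices(range(n), weights=probs)[0]
        R = station_rewards[a]  # next state is terminal, so R = r_t
        adv = R - omega_l       # R - V(s)
        # Gradient of log softmax w.r.t. the logits is onehot(a) - probs.
        for k in range(n):
            d_theta[k] += ((1.0 if k == a else 0.0) - probs[k]) * adv
        # Gradient of (R - V)^2 w.r.t. omega is 2 (R - V) * (-1).
        d_omega += 2.0 * adv * (-1.0)
        # Apply to the global parameters: ascent for the policy,
        # descent on the squared value error.
        for k in range(n):
            theta[k] += lr_policy * d_theta[k]
        omega -= lr_value * d_omega
    return softmax(theta), omega

# Hypothetical combined rewards for three charging stations.
probs, value = train([0.2, 1.0, 0.5])
```

With these assumed rewards, the learned policy should concentrate on the second station, and the value estimate should approach that station's reward. In a full A3C setup, several workers would each run the inner loop on their own copies of the parameters and push their gradients to the shared θ and ω.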


The above-described solutions provide systems and methods for optimized, dynamic, bidirectional matching of the charging demands of multiple EVs 10 with the available power supplies of multiple charging stations 12, as shown in FIG. 1. The cloud-based solutions create SoC maps and power maps based on information collected from connected EVs 10 and connected charging stations 12, respectively, within a specific static geofence or an elastic dynamic geofence, with both approaches shown in the various Figures.


As will be appreciated by those skilled in the art, the present disclosure provides a data-driven demand prediction approach to matching supply and demand within the geofenced area. Optimal matching between EV demand and charging station supply uses a temporal difference model-free on-policy learning algorithm, e.g., A3C, to learn the optimal pairings of EVs 10 and charging stations 12 within the geofenced area. A reward is thus obtained for each time step. The cloud-based server 50 then executes an action to match the EVs 10 with charging stations 12 such that the expected discounted future reward is maximized. Taken as a whole, the proposed solutions are intended to help reduce range anxiety and enable optimal charge scheduling from the perspective of the various EV drivers as well as charging station/infrastructure operators. These and other attendant benefits will be appreciated by those skilled in the art in view of the foregoing disclosure.


The detailed description and the drawings or figures are supportive and descriptive of the present teachings, but the scope of the present teachings is defined solely by the claims. While some of the best modes and other embodiments for carrying out the present teachings have been described in detail, various alternative designs and embodiments exist for practicing the present teachings defined in the appended claims. Moreover, this disclosure expressly includes combinations and sub-combinations of the elements and features presented above and below.

Claims
  • 1. A method for dynamically matching an energy demand of a population of electric vehicles (EVs) with an energy supply of a population of charging stations within a geofenced perimeter, the method comprising:
      receiving, via a cloud-based server, a set of EV information from each respective one of the EVs;
      receiving, via the cloud-based server, a set of charging station information from each respective one of the charging stations;
      generating a state of charge (SoC) map and a charging station power map from the set of EV information and the set of charging station information, respectively, via the cloud-based server;
      predicting the energy supply and the energy demand as a predicted energy supply and a predicted energy demand, respectively, via the cloud-based server using the SoC map and the charging station power map; and
      dynamically matching the EVs to at least one of the charging stations using a reward function, the predicted energy supply, and the predicted energy demand, including generating a rank-ordered listing for each respective one of the EVs and/or each respective one of the charging stations in a manner that maximizes an expected discounted future reward.
  • 2. The method of claim 1, wherein the geofenced perimeter is a dynamically-adjustable polygon, further comprising using a pointer network to define the geofenced perimeter as a convex hull or minimum convex polygon.
  • 3. The method of claim 1, wherein dynamically matching the EVs to at least one of the charging stations includes generating a table of ranked bidirectional matches, the table having a plurality of rewards including an estimated time of arrival (ETA) and an estimated time to charge (ETC) at a respective one of the charging stations.
  • 4. The method of claim 3, wherein the plurality of rewards includes one or more of an estimated cost of charge (ECC), an estimated expected charge (ECC), or an estimated charging station profit (SCP).
  • 5. The method of claim 3, wherein dynamically matching the EVs to at least one of the charging stations includes calculating a combined reward value as a weighted function of the rewards, and maximizing the combined reward value across an area defined by the geofenced perimeter.
  • 6. The method of claim 5, wherein maximizing the combined reward value includes using Pareto optimization.
  • 7. The method of claim 6, wherein dynamically matching the EVs to at least one of the charging stations includes performing a temporal difference (TD) model-free on-policy learning algorithm.
  • 8. The method of claim 7, wherein the TD model-free on-policy learning algorithm includes Advantage Actor-Critic (A3C).
  • 9. The method of claim 1, further comprising: receiving, via the cloud-based server, contextual information characterizing a vehicle type, a battery temperature, a battery age, a charging type, and local weather conditions for each respective one of the EVs, and a charging type and local weather conditions for each respective one of the charging stations.
  • 10. A cloud-based server comprising:
      a processor; and
      a computer-readable storage medium on which is recorded instructions executable by the processor, wherein executing the instructions causes the processor to dynamically match an energy demand of a population of electric vehicles (EVs) with an energy supply of a population of charging stations within a geofenced perimeter by:
        receiving a set of EV information from each respective one of the EVs;
        receiving a set of charging station information from each respective one of the charging stations;
        generating a state of charge (SoC) map and a charging station power map from the set of EV information and the set of charging station information, respectively;
        predicting the energy supply and the energy demand as a predicted energy supply and a predicted energy demand, respectively, via the cloud-based server using the SoC map and the charging station power map; and
        dynamically matching the EVs to at least one of the charging stations using a reward function, the predicted energy supply, and the predicted energy demand, including generating a rank-ordered listing for each respective one of the EVs and/or each respective one of the charging stations in a manner that maximizes an expected discounted future reward.
  • 11. The cloud-based server of claim 10, further comprising using a pointer network to define the geofenced perimeter as a convex hull or minimum convex polygon.
  • 12. The cloud-based server of claim 10, wherein dynamically matching the EVs to at least one of the charging stations includes generating a table of ranked bidirectional matches, the table having a plurality of rewards including an estimated time of arrival (ETA) and an estimated time to charge (ETC) at a respective one of the charging stations.
  • 13. The cloud-based server of claim 12, wherein the plurality of rewards includes one or more of an estimated cost of charge (ECC), an estimated expected charge (ECC), or an estimated charging station profit (SCP).
  • 14. The cloud-based server of claim 13, wherein dynamically matching the EVs to at least one of the charging stations includes calculating a combined reward value as a weighted function of the rewards, and maximizing the combined reward value across an area defined by the geofenced perimeter.
  • 15. The cloud-based server of claim 14, wherein maximizing the combined reward value includes using Pareto optimization.
  • 16. The cloud-based server of claim 11, wherein dynamically matching the EVs to at least one of the charging stations includes performing a temporal difference (TD) model-free on-policy learning algorithm.
  • 17. The cloud-based server of claim 16, wherein the TD model-free on-policy learning algorithm includes Advantage Actor-Critic (A3C).
  • 18. A method for dynamically matching an energy demand of a population of electric vehicles (EVs) with an energy supply of a population of charging stations within a geofenced perimeter, the method comprising:
      receiving, via a cloud-based server, a set of EV information from each respective one of the EVs;
      receiving, via the cloud-based server, a set of charging station information from each respective one of the charging stations;
      using a pointer network to define the geofenced perimeter as a convex hull or minimum convex polygon;
      generating a state of charge (SoC) map and a charging station power map from the set of EV information and the set of charging station information, respectively, via the cloud-based server;
      predicting the energy supply and the energy demand as a predicted energy supply and a predicted energy demand, respectively, via the cloud-based server using the SoC map and the charging station power map; and
      dynamically matching the EVs to at least one of the charging stations using a reward function, the predicted energy supply, and the predicted energy demand, including generating a rank-ordered listing for each respective one of the EVs and/or each respective one of the charging stations as a table of ranked bidirectional matches in a manner that maximizes an expected discounted future reward, the table having a plurality of rewards including an estimated time of arrival (ETA) and an estimated time to charge (ETC) at a respective one of the charging stations and one or more of an estimated cost of charge (ECC), an estimated expected charge (ECC), or an estimated charging station profit (SCP).
  • 19. The method of claim 18, wherein dynamically matching the EVs to at least one of the charging stations includes calculating a combined reward value as a weighted function of the rewards, and maximizing the combined reward value across an area defined by the geofenced perimeter, and wherein dynamically matching the EVs to at least one of the charging stations includes performing a temporal difference (TD) model-free on-policy learning algorithm.
  • 20. The method of claim 18, further comprising: receiving, via the cloud-based server, contextual information describing a vehicle type, a battery temperature, a battery age, a charging type, and local weather conditions for each respective one of the EVs, and a charging type and local weather conditions for each respective one of the charging stations.