An aspect of this invention relates to a delivery plan creation device, a delivery plan creation method, and a program.
Delivery services provided for logistics have drawn attention in recent years. Delivery services include not only delivering packages such as parcels but also delivering supplies in preparation for disasters such as earthquakes and typhoons. Fuel is indispensable not only for generating warmth but also for securing electric power. For example, when the power supply from a power plant is interrupted due to a disaster or the like, a communication service provider operates a private generator installed in a building that provides a communication service (a communication building) so as to continue the communication service. A service provider (the communication provider, a delivery service provider, etc.) delivers fuel for operating the private generator to the communication building using a delivery vehicle or the like.
A fuel depletion period is a period during which the fuel of a private generator is depleted. During this period, private power generation cannot be performed, and therefore the communication service may be interrupted. The service provider should create a delivery plan that makes the fuel depletion period zero or as short as possible. In other words, the service provider is required not only to have fuel delivered to each communication building before its fuel is depleted but also to quickly deliver fuel to a communication building whose fuel has been depleted and restore the communication service at an early stage.
A delivery plan indicates how much fuel should be delivered to each of a plurality of destinations and in which order. A delivery plan should be determined according to various conditions such as the location, fuel situation, and traffic situation of each building. For this reason, considerable time and skill are required for a person to examine and create a delivery plan. Moreover, because disasters rarely occur, it is difficult to train personnel skilled in disaster response, yet once a disaster occurs the situation becomes urgent. A technique for automatically and efficiently creating a delivery plan in a short period of time has therefore been demanded.
PTL 1 discloses a system for creating a delivery plan for consumer goods such as LP gas cylinders. This document proposes a technique for automatically creating an efficient delivery plan taking the amount of remaining consumer goods at a destination into consideration.
[PTL 1] Japanese Patent Application Laid-open No. 2019-219783
In addition, the following methods are available.
For example, there is a method of performing delivery in an order that shortens the total travel distance of a delivery vehicle. In this method, however, destinations located closer to the delivery vehicle are given priority. Delivery to a distant destination with a small amount of remaining fuel may therefore be delayed, and its fuel may be depleted.
Alternatively, there is a method of performing delivery in ascending order of the amount of remaining fuel at the destinations. In this method, however, the locations of the destinations and the times required for delivery are not considered. Therefore, when destinations with small amounts of remaining fuel are scattered, an inefficient delivery plan is likely to be created, and fuel may be depleted at many destinations.
Alternatively, there is a method of enumerating all possible delivery plans and extracting the best one from among them. In this method, however, when there are many destinations and delivery vehicles, an enormous number of delivery plans must be generated, and the calculation may require a long time.
It is hard to say that an effective delivery plan can be efficiently created by any of these methods.
The present invention has been made in view of the above circumstances, and an object thereof is to provide a technique for efficiently creating a delivery plan that can shorten a fuel depletion period.
A delivery plan creation device according to an aspect of this invention creates a delivery plan including an order of delivery of fuel to each of destinations using a delivery vehicle and an amount of fuel to be supplied. The delivery plan creation device includes a database, a storage unit, and a processor. The database holds environment information including destination information related to the destination and delivery vehicle information related to the delivery vehicle. The storage unit stores a trained model created by training a neural network having at least an input layer and an output layer in advance based on different environment information. The processor includes an acquisition unit and a creation unit. The acquisition unit accesses the database to acquire the environment information and create an input condition that is a premise of the delivery plan from the environment information. The creation unit inputs the input condition to the neural network in which the trained model has been reflected to create the delivery plan.
According to one aspect of the present invention, it is possible to provide a technique for efficiently creating a delivery plan that can shorten a fuel depletion period.
Hereinafter, embodiments according to the present invention will be described with reference to the drawings.
The interface unit 13 is connected to a network 100, and can access, for example, a traffic situation providing system 2 to acquire information such as a current traffic situation. In addition, the interface unit 13 outputs a delivery plan 3 created by the delivery plan creation device 10, for example, in response to a request from an operator of a vehicle dispatch center.
The storage 12 is a non-volatile storage device (block device), for example, a hard disk drive (HDD) or a solid state drive (SSD). The storage 12 stores an environment information database 12a in addition to basic programs such as an operating system (OS) and a device driver and programs for realizing functions of the delivery plan creation device 10.
The memory 14 of
The delivery plan 14c is information including an order of delivery of fuel to each of the destinations (the building A, building B, and building C) by the delivery vehicle 1 and the amount of fuel to be supplied to each destination (i.e., an unloading amount). The delivery plan 14c is created by inputting a specific condition to the trained model 14b. Learning of a neural network and creation of a delivery plan will be described in detail below.
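The structure of the delivery plan 14c (an ordered sequence of destinations, each with an unloading amount) might be represented as in the minimal sketch below. The class and field names, as well as the concrete amounts, are illustrative assumptions and do not appear in the original description.

```python
from dataclasses import dataclass

@dataclass
class DeliveryStop:
    """One entry in the delivery plan: where to go and how much to unload."""
    destination: str      # e.g. "building A"
    supply_amount: float  # unloading amount, e.g. in liters (assumed unit)

# A delivery plan is an ordered sequence of stops for one delivery vehicle.
delivery_plan = [
    DeliveryStop("building A", 400.0),
    DeliveryStop("building C", 250.0),
    DeliveryStop("building B", 150.0),
]

# The total unloaded amount must not exceed what the vehicle carries.
total_supplied = sum(stop.supply_amount for stop in delivery_plan)
```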
Furthermore, the processor 11 shown in
The processor 11 includes an acquisition unit 111, an updating unit 112, a reward calculation unit 113, a learning unit 114, and a creation unit 115 as functional blocks (program modules) according to an embodiment. These functional blocks are processing functions realized by the processor 11 executing instructions included in the program 14a. In other words, the delivery plan creation device 10 according to the present invention can be realized by a computer and a program. The program can be recorded on a recording medium such as an optical medium and distributed, or can be provided via a network.
The acquisition unit 111 accesses the environment information database 12a to acquire environment information, and creates an input condition as a premise of a delivery plan from the acquired environment information.
The creation unit 115 inputs the created input condition to a neural network reflecting the trained model 14b to create a delivery plan.
The reward calculation unit 113 calculates a reward value such that a delivery action output by the neural network has a higher value as the fuel depletion period at the destination becomes shorter. That is, an action that shortens the period during which the fuel of the private generator installed at the destination is depleted has a higher value.
The learning unit 114 repeatedly executes simulations using sets of different environment information and reward values. The learning unit 114 then creates a trained model by updating the weighting parameters of the neural network based on the results of the executed simulations. The created trained model is stored in the memory 14 (trained model 14b).
The updating unit 112 updates the environment information of the environment information database 12a based on the results of the respective executed simulations.
In an embodiment, a simulation using a set of different input conditions created based on the environment information database 12a and reward values for those input conditions is repeated. The trained model 14b is then created by updating the weighting parameters of the neural network based on the results of the simulations.
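The text does not specify which learning algorithm updates the weighting parameters; value-based reinforcement learning is one common realization of the repeated simulate-and-update loop described above. The sketch below uses a tabular value function as a simple stand-in for the neural network, with hypothetical states, destinations, and reward values.

```python
import random

# Hypothetical tabular stand-in for the neural network's value function:
# Q[state][action] estimates the value of delivering to `action` next.
states = ["low_fuel_far", "low_fuel_near"]
actions = ["building A", "building B"]
Q = {s: {a: 0.0 for a in actions} for s in states}

def reward(state, action):
    # Assumed reward: higher when supplying the destination closer to depletion.
    return 1.0 if (state == "low_fuel_near" and action == "building B") else 0.1

alpha = 0.5  # learning rate
random.seed(0)
for episode in range(200):        # repeated simulations
    s = random.choice(states)     # a different environment condition each run
    a = random.choice(actions)    # try a delivery action
    r = reward(s, a)
    # Update the "weighting parameters" toward the observed reward value.
    Q[s][a] += alpha * (r - Q[s][a])

# After training, the highest-valued action for a state is the chosen delivery.
best = max(Q["low_fuel_near"], key=Q["low_fuel_near"].get)
```

In the actual embodiment, the table would be replaced by the neural network of the creation unit 115, and the update by backpropagation of the reward signal.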
In
In
Next, the processor 11 acquires the environment information from the environment information database 12a, and creates an input condition (a state of the environment) for computing a delivery plan (step S3). The obtained input condition is input to the neural network of the creation unit 115. Here, the creation of the input condition will be described.
The travel time for each destination can be acquired by inputting location information of each destination to the traffic situation providing system 2, for example. That is, when a request including the location information of a destination is sent to the traffic situation providing system 2, a reply including the travel time is returned.
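The request/reply exchange with the traffic situation providing system 2 might look like the following sketch. The function name, the table-based lookup, and the travel-time values are all assumptions; the real system's endpoint and message format are not specified in the text.

```python
# Hypothetical interface to the traffic situation providing system.
def request_travel_time(origin, destination, traffic_table):
    """Send a request with location information; the reply is the travel time."""
    return traffic_table[(origin, destination)]

# Assumed travel times (minutes) between the vehicle's location and destinations.
traffic_table = {
    ("depot", "building A"): 25,
    ("depot", "building B"): 40,
    ("building A", "building B"): 18,
}
t = request_travel_time("depot", "building B", traffic_table)
```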
Next, the processor 11 acquires, for all delivery vehicles, the amount of remaining fuel, the amount of fuel to be supplied when each building is selected, the travel time (the time required for travel), and the supply time (the time required to supply the fuel) (step S32).
Here, the amount of fuel to be supplied can be calculated using, for example, equation (2).
The travel time can be calculated based on the travel times between the destinations obtained in step S31, the present location of the delivery vehicle, and the traffic situation at a specific point in time acquired by accessing the traffic situation providing system 2. The supply time can be calculated using, for example, equation (3).
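Equations (2) and (3) are not reproduced in this text. The sketch below assumes one common formulation: the amount to supply fills the building's tank up to capacity, limited by what the vehicle still carries, and the supply time is that amount divided by the pump rate. All parameter names and values are illustrative.

```python
# Assumed form of equation (2): supply up to tank capacity, capped by the
# vehicle's remaining (supply-possible) amount.
def supply_amount(tank_capacity, building_remaining, vehicle_remaining):
    return min(tank_capacity - building_remaining, vehicle_remaining)

# Assumed form of equation (3): time required to pump `amount` liters
# at `pump_rate` liters per minute.
def supply_time(amount, pump_rate):
    return amount / pump_rate

amt = supply_amount(tank_capacity=1000.0, building_remaining=300.0,
                    vehicle_remaining=500.0)
tc = supply_time(amt, pump_rate=50.0)
```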
Returning to
Furthermore, the processor 11 determines whether a termination condition for the simulation is satisfied (step S8), and repeats the procedure from step S3 until the termination determination becomes Yes (step S9). In step S9, for example, the termination determination is Yes when an elapsed time t from the start of the simulation exceeds a predetermined time tend. Alternatively, the termination determination is Yes when the delivery simulation for all of the destinations is completed.
Furthermore, the processor 11 determines whether a termination condition for the learning mode is satisfied (step S10), and repeats the procedure from step S2 until the termination determination becomes Yes (step S11). In step S11, for example, the termination determination is Yes when a predetermined number of simulations have been executed.
In
Next, the processor 11 determines the next delivery destination (step S24), and then updates the environment information of the environment information database 12a (step S25). Furthermore, the processor 11 determines whether a termination condition for the simulation is satisfied (step S26), and repeats the procedure from step S23 until the termination determination becomes Yes (step S27). In step S27, for example, the termination determination is Yes when an elapsed time t from the start of the simulation exceeds a predetermined time tend. Alternatively, the termination determination is Yes when the delivery simulation for all of the destinations is completed.
In
Next, the processor 11 acquires a supply time tc and the amount of fuel to be supplied at the supply destination (step S54), and updates the amount of remaining fuel (supply possible amount) of the delivery vehicle and the amount of remaining fuel of each building (step S55). The amount of remaining fuel of the delivery vehicle can be calculated from the amount of remaining fuel at that moment, the amount of fuel to be supplied to the delivery destination, the fuel consumption rate, and tc.
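The update of the delivery vehicle's remaining fuel described above can be sketched as follows. The calculation assumes a simple balance, namely current amount minus the amount supplied minus fuel consumed during the supply time tc; the function name, the consumption model, and the values are assumptions.

```python
# Hypothetical update of the vehicle's remaining (supply-possible) fuel after
# one delivery, from: the amount at that moment, the amount supplied to the
# destination, the fuel consumption rate while operating, and the supply time tc.
def update_vehicle_fuel(remaining, supplied, consumption_rate, tc):
    return remaining - supplied - consumption_rate * tc

new_remaining = update_vehicle_fuel(remaining=2000.0, supplied=500.0,
                                    consumption_rate=2.0, tc=10.0)
```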
Further, the processor 11 acquires the state S(t + tm + tc) after the action (step S56), and then determines the mode of the simulation (step S57). In the output mode, the processor 11 stores the environment after the action in the environment information database 12a (step S58).
On the other hand, if the learning mode is set in step S57, the processor 11 inputs the state St before the action and the state S(t + tm + tc) after the action to the reward calculation unit 113 to calculate a reward value obtained from the action (step S59). Here, the calculation of the reward value will be described.
The reward calculation unit 113 calculates a reward value for updating the weighting parameters of the neural network of the creation unit 115. The reward value can be calculated as, for example, the sum of a positive reward obtained by delivering fuel and a negative reward (penalty) incurred by fuel depletion. Alternatively, only one of the reward and the penalty may be calculated.
A positive reward can be calculated, for example, by inputting, into a predetermined reward function, the time left at the moment relative to the maximum time left until the fuel is depleted. A penalty can be calculated by inputting, into a predetermined reward function, the number of destinations at which the fuel has been depleted and the time that has elapsed since the fuel was depleted.
The reward is calculated according to a policy in which, for example, the time left at the moment (the current amount of fuel / the fuel consumption rate) relative to the maximum time left until the amount of fuel becomes zero (the maximum amount of fuel / the fuel consumption rate) is calculated for each destination, and a higher reward is given when fuel has been supplied to the destination with the lower calculated value. Alternatively, a higher reward may be given when fuel has been supplied to a destination where the current amount of remaining fuel relative to the maximum amount of fuel is smaller.
That is, the reward calculation unit 113 calculates a reward value based on at least one of the time left at the moment relative to the maximum time left until the fuel is depleted, the current amount of remaining fuel relative to the maximum amount of fuel, the number of destinations where fuel has been depleted, and the time that has elapsed since fuel was depleted.
A negative reward (penalty) can be calculated according to a policy in which a heavier penalty is given when, for example, the number of destinations at which the time left until fuel depletion is zero or less is larger, or when that time has been exceeded for longer. For example, equation (4) can be applied.
Alternatively, equation (5) may be applied.
Alternatively, equation (6) may be applied.
A reward value can be obtained from, for example, equation (7) by combining a reward and a penalty.
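Equations (4) through (7) are not reproduced in this text. The sketch below follows the stated policy only: a positive reward that grows as the supplied destination was closer to depletion, a penalty that grows with the number of depleted destinations and the time elapsed since depletion, and their sum as the combined reward value. The functional forms and weights w1, w2 are assumptions.

```python
# Assumed positive reward: smaller remaining time margin at the supplied
# destination -> larger reward (1 when supplied at the moment of depletion).
def positive_reward(time_left, max_time_left):
    return 1.0 - time_left / max_time_left

# Assumed penalty: grows with the number of depleted destinations and the
# time elapsed since depletion; w1 and w2 are hypothetical weights.
def penalty(num_depleted, elapsed_since_depletion, w1=1.0, w2=0.1):
    return -(w1 * num_depleted + w2 * elapsed_since_depletion)

# Combined reward value, in the spirit of summing reward and penalty.
def reward_value(time_left, max_time_left, num_depleted, elapsed):
    return positive_reward(time_left, max_time_left) + penalty(num_depleted, elapsed)

r = reward_value(time_left=2.0, max_time_left=10.0, num_depleted=1, elapsed=5.0)
```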
Description will now return to
On the other hand, if the random number is greater than ε in step S42 (Yes), the processor 11 inputs the input condition created by the acquisition unit 111 to the neural network and selects the delivery destination having the highest value (step S).
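The selection in step S42 is an ε-greedy strategy: with probability ε a random destination is explored, and otherwise the destination with the highest value is taken. A minimal sketch, with hypothetical destination values standing in for the neural network's output:

```python
import random

# Epsilon-greedy selection of the next delivery destination: with probability
# epsilon explore a random destination; otherwise exploit the highest value.
def select_destination(values, epsilon, rng):
    if rng.random() < epsilon:
        return rng.choice(list(values))   # exploration
    return max(values, key=values.get)    # exploitation

# Hypothetical values output by the network for the current input condition.
values = {"building A": 0.2, "building B": 0.9, "building C": 0.5}
rng = random.Random(0)
choice = select_destination(values, epsilon=0.0, rng=rng)
```

During learning, ε would be set above zero so that less-valued routes are still occasionally simulated.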
As described above, in the embodiment, a highly effective delivery plan for preventing fuel depletion can be computed by utilizing the neural network. That is, a plurality of input conditions are created from environment information registered in a database in advance, and a trained model is created by repeating simulations using a neural network. Information acquired from the traffic situation providing system is then also input to the trained model to automatically search for the delivery route and create the delivery plan. Furthermore, the result of an action can be evaluated numerically. That is, by reflecting in the learning a positive evaluation of delivery to a destination at which the time left until fuel depletion is shorter, and a negative evaluation of delivery made after fuel depletion, the accuracy of the route search and of delivery plan creation can be automatically improved.
In the related art, when power supply to a communication building is interrupted due to the occurrence of a disaster, a delivery plan must be created manually in consideration of the location, fuel state, traffic state, and the like of each building, and examining such a delivery plan takes time and skill.
In view of this problem, according to an embodiment, it is possible to obtain an optimum solution (an optimum route) through an approach in which a neural network learns from various input information and cases, taking into consideration environmental conditions during a disaster such as the fuel state. That is, according to the embodiment, a highly effective delivery plan for preventing fuel depletion can be computed by utilizing the neural network.
Thus, according to the embodiment, a delivery plan capable of shortening a fuel depletion period can be created efficiently. As a result, a delivery plan that shortens the time during which the fuel at a destination is depleted can be determined automatically and in a short time, reducing both the skill and the time required to create a delivery plan.
Further, the present invention is not limited to the above-described embodiment. For example, the reward function is not limited to those described with reference to the drawings. In other words, the present invention is not limited to the above-described embodiment as is; in the implementation stage, the constituent components can be modified in various ways without departing from the spirit of the invention. Various inventions can also be formed by suitably combining the plurality of constituent components disclosed in the above-described embodiment. For example, some constituent elements may be omitted from all of the constituent elements shown in the embodiments. Furthermore, constituent elements of different embodiments may be combined as appropriate.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2020/031648 | 8/21/2020 | WO |