This application is based upon and claims priority to Chinese Patent Application No. 202310813910.0, filed on Jul. 5, 2023, the entire contents of which are incorporated herein by reference.
The present disclosure relates to the technical field of wireless communications, specifically to an air-ground joint trajectory planning and offloading scheduling method and system for distributed multiple objectives.
With the popularity of 5th Generation Mobile Communication Technology (5G) and the wide application of mobile edge computing (MEC) technology, task data generated by a large number of devices needs to be offloaded to an edge server with high computing power for processing. However, existing MEC is not suitable for scenarios with computation-intensive and latency-critical tasks. The flexibility and high maneuverability of an unmanned aerial vehicle (UAV), and especially its high-probability line-of-sight link channel, provide a more reliable connection for communication, multi-user scheduling, and resource allocation. In the context of UAV-assisted MEC, however, the most advanced research focuses on a scenario in which a terrestrial device decides whether to execute a computing task locally or offload the computing task to the UAV, and ignores the situation in which a base station (BS) cooperates with the UAV to provide a service for a terminal, so that such research is not applicable to an actual scenario. Moreover, because a real environment is highly dynamic, the channel gain and the statistical characteristics of the generated computing tasks are unknown and may vary with time, so deterministic optimization cannot achieve a good solution under unpredictable channel propagation.
There are also some task offloading scheduling methods in the prior art. For example, the invention patent with the Patent Application No. CN114599102A and entitled “Method for Offloading Linearly Dependent Task of UAV-Assisted Edge Computing Network” uses convex optimization and dynamic programming algorithms to jointly optimize an offloading decision, resource allocation, and a UAV trajectory to minimize an energy consumption. For another example, the invention patent with the Patent Application No. CN113286314A and entitled “Method for Base Station Deployment and User Association Based on Q-Learning Algorithm” utilizes a table-based Q-learning method to jointly optimize BS deployment and user association for the UAV to maximize the sum of transmission rates of users within a system. However, none of the above methods takes into account simultaneous optimization and solving for a plurality of objectives such as a demand of a user for the computing task and a system energy consumption.
In view of the deficiencies in the prior art, the present disclosure provides an air-ground joint trajectory planning and offloading scheduling method and system for distributed multiple objectives.
To achieve the above objective, the present disclosure adopts following technical solutions:
An air-ground joint trajectory planning and offloading scheduling method for distributed multiple objectives is provided, where in the method, a UAV fixed at a flight altitude and a terrestrial BS at a fixed location jointly serve a plurality of terrestrial users at fixed locations, and the method specifically includes the following steps:
step 1: obtaining, by the UAV, current state information, where the state information includes a location of the UAV in a current timeslot t and a total amount of unprocessed data of all devices in a previous timeslot t−1, and all the devices include the UAV, the BS, and the plurality of users;
step 2: selecting, by the UAV, an action, selecting a flight direction from a preset flight direction set by minimizing a total energy consumption of all the devices and a total amount of unprocessed data of all the devices, and flying a fixed distance;
step 3: before the UAV reaches a new location, obtaining, by each user, current state information;
step 4: selecting, by all the users, an action in parallel, determining a task offloading scheduling method of each user from a preset offloading strategy set by minimizing the total energy consumption of all the devices and the total amount of the unprocessed data of all the devices, and executing the task offloading scheduling method in the current timeslot t;
step 5: after task offloading scheduling of the users is completed in the current timeslot t, updating the location of the UAV as a location in a next timeslot t+1, where the UAV and each user receive a feedback for a current action; and
step 6: repeating steps 1 to 5 in each timeslot to obtain an optimal flying and offloading strategy based on the state information and the feedback for the current action, where the flying and offloading strategy includes the flight direction of the UAV and the task offloading scheduling method of each user.
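The sequencing of steps 1 to 6 can be sketched as a simple loop. This is a minimal illustration only, not the disclosed implementation: the function name `run_timeslots` and the callbacks `pick_direction`, `pick_offload`, and `env_step` are hypothetical placeholders for the decision rules and the environment, none of which are specified in this form by the disclosure.

```python
from typing import Callable, List, Tuple

Location = Tuple[float, float]
State = Tuple[Location, float]  # (UAV location in slot t, d_{t-1})


def run_timeslots(
    num_users: int,
    num_slots: int,
    pick_direction: Callable[[State], str],       # hypothetical UAV decision rule
    pick_offload: Callable[[int, State], str],    # hypothetical per-user decision rule
    env_step: Callable[[Location, str, List[str]], Tuple[Location, float, float]],
    uav_xy: Location = (0.0, 0.0),
    d_prev: float = 0.0,
):
    """Repeat steps 1 to 5 for num_slots timeslots and log the feedback."""
    log = []
    for t in range(num_slots):
        # Steps 1 and 3: the UAV and every user observe the same state.
        state = (uav_xy, d_prev)
        # Step 2: the UAV selects a flight direction.
        direction = pick_direction(state)
        # Step 4: each user decides independently from the same state
        # (evaluated one after another here; conceptually in parallel).
        choices = [pick_offload(m, state) for m in range(num_users)]
        # Step 5: offloading is executed, the UAV location is updated,
        # and the feedback r_t = [e_t, d_t] is received.
        uav_xy, e_t, d_t = env_step(uav_xy, direction, choices)
        d_prev = d_t
        log.append((t, direction, choices, (e_t, d_t)))
    return uav_xy, log
```

In a learning setting, the logged feedback would be used to update the decision rules between timeslots (step 6); that update is deliberately left out of the sketch.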
In order to optimize the technical solution, the following specific measures are also used.
Further, the flight direction set includes eight flight directions: up, down, left, right, upper right, lower right, upper left, and lower left; and the offloading strategy set includes three task offloading scheduling methods: local computing, offloading to the UAV, and offloading to the BS.
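Further, the total amount of the unprocessed data of all the devices is calculated according to the following formula:
$$D_t=\sum_{m=1}^{M}D_{m,t}+D_{\mathrm{UAV},t}+D_{\mathrm{BS},t}$$
where $D_t$ represents the total amount of unprocessed data of all the devices in the timeslot t, M represents the quantity of users, $m\in\{1,2,\ldots,M\}$, and $D_{m,t}$, $D_{\mathrm{UAV},t}$, and $D_{\mathrm{BS},t}$ respectively represent the cumulative computing tasks in the queues of the user m, the UAV, and the BS in the timeslot t; and
the total energy consumption of all the devices is calculated according to the following formula:
$$E_t=\sum_{m=1}^{M}\left[\alpha_{m,t,\mathrm{UE}}E^{\mathrm{cp}}_{\mathrm{UE},m,t}+\alpha_{m,t,\mathrm{UAV}}\left(E^{\mathrm{trans}}_{\mathrm{UAV},m,t}+E^{\mathrm{cp}}_{\mathrm{UAV},m,t}\right)+\alpha_{m,t,\mathrm{BS}}\left(E^{\mathrm{trans}}_{\mathrm{BS},m,t}+E^{\mathrm{cp}}_{\mathrm{BS},m,t}\right)\right]$$
where $E_t$ represents the total energy consumption of all the devices in the timeslot t, $E^{\mathrm{cp}}_{\mathrm{UE},m,t}$ represents the energy consumption of local computing performed by the user m in the timeslot t, $E^{\mathrm{trans}}_{\mathrm{UAV},m,t}$ and $E^{\mathrm{trans}}_{\mathrm{BS},m,t}$ respectively represent the transmission energy consumptions of offloading a task by the user m to the UAV and the BS in the timeslot t, $E^{\mathrm{cp}}_{\mathrm{UAV},m,t}$ and $E^{\mathrm{cp}}_{\mathrm{BS},m,t}$ respectively represent the energy consumptions of computing the task of the user m on the UAV and the BS in the timeslot t, and $\alpha_{m,t,\mathrm{UE}}$, $\alpha_{m,t,\mathrm{UAV}}$, and $\alpha_{m,t,\mathrm{BS}}$ respectively represent the operations of performing local computing, offloading to the UAV, and offloading to the BS by the user m in the timeslot t, where when the local computing is performed, $\alpha_{m,t,\mathrm{UE}}=1$, otherwise $\alpha_{m,t,\mathrm{UE}}=0$; when the offloading to the UAV is performed, $\alpha_{m,t,\mathrm{UAV}}=1$, otherwise $\alpha_{m,t,\mathrm{UAV}}=0$; and when the offloading to the BS is performed, $\alpha_{m,t,\mathrm{BS}}=1$, otherwise $\alpha_{m,t,\mathrm{BS}}=0$.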
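As a numerical illustration of the two totals, the following sketch follows the detailed description, in which E_t denotes the total energy consumption and D_t the total amount of unprocessed data. It assumes that the energy total is the indicator-weighted sum of the per-user terms (local computing energy, or transmission plus computing energy for the selected server). The class `UserSlot` and its field names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UserSlot:
    """Per-user quantities for one timeslot (hypothetical container)."""
    choice: str                 # "UE", "UAV" or "BS": the option whose indicator is 1
    backlog: float              # D_{m,t}: unprocessed data in the user's queue
    e_cp_ue: float = 0.0        # energy of local computing
    e_trans_uav: float = 0.0    # transmission energy, user -> UAV
    e_cp_uav: float = 0.0       # computing energy on the UAV
    e_trans_bs: float = 0.0     # transmission energy, user -> BS
    e_cp_bs: float = 0.0        # computing energy on the BS


def total_energy(users: List[UserSlot]) -> float:
    """E_t: only the terms of the selected option contribute for each user."""
    total = 0.0
    for u in users:
        if u.choice == "UE":
            total += u.e_cp_ue
        elif u.choice == "UAV":
            total += u.e_trans_uav + u.e_cp_uav
        elif u.choice == "BS":
            total += u.e_trans_bs + u.e_cp_bs
        else:
            raise ValueError(f"unknown offloading choice: {u.choice!r}")
    return total


def total_backlog(users: List[UserSlot], d_uav: float, d_bs: float) -> float:
    """D_t: sum of all user queues plus the UAV and BS queues."""
    return sum(u.backlog for u in users) + d_uav + d_bs
```

The feedback components then follow as e_t = -total_energy(...) and d_t = -total_backlog(...).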
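Further, the feedback for the current action is as follows:
$$\bar{\mathbf{r}}=[\bar{e},\bar{d}]^{\mathsf{T}},\quad \bar{e}=\limsup_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathcal{E}\{e_t\},\quad \bar{d}=\limsup_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathcal{E}\{d_t\}$$
where $\limsup_{T\to\infty}$ represents an upper bound of the time average over T timeslots, $\mathcal{E}\{\cdot\}$ represents averaging, $e_t=-E_t$, and $d_t=-D_t$.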
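Further, the optimal flying and offloading strategy is determined according to the following formula:
$$\pi=\arg\max_{\pi}\left\{\mathbf{w}_r^{\mathsf{T}}\bar{\mathbf{r}}\right\}$$
$$\mathbf{w}_r=[w_e,w_d]^{\mathsf{T}}$$
where $\pi$ represents the optimal flying and offloading strategy, $\bar{\mathbf{r}}$ represents the feedback for the current action, $\mathbf{w}_r$ represents a weight vector, $\mathbf{w}_r^{\mathsf{T}}$ represents the transposition of $\mathbf{w}_r$, and $w_e$ and $w_d$ respectively represent the weight values for the energy consumption and the amount of the unprocessed data.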
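The weighted selection can be illustrated with a small sketch. It assumes that each candidate strategy has already been evaluated to an average feedback pair (average of e_t, average of d_t). The candidate names and numbers are invented for illustration, and the helper names `scalarize` and `best_strategy` are hypothetical.

```python
from typing import Dict, Sequence


def scalarize(weights: Sequence[float], feedback: Sequence[float]) -> float:
    """Inner product w_r^T r of the weight vector and a feedback vector."""
    return sum(w * r for w, r in zip(weights, feedback))


def best_strategy(candidates: Dict[str, Sequence[float]],
                  weights: Sequence[float] = (1.0, 1.0)) -> str:
    """Return the candidate whose weighted average feedback is largest.

    Feedback values are negated costs, so maximizing the weighted sum
    corresponds to minimizing weighted energy consumption and backlog.
    """
    return max(candidates, key=lambda name: scalarize(weights, candidates[name]))


# Invented average feedbacks (e_bar, d_bar) for two hypothetical strategies.
candidates = {"a": (-3.0, -1.0), "b": (-1.0, -2.0)}
print(best_strategy(candidates, (1.0, 1.0)))  # equal weights
print(best_strategy(candidates, (1.0, 5.0)))  # backlog weighted more heavily
```

Changing the weights shifts the trade-off: with equal weights the strategy with the lower combined cost wins, whereas a larger backlog weight favors the strategy that leaves less unprocessed data.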
In addition, the present disclosure also provides an air-ground joint trajectory planning and offloading scheduling system for distributed multiple objectives, where all devices of the system include a UAV fixed at a flight altitude, a terrestrial BS at a fixed location, and a plurality of terrestrial users at fixed locations, and the UAV and the BS jointly serve the plurality of terrestrial users;
the UAV is configured to perform following operations: obtaining current state information, where the state information includes a location of the UAV in a current timeslot t and a total amount of unprocessed data of all the devices in a previous timeslot t−1; selecting an action by minimizing a total energy consumption of all the devices and a total amount of unprocessed data of all the devices, selecting a flight direction from a preset flight direction set, and flying a fixed distance; after task offloading scheduling of the users is completed in the current timeslot t, updating the location as a location in a next timeslot t+1; and receiving a feedback for a current action, and obtaining an optimal flying strategy based on the state information and the feedback for the current action in the next timeslot; and
the user is configured to perform following operations: before the UAV reaches a new location, obtaining, by each user, current state information; selecting, by all the users, an action in parallel, determining a task offloading scheduling method of each user from a preset offloading strategy set by minimizing the total energy consumption of all the devices and the total amount of the unprocessed data of all the devices, and executing the task offloading scheduling method in the current timeslot t; and receiving the feedback for the current action, and obtaining an optimal offloading strategy based on the state information and the feedback for the current action in the next timeslot.
Further, the flight direction set includes eight flight directions: up, down, left, right, upper right, lower right, upper left, and lower left; and the offloading strategy set includes three task offloading scheduling methods: local computing, offloading to the UAV, and offloading to the BS.
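Further, the total amount of the unprocessed data of all the devices is calculated according to the following formula:
$$D_t=\sum_{m=1}^{M}D_{m,t}+D_{\mathrm{UAV},t}+D_{\mathrm{BS},t}$$
where $D_t$ represents the total amount of unprocessed data of all the devices in the timeslot t, M represents the quantity of users, $m\in\{1,2,\ldots,M\}$, and $D_{m,t}$, $D_{\mathrm{UAV},t}$, and $D_{\mathrm{BS},t}$ respectively represent the cumulative computing tasks in the queues of the user m, the UAV, and the BS in the timeslot t; and
the total energy consumption of all the devices is calculated according to the following formula:
$$E_t=\sum_{m=1}^{M}\left[\alpha_{m,t,\mathrm{UE}}E^{\mathrm{cp}}_{\mathrm{UE},m,t}+\alpha_{m,t,\mathrm{UAV}}\left(E^{\mathrm{trans}}_{\mathrm{UAV},m,t}+E^{\mathrm{cp}}_{\mathrm{UAV},m,t}\right)+\alpha_{m,t,\mathrm{BS}}\left(E^{\mathrm{trans}}_{\mathrm{BS},m,t}+E^{\mathrm{cp}}_{\mathrm{BS},m,t}\right)\right]$$
where $E_t$ represents the total energy consumption of all the devices in the timeslot t, $E^{\mathrm{cp}}_{\mathrm{UE},m,t}$ represents the energy consumption of local computing performed by the user m in the timeslot t, $E^{\mathrm{trans}}_{\mathrm{UAV},m,t}$ and $E^{\mathrm{trans}}_{\mathrm{BS},m,t}$ respectively represent the transmission energy consumptions of offloading a task by the user m to the UAV and the BS in the timeslot t, $E^{\mathrm{cp}}_{\mathrm{UAV},m,t}$ and $E^{\mathrm{cp}}_{\mathrm{BS},m,t}$ respectively represent the energy consumptions of computing the task of the user m on the UAV and the BS in the timeslot t, and $\alpha_{m,t,\mathrm{UE}}$, $\alpha_{m,t,\mathrm{UAV}}$, and $\alpha_{m,t,\mathrm{BS}}$ respectively represent the operations of performing local computing, offloading to the UAV, and offloading to the BS by the user m in the timeslot t, where when the local computing is performed, $\alpha_{m,t,\mathrm{UE}}=1$, otherwise $\alpha_{m,t,\mathrm{UE}}=0$; when the offloading to the UAV is performed, $\alpha_{m,t,\mathrm{UAV}}=1$, otherwise $\alpha_{m,t,\mathrm{UAV}}=0$; and when the offloading to the BS is performed, $\alpha_{m,t,\mathrm{BS}}=1$, otherwise $\alpha_{m,t,\mathrm{BS}}=0$.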
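Further, the feedback for the current action is as follows:
$$\bar{\mathbf{r}}=[\bar{e},\bar{d}]^{\mathsf{T}},\quad \bar{e}=\limsup_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathcal{E}\{e_t\},\quad \bar{d}=\limsup_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathcal{E}\{d_t\}$$
where $\limsup_{T\to\infty}$ represents an upper bound of the time average over T timeslots, $\mathcal{E}\{\cdot\}$ represents averaging, $e_t=-E_t$, and $d_t=-D_t$.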
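Further, the optimal flying strategy and the optimal offloading strategy are determined according to the following formula:
$$\pi=\arg\max_{\pi}\left\{\mathbf{w}_r^{\mathsf{T}}\bar{\mathbf{r}}\right\}$$
$$\mathbf{w}_r=[w_e,w_d]^{\mathsf{T}}$$
where $\pi$ represents the optimal flying and offloading strategy, $\bar{\mathbf{r}}$ represents the feedback for the current action, $\mathbf{w}_r$ represents a weight vector, $\mathbf{w}_r^{\mathsf{T}}$ represents the transposition of $\mathbf{w}_r$, and $w_e$ and $w_d$ respectively represent the weight values for the energy consumption and the amount of the unprocessed data.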
The present disclosure has the following beneficial effects. The multi-objective method and system based on a distributed framework in the present disclosure can express the decision-making process of trajectory planning and offloading scheduling as independent Markov decision processes, enabling the UAV and the terrestrial users to optimize the trajectory planning and the offloading scheduling by minimizing the energy consumption and the task bit backlog. The method and system also consider the matching of time and space resources in a highly dynamic network environment, in other words, in a situation where all channel information is unknown. The present disclosure effectively avoids the curse of dimensionality caused by the exponential growth of the state/action space as the quantity of users increases, resolves the poor timeliness and the inapplicability to large-scale user scenarios that are inherent in a centralized method, and also ensures an overall low energy consumption and task bit backlog of the system.
The present disclosure will be described in further detail below in combination with accompanying drawings.
As shown in
In each timeslot t, for each user, the offloading scheduling options include executing a computing task locally, offloading the computing task to the UAV, and offloading the computing task to the BS, which are mutually exclusive. Assuming there are sufficient frequency domain channels, the offloading transmissions of different terminals will not interfere with each other, and computing results can be returned to the terminal through a dedicated frequency domain channel. The processing devices of each user, the BS, and the UAV are each equipped with a local task queue to buffer unprocessed data. Variable $\alpha_{m,t,P}\in\{0,1\}$, $P\in\{\mathrm{UE},\mathrm{UAV},\mathrm{BS}\}$, represents the operations of performing local computing, offloading to the UAV, and offloading to the BS by the mth user in the timeslot t. When the mth user performs local processing in the timeslot t, $\alpha_{m,t,\mathrm{UE}}=1$; otherwise, $\alpha_{m,t,\mathrm{UE}}=0$. The same applies to the offloading to the UAV and the BS by the mth user in the timeslot t. At the end of the timeslot t−1, the amount of unprocessed data in the task queue of the mth user is $D_{m,t-1}$. Similarly, at the end of the timeslot t−1, the quantities of unprocessed task bits in the task queues of the UAV and the terrestrial BS can be represented as $D_{\mathrm{UAV},t-1}$ and $D_{\mathrm{BS},t-1}$ respectively.
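Because the three options are mutually exclusive, the indicator variables of one user in one timeslot form a one-hot triple. A minimal sketch of that encoding follows; the helper name `indicators` and the string labels are assumptions made for illustration.

```python
from typing import Dict

OPTIONS = ("UE", "UAV", "BS")  # local computing, offload to UAV, offload to BS


def indicators(choice: str) -> Dict[str, int]:
    """Return the 0/1 indicator for each option given the selected one."""
    if choice not in OPTIONS:
        raise ValueError(f"unknown offloading choice: {choice!r}")
    return {option: int(option == choice) for option in OPTIONS}


alpha = indicators("UAV")
# Exactly one indicator is 1 because the options are mutually exclusive.
assert sum(alpha.values()) == 1
```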
The following analyzes changes in a quantity of processed task bits and an energy consumption in the queue in the timeslot t in both local computing and offloading scenarios.
When the mth user performs the local computing in the timeslot t, in other words, $\alpha_{m,t,\mathrm{UE}}=1$, the task is not offloaded to the UAV or the terrestrial BS, and only an energy consumption of the local computing is generated. Therefore, the energy consumption of the local computing performed by the mth user in the timeslot t is $E^{\mathrm{cp}}_{\mathrm{UE},m,t}$. At the end of the timeslot t, the amount of unprocessed data in the task queue of the mth user is represented by $D_{m,t}$.
When the mth user performs the offloading operation in the timeslot t, the mth terrestrial user offloads the task to an edge computing server located on the UAV or the terrestrial BS in one timeslot. During transmission, a transmission energy consumption is generated. Therefore, the transmission energy consumption of offloading the task by the mth user to the UAV in the timeslot t can be expressed as $E^{\mathrm{trans}}_{\mathrm{UAV},m,t}$, and the transmission energy consumption of offloading the task by the mth user to the terrestrial BS in the timeslot t can be expressed as $E^{\mathrm{trans}}_{\mathrm{BS},m,t}$. When the task is computed on the UAV or the terrestrial BS, an energy consumption of computing the task is generated. Therefore, the energy consumption of computing the task of the mth user on the UAV in the timeslot t can be expressed as $E^{\mathrm{cp}}_{\mathrm{UAV},m,t}$, and the energy consumption of computing the task of the mth user on the terrestrial BS in the timeslot t can be expressed as $E^{\mathrm{cp}}_{\mathrm{BS},m,t}$. Similarly, at the end of the timeslot t, the quantities of unprocessed task bits in the task queues of the UAV and the terrestrial BS can be represented as $D_{\mathrm{UAV},t}$ and $D_{\mathrm{BS},t}$ respectively.
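Therefore, the total transmission and computing energy consumption of all devices (the terrestrial users, the UAV, and the terrestrial BS) in the system in the timeslot t is as follows:
$$E_t=\sum_{m=1}^{M}\left[\alpha_{m,t,\mathrm{UE}}E^{\mathrm{cp}}_{\mathrm{UE},m,t}+\alpha_{m,t,\mathrm{UAV}}\left(E^{\mathrm{trans}}_{\mathrm{UAV},m,t}+E^{\mathrm{cp}}_{\mathrm{UAV},m,t}\right)+\alpha_{m,t,\mathrm{BS}}\left(E^{\mathrm{trans}}_{\mathrm{BS},m,t}+E^{\mathrm{cp}}_{\mathrm{BS},m,t}\right)\right]$$
The total amount of unprocessed data in the queues of all the devices (the terrestrial users, the UAV, and the terrestrial BS) in the system in the timeslot t is as follows:
$$D_t=\sum_{m=1}^{M}D_{m,t}+D_{\mathrm{UAV},t}+D_{\mathrm{BS},t}$$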
In this embodiment, the negative values of the total energy consumption and the total amount of the unprocessed data of all the devices in the current system are respectively represented as $e_t=-E_t$ and $d_t=-D_t$.
This embodiment achieves dynamic task offloading scheduling of the terrestrial users and trajectory planning of the UAV by minimizing the total energy consumption and the total amount of the unprocessed data of all the devices in the current system. Intuitively, in order to determine a flight direction of the UAV or an offloading decision of the mth user in timeslot t+1, the UAV or the mth user must rely on an observed state, in other words, the location of the UAV in the timeslot t and the total amount of unprocessed data of all the devices in the previous timeslot t−1. Therefore, the dynamic offloading scheduling and trajectory planning problems can be formulated as independent Markov decision process problems. In order to avoid the curse of dimensionality, a distributed multi-agent model can be used. In this embodiment, all distributed (terrestrial) users select an action in a distributed and parallel manner, that is, each terrestrial user makes a decision only to determine its own task offloading scheduling strategy, and the task offloading scheduling strategies of all the terrestrial users do not interfere with each other.
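The benefit of the distributed model can be made concrete by counting actions. With eight flight directions and three offloading options per user, a single centralized decision-maker would have to choose among 8 × 3^M joint actions, whereas each distributed agent only chooses among its own 8 or 3 actions. The function names in the following sketch are hypothetical.

```python
from typing import List


def joint_action_count(num_users: int, n_directions: int = 8, n_options: int = 3) -> int:
    """Size of the joint action space seen by one centralized decision-maker."""
    return n_directions * n_options ** num_users


def per_agent_action_counts(num_users: int, n_directions: int = 8,
                            n_options: int = 3) -> List[int]:
    """Action-space size of each agent in the distributed model (UAV first)."""
    return [n_directions] + [n_options] * num_users


for m in (5, 10, 20):
    print(m, joint_action_count(m), max(per_agent_action_counts(m)))
```

For M = 5 the joint space already contains 1944 actions and grows exponentially with M, while the largest per-agent action set stays at 8 regardless of the number of users.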
In the timeslot t, the UAV and each user have the same state information, and the UAV and the terrestrial users each make a decision based on the state information. Specifically, the UAV selects the flight direction, and each user selects a task offloading strategy. The state $s_t$ observed by the UAV in the timeslot t can be defined as the location of the UAV in the timeslot t and the total amount of the unprocessed data of all the devices in the previous timeslot t−1, in other words, $s_t=[q_{\mathrm{UAV},t},d_{t-1}]$. For the UAV, in the given state $s_t$, the flight direction determined in the timeslot t can be defined as $a_{t,\mathrm{UAV}}\in A_{\mathrm{UAV}}$. That is, the UAV selects a flight direction from the preset direction set $A_{\mathrm{UAV}}$ (such as {up, down, left, right, upper right, lower right, upper left, and lower left}). The location of the UAV remains unchanged until $a_{t,\mathrm{UAV}}$ is executed at the end of the timeslot t. For the mth terrestrial user, in the given state $s_t$, the offloading scheduling action determined by the mth terrestrial user in the timeslot t can be defined as $a_{t,m}\in A_m$. That is, the mth terrestrial user determines its task offloading scheduling method $a_{t,m}$ in the timeslot t from the preset task offloading strategy set $A_m$={local computing, offloading to the UAV, and offloading to the BS}. Each user performs task offloading based on its offloading scheduling decision, and then obtains the negative value $e_t$ of the total energy consumption of all the devices in the system in the timeslot t, as well as the negative value $d_t$ of the total amount of the unprocessed data of all the devices in the system in the timeslot t.
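The state and the UAV action can be sketched as follows. Since the UAV flies at a fixed altitude, this example assumes that "up" and "down" denote the positive and negative y direction in the horizontal plane, and that diagonal moves are normalized so that every action covers the same fixed distance. Both are assumptions for illustration, and the names `DIRECTIONS`, `move`, and `make_state` are hypothetical.

```python
import math
from typing import Dict, Tuple

Location = Tuple[float, float]

# Assumed mapping of the eight preset directions to horizontal-plane vectors.
DIRECTIONS: Dict[str, Tuple[int, int]] = {
    "up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0),
    "upper right": (1, 1), "lower right": (1, -1),
    "upper left": (-1, 1), "lower left": (-1, -1),
}


def move(q_uav: Location, direction: str, distance: float) -> Location:
    """Return the UAV location after flying `distance` along `direction`."""
    dx, dy = DIRECTIONS[direction]
    norm = math.hypot(dx, dy)  # 1 for axis moves, sqrt(2) for diagonal moves
    return (q_uav[0] + distance * dx / norm, q_uav[1] + distance * dy / norm)


def make_state(q_uav: Location, d_prev: float) -> Tuple[Location, float]:
    """State s_t = [q_UAV,t, d_{t-1}] shared by the UAV and all users."""
    return (q_uav, d_prev)


state = make_state((0.0, 0.0), -12.5)       # invented backlog feedback value
next_location = move(state[0], "upper right", 2.0)
```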
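Therefore, this embodiment also defines the feedback of the environment for an action in the timeslot t as the vector $\mathbf{r}_t=[e_t,d_t]$ to evaluate the overall energy consumption and the overall task bit backlog. In order to improve the expected long-term average energy efficiency and data processing capability, this embodiment introduces the concept of an average feedback. For the average feedback $\bar{e}$ of the energy consumption and the average feedback $\bar{d}$ of the task bit backlog, the feedbacks in each timeslot are first averaged (namely $\mathcal{E}\{e_t\}$ and $\mathcal{E}\{d_t\}$). After that, the obtained average values are summed over T timeslots and multiplied by 1/T, and an upper bound (namely $\limsup_{T\to\infty}$) is taken to obtain limit values, which can be obtained from
$$\bar{e}=\limsup_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathcal{E}\{e_t\}$$
and
$$\bar{d}=\limsup_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathcal{E}\{d_t\}$$
respectively. Further, the limit values are collected in a column vector, which is represented as $\bar{\mathbf{r}}=[\bar{e},\bar{d}]^{\mathsf{T}}$.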
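In practice the limit can only be approximated over a finite number of timeslots. The sketch below computes the empirical time average of observed feedback vectors as such a finite-horizon estimate. The function name `average_feedback` is hypothetical and the sample values are invented.

```python
from typing import Sequence, Tuple


def average_feedback(feedbacks: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Empirical time average of r_t = (e_t, d_t) over the observed timeslots."""
    horizon = len(feedbacks)
    if horizon == 0:
        raise ValueError("at least one timeslot of feedback is required")
    e_bar = sum(e for e, _ in feedbacks) / horizon
    d_bar = sum(d for _, d in feedbacks) / horizon
    return e_bar, d_bar


# Invented feedback samples: negated energy and negated backlog per timeslot.
samples = [(-1.0, -2.0), (-3.0, -4.0)]
e_bar, d_bar = average_feedback(samples)
```

A longer horizon gives a better approximation of the long-term average, at the cost of more observed timeslots.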
In a simulation setting, the flight direction set of the UAV includes eight basic directions, namely A0=[up; down; left; right; upper left; upper right; lower left; and lower right], and the fixed altitude is set to H=100 m. In addition, the quantity of terrestrial users is set to M=5, and the weight vector wr is set with we=1 and wd=1. The simulations are conducted in MATLAB R2020a on a single computer with an Intel Core i7 processor at 3.6 GHz, 16 GB of RAM, and the Windows 10 operating system.
In addition, it is worth noting that the UAV does not fly back and forth between terrestrial users to provide a data offloading service for the user, as this will reduce a channel gain between a user with a high data volume and the UAV, resulting in a high task bit backlog and a high computation energy consumption. Therefore, as shown in
As shown in
The UAV is configured to perform following operations: obtaining current state information, where the state information includes a location of the UAV in current timeslot t and a total amount of unprocessed data of all the devices in previous timeslot t−1; selecting an action by minimizing a total energy consumption of all the devices and a total amount of unprocessed data of all the devices, selecting a flight direction from a preset flight direction set (including up, down, left, right, upper right, lower right, upper left, and lower left), and flying a fixed distance; after task offloading scheduling of the users is completed in the current timeslot t, updating the location as a location in next timeslot t+1; and receiving a feedback for a current action, and obtaining an optimal flying strategy based on the state information and the feedback for the current action in the next timeslot.
The user is configured to perform following operations: before the UAV reaches a new location, obtaining, by each user, current state information; selecting, by all the users, an action in parallel, determining a task offloading scheduling method (including local computing, offloading to the UAV, and offloading to the BS) of each user from a preset offloading strategy set by minimizing the total energy consumption of all the devices and the total amount of the unprocessed data of all the devices, and executing the task offloading scheduling method in the current timeslot t; and receiving the feedback for the current action, and obtaining an optimal offloading strategy based on the state information and the feedback for the current action in the next timeslot.
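The total amount of the unprocessed data of all the devices is calculated according to the following formula:
$$D_t=\sum_{m=1}^{M}D_{m,t}+D_{\mathrm{UAV},t}+D_{\mathrm{BS},t}$$
where $D_t$ represents the total amount of unprocessed data of all the devices in the timeslot t, M represents the quantity of users, $m\in\{1,2,\ldots,M\}$, and $D_{m,t}$, $D_{\mathrm{UAV},t}$, and $D_{\mathrm{BS},t}$ respectively represent the cumulative computing tasks in the queues of the user m, the UAV, and the BS in the timeslot t.
The total energy consumption of all the devices is calculated according to the following formula:
$$E_t=\sum_{m=1}^{M}\left[\alpha_{m,t,\mathrm{UE}}E^{\mathrm{cp}}_{\mathrm{UE},m,t}+\alpha_{m,t,\mathrm{UAV}}\left(E^{\mathrm{trans}}_{\mathrm{UAV},m,t}+E^{\mathrm{cp}}_{\mathrm{UAV},m,t}\right)+\alpha_{m,t,\mathrm{BS}}\left(E^{\mathrm{trans}}_{\mathrm{BS},m,t}+E^{\mathrm{cp}}_{\mathrm{BS},m,t}\right)\right]$$
where $E_t$ represents the total energy consumption of all the devices in the timeslot t, $E^{\mathrm{cp}}_{\mathrm{UE},m,t}$ represents the energy consumption of local computing performed by the user m in the timeslot t, $E^{\mathrm{trans}}_{\mathrm{UAV},m,t}$ and $E^{\mathrm{trans}}_{\mathrm{BS},m,t}$ respectively represent the transmission energy consumptions of offloading a task by the user m to the UAV and the BS in the timeslot t, $E^{\mathrm{cp}}_{\mathrm{UAV},m,t}$ and $E^{\mathrm{cp}}_{\mathrm{BS},m,t}$ respectively represent the energy consumptions of computing the task of the user m on the UAV and the BS in the timeslot t, and $\alpha_{m,t,\mathrm{UE}}$, $\alpha_{m,t,\mathrm{UAV}}$, and $\alpha_{m,t,\mathrm{BS}}$ respectively represent the operations of performing local computing, offloading to the UAV, and offloading to the BS by the user m in the timeslot t, where when the local computing is performed, $\alpha_{m,t,\mathrm{UE}}=1$, otherwise $\alpha_{m,t,\mathrm{UE}}=0$; when the offloading to the UAV is performed, $\alpha_{m,t,\mathrm{UAV}}=1$, otherwise $\alpha_{m,t,\mathrm{UAV}}=0$; and when the offloading to the BS is performed, $\alpha_{m,t,\mathrm{BS}}=1$, otherwise $\alpha_{m,t,\mathrm{BS}}=0$.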
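The feedback for the current action is as follows:
$$\bar{\mathbf{r}}=[\bar{e},\bar{d}]^{\mathsf{T}},\quad \bar{e}=\limsup_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathcal{E}\{e_t\},\quad \bar{d}=\limsup_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathcal{E}\{d_t\}$$
where $\limsup_{T\to\infty}$ represents an upper bound of the time average over T timeslots, $\mathcal{E}\{\cdot\}$ represents averaging, $e_t=-E_t$, and $d_t=-D_t$.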
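The optimal flying and offloading strategy is determined according to the following formula:
$$\pi=\arg\max_{\pi}\left\{\mathbf{w}_r^{\mathsf{T}}\bar{\mathbf{r}}\right\}$$
$$\mathbf{w}_r=[w_e,w_d]^{\mathsf{T}}$$
where $\pi$ represents the optimal flying and offloading strategy, $\bar{\mathbf{r}}$ represents the feedback for the current action, $\mathbf{w}_r$ represents a weight vector, $\mathbf{w}_r^{\mathsf{T}}$ represents the transposition of $\mathbf{w}_r$, and $w_e$ and $w_d$ respectively represent the weight values for the energy consumption and the amount of the unprocessed data.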
What is described above is merely the preferred implementations of the present disclosure, the scope of protection of the present disclosure is not limited to the above embodiments, and all technical solutions following the idea of the present disclosure fall within the scope of protection of the present disclosure. It should be noted that several modifications and adaptations made by those of ordinary skill in the art without departing from the principle of the present disclosure should fall within the scope of protection of the present disclosure.
Number | Date | Country | Kind |
---|---|---|---|
202310813910.0 | Jul 2023 | CN | national |
Number | Name | Date | Kind |
---|---|---|---|
20210266834 | Hu | Aug 2021 | A1 |
20230026782 | Sha et al. | Jan 2023 | A1 |
Number | Date | Country |
---|---|---|
110207712 | Sep 2019 | CN |
112351503 | Feb 2021 | CN |
113286314 | Aug 2021 | CN |
114599102 | Jun 2022 | CN |
115454527 | Dec 2022 | CN |
115640131 | Jan 2023 | CN |
2021003709 | Jan 2021 | WO |
Entry |
---|
S. Du, X. Chen, L. Jiao and Y. Lu, “Energy Efficient Task Offloading for UAV-assisted Mobile Edge Computing,” 2021 China Automation Congress (CAC), Beijing, China, 2021, pp. 6567-6657 (Year: 2021). |