## Abstract

In this work, we optimize the 3D trajectory of an unmanned aerial vehicle (UAV)-based portable access point (PAP) that provides wireless services to a set of ground nodes (GNs). Moreover, as per the Peukert effect, we consider a pragmatic non-linear discharge model for the UAV's battery. Thus, we formulate a novel problem that maximizes a fairness-based energy efficiency metric, named fair energy efficiency (FEE). The FEE metric characterizes a system that places importance on both per-user service fairness and the PAP's energy efficiency. The formulated problem takes the form of a non-convex problem with intractable constraints. To obtain a solution, we represent the problem as a Markov Decision Process (MDP) with continuous state and action spaces. Considering the complexity of the solution space, we use the twin delayed deep deterministic policy gradient (TD3) actor-critic deep reinforcement learning (DRL) framework to learn a policy that maximizes the FEE of the system. We perform two types of RL training to demonstrate the effectiveness of our approach: the first (offline) approach keeps the positions of the GNs fixed throughout the training phase; the second approach generalizes the learned policy to any arrangement of GNs by changing the positions of the GNs after each training episode. Numerical evaluations show that neglecting the Peukert effect overestimates the air-time of the PAP, which can be addressed by optimally selecting the PAP's flying speed. Moreover, the user fairness, energy efficiency, and hence the FEE value of the system can be improved by efficiently moving the PAP above the GNs. As such, we observe large FEE improvements over baseline scenarios of up to 88.31%, 272.34%, and 318.13% for suburban, urban, and dense urban environments, respectively.

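The abstract's finding that ignoring the Peukert effect overestimates air-time follows directly from Peukert's law, under which a battery's effective discharge time shrinks faster than linearly as the draw current rises. A minimal sketch of this comparison (the capacity, current, and Peukert exponent values below are illustrative assumptions, not parameters from the paper):

```python
def discharge_time_h(capacity_ah, current_a, k, rated_time_h=1.0):
    """Peukert's law: discharge time t = H * (C / (I * H))**k,
    where C is rated capacity (Ah), I the draw current (A),
    H the rated discharge time (h), and k the Peukert exponent.
    k = 1 reduces to the ideal linear discharge model."""
    return rated_time_h * (capacity_ah / (current_a * rated_time_h)) ** k

# Illustrative values: 22 Ah battery, 30 A hover draw, k = 1.12 (assumed).
linear_h = discharge_time_h(22.0, 30.0, 1.0)    # ideal linear model
peukert_h = discharge_time_h(22.0, 30.0, 1.12)  # non-linear Peukert model

# At currents above the rated draw (k > 1), the Peukert model predicts
# a shorter air-time than the linear model, i.e. peukert_h < linear_h.
print(f"linear: {linear_h:.3f} h, Peukert: {peukert_h:.3f} h")
```

Because the flying speed determines the propulsion power and hence the draw current, tuning the speed is one way to mitigate this capacity loss, consistent with the abstract's observation.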

| Original language | English |
|---|---|
| Journal | IEEE Open Journal of the Communications Society |
| Volume | 3 |
| Pages (from-to) | 1487-1500 |
| Number of pages | 14 |
| ISSN | 2644-125X |
| DOI | |
| Status | Published - Sep. 2022 |