Dynamic Reward in DQN for Autonomous Navigation of UAVs using Object Detection

Publication: Contribution to book/anthology/report/conference proceeding > Conference article in proceeding > Research > peer review

Abstract

This paper discusses the implementation of a Deep Reinforcement Learning policy, based on DQN, that optimizes the navigation of a UAV to the front of wind turbine blades. The UAV was trained in simulation using Unreal Engine 4.27 coupled with AirSim. The action space of the UAV was discretized into 6 actions. A YOLOv5 network trained on images of simulated wind turbines was used for detection and tracking, providing the DQN policy with the state information on which it was trained. In addition, a dynamic reward was implemented that combines both navigation and inspection objectives in the final evaluation of actions. Our tests showed that after 7500 time-steps the exploration rate reached near 0, the mean episode length increased from 10 to 30, and the mean reward increased from around -60 and stabilized at 26. These results suggest that the proposed method is a promising solution for optimizing the autonomous inspection of wind turbines with UAVs.
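
The abstract names three ingredients without giving their details: a six-action discrete action space, a reward that mixes navigation and inspection objectives, and an exploration rate that falls to near 0 by 7500 time-steps. The Python sketch below is one plausible reading of those ingredients, not the authors' implementation. Every name and constant in it is an assumption made for illustration: the action labels, the `Detection` fields, the centring and box-area terms of `dynamic_reward`, the weights, the miss penalty, and the linear decay in `epsilon`.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical six-action discretization; the abstract does not list the actions.
ACTIONS = ("forward", "backward", "left", "right", "up", "down")


@dataclass
class Detection:
    """Bounding box from an object detector, normalized to [0, 1] (assumed layout)."""
    cx: float  # box centre, horizontal
    cy: float  # box centre, vertical
    w: float   # box width
    h: float   # box height


def dynamic_reward(det: Optional[Detection],
                   target_area: float = 0.25,
                   w_nav: float = 0.5,
                   w_insp: float = 0.5,
                   miss_penalty: float = -1.0) -> float:
    """Assumed reward: weighted sum of a navigation term and an inspection term."""
    if det is None:
        # No turbine detected in the frame: fixed penalty (assumption).
        return miss_penalty
    # Navigation term: 1 when the box is centred, 0 when its centre is at a corner.
    offset = ((det.cx - 0.5) ** 2 + (det.cy - 0.5) ** 2) ** 0.5
    nav = 1.0 - offset / (0.5 * 2 ** 0.5)
    # Inspection term: 1 when the box area equals target_area, falling linearly to 0.
    area = det.w * det.h
    insp = max(0.0, 1.0 - abs(area - target_area) / target_area)
    return w_nav * nav + w_insp * insp


def epsilon(step: int, start: float = 1.0, end: float = 0.02,
            decay_steps: int = 7500) -> float:
    """Linear epsilon-greedy schedule that reaches `end` after `decay_steps` steps."""
    frac = min(step / decay_steps, 1.0)
    return start + frac * (end - start)
```

With these assumed defaults, `dynamic_reward(Detection(0.5, 0.5, 0.5, 0.5))` returns 1.0 (a centred box at the target area), and `dynamic_reward(None)` returns the miss penalty of -1.0. Only the 7500-step horizon in `epsilon` comes from the abstract; the start and end values are placeholders.
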
Original language: English
Title: 9th 2023 International Conference on Control, Decision and Information Technologies, CoDIT 2023
Number of pages: 6
Publisher: IEEE
Publication date: Oct. 2023
Pages: 2372-2377
Article number: 10284087
ISBN (Print): 979-8-3503-1141-9
ISBN (Electronic): 979-8-3503-1140-2
DOI
Status: Published - Oct. 2023
Event: 9th International Conference on Control, Decision and Information Technologies (CoDIT) - Rome, Italy
Duration: 3 Jul. 2023 to 6 Jul. 2023
https://codit2023.com/

Conference

Conference: 9th International Conference on Control, Decision and Information Technologies (CoDIT)
Country/Territory: Italy
City: Rome
Period: 03/07/2023 to 06/07/2023
Internet address
Name: International Conference on Control, Decision and Information Technologies (CoDIT)
ISSN: 2576-3555
