Abstract
A harbor traffic monitoring system is necessary for most ports, yet current systems often cannot detect
or receive information from boats without transponders. In this paper, we propose a computer-vision-based
monitoring system that exploits the multi-modal properties of a PTZ (pan, tilt, zoom) camera with both an
optical and a thermal sensor to detect boats in different lighting and weather conditions. In both domains,
boats are detected using a YOLOv3 network pretrained on the COCO dataset and retrained using transfer-learning
on images of boats in the test environment. The detected boats are then positioned on the water using
ray-casting. The system detects boats with an average precision of 95.53% and 96.82% in the optical and
thermal domains, respectively. It also detects boats in low optical lighting conditions, without having
been trained on data from such conditions, with an average precision of 15.05% and 46.05% in the optical
and thermal domains, respectively. The position estimator, based on a single camera, determines the
position of the boats with a mean error of 18.58 meters and a standard deviation of 17.97 meters.
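
To illustrate the ray-casting step mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes an ideal pinhole camera without lens distortion, a flat water surface at height zero, and a camera whose pan (clockwise from north) and tilt (positive when looking down) are known. The function name `pixel_to_water_position` and all parameter names are hypothetical.

```python
import math
from typing import Optional, Tuple


def pixel_to_water_position(
    u: float, v: float,
    focal_px: float, cx: float, cy: float,
    pan_deg: float, tilt_deg: float,
    camera_height_m: float,
) -> Optional[Tuple[float, float]]:
    """Cast a ray through pixel (u, v) and intersect it with the water plane.

    Assumptions (not taken from the paper): pinhole model, no distortion,
    water plane at z = 0, camera located at (0, 0, camera_height_m).
    Returns (east, north) in meters relative to the camera, or None if the
    ray does not point down towards the water.
    """
    # Ray direction in the camera frame (x right, y down, z forward).
    xc = (u - cx) / focal_px
    yc = (v - cy) / focal_px
    zc = 1.0

    t = math.radians(tilt_deg)
    p = math.radians(pan_deg)

    # Tilt: rotate about the camera's x-axis. With zero pan the camera
    # looks north, so y is "north" and z is "up" in this intermediate frame.
    x = xc
    y = -yc * math.sin(t) + zc * math.cos(t)
    z = -yc * math.cos(t) - zc * math.sin(t)

    # Pan: rotate about the vertical axis, clockwise from north.
    east = x * math.cos(p) + y * math.sin(p)
    north = -x * math.sin(p) + y * math.cos(p)

    # The ray must point downwards to hit the water plane.
    if z >= -1e-9:
        return None

    # Scale the ray so that it descends exactly camera_height_m.
    s = camera_height_m / -z
    return (s * east, s * north)


if __name__ == "__main__":
    # Hypothetical values: camera 10 m above the water, tilted 45 degrees down.
    print(pixel_to_water_position(960, 540, 1000, 960, 540, 0, 45, 10))
```

In this geometry the estimated range grows rapidly as the ray approaches the horizon, so small errors in tilt or pixel location translate into large position errors for distant boats.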
Original language | English |
---|---|
Title of host publication | Proceedings of the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications: Volume 5: VISAPP |
Editors | Giovanni Maria Farinella, Petia Radeva, Jose Braz |
Number of pages | 9 |
Volume | 5 |
Publisher | SCITEPRESS Digital Library |
Publication date | Feb 2020 |
Pages | 395-403 |
ISBN (Electronic) | 978-989-758-402-2 |
Publication status | Published - Feb 2020 |
Event | 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2020 - Valletta, Malta. Duration: 27 Feb 2020 → 29 Feb 2020 |
Conference
Conference | 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2020 |
---|---|
Country/Territory | Malta |
City | Valletta |
Period | 27/02/2020 → 29/02/2020 |
Sponsor | Institute for Systems and Technologies of Information, Control and Communication (INSTICC) |
Keywords
- CNN
- Object Detection
- PTZ Camera
- Ray-casting
- Single Camera Positioning
- Thermal Camera
- Transfer-learning