Description

Context

How do you design a computer vision algorithm that can detect and segment people captured by a visible-light camera, a thermal infrared camera, and a depth sensor? And how do you fuse the three inherently different data streams so that features can be reliably transferred from one modality to another? Feel free to download our dataset and try it out yourself!
Content

The dataset features a total of 5724 annotated frames divided into three indoor scenes.
Activity in scenes 1 and 3 uses the full depth range of the Kinect for Xbox 360 sensor, whereas activity in scene 2 is constrained to a depth range of plus/minus 0.250 m in order to suppress the parallax between the two physical sensors. Scenes 1 and 2 are situated in a closed meeting room with little natural light to disturb the depth sensing, whereas scene 3 is situated in an area with wide windows and a substantial amount of sunlight. In each scene, a total of three persons interact, walk, sit, read, etc.

Every person in a scene is annotated with a unique ID at pixel level in the RGB modality. For the thermal and depth modalities, the annotations are transferred from the RGB images using the registration algorithm found in registrator.cpp.

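As an illustration only (not the exact code shipped with the dataset), the following C++/OpenCV sketch shows what such a transfer can look like when the registration reduces to a single planar homography. The homography values, file names, and target image size are placeholders; the actual procedure is the one implemented in registrator.cpp.

#include <opencv2/opencv.hpp>

int main()
{
    // Person-ID mask annotated in the RGB modality (one grey value per person).
    cv::Mat rgbMask = cv::imread("rgb_mask_000123.png", cv::IMREAD_GRAYSCALE);

    // Placeholder 3x3 homography mapping RGB pixels to thermal pixels; the real
    // mapping is produced by the dataset's registration step.
    cv::Mat H = (cv::Mat_<double>(3, 3) <<
                 1.02, 0.01, -14.5,
                 0.00, 1.03,   7.2,
                 0.00, 0.00,   1.0);

    // Nearest-neighbour interpolation keeps the integer person IDs intact.
    cv::Mat thermalMask;
    cv::warpPerspective(rgbMask, thermalMask, H, cv::Size(640, 480), cv::INTER_NEAREST);

    cv::imwrite("thermal_mask_000123.png", thermalMask);
    return 0;
}
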
We used our AAU VAP Multimodal Pixel Annotator to create the ground-truth, pixel-level masks for all three modalities.
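
If you prefer to inspect the masks programmatically, the short C++/OpenCV sketch below splits one frame's ground-truth mask into a binary mask per person ID. The file name and the assumption that IDs are stored as distinct grey values (with 0 as background) are ours; check the encoding of the masks shipped with the dataset.

#include <opencv2/opencv.hpp>
#include <map>

int main()
{
    // Ground-truth mask for one frame; assumed to store one grey value per person.
    cv::Mat mask = cv::imread("scene1_rgb_mask_000123.png", cv::IMREAD_GRAYSCALE);
    if (mask.empty()) return 1;   // adjust the path to your local copy

    // Collect a binary mask (0/255) for every person ID that appears in the frame.
    std::map<int, cv::Mat> perPerson;
    for (int id = 1; id <= 255; ++id)
    {
        cv::Mat binary = (mask == id);
        if (cv::countNonZero(binary) > 0)
            perPerson[id] = binary;
    }
    return 0;
}
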
Date made available: 1 Jan 2017
Publisher: Kaggle
Date of data production: 2013
