TY - GEN
T1 - Multi-modal gesture recognition challenge 2013
T2 - 2013 15th ACM International Conference on Multimodal Interaction, ICMI 2013
AU - Escalera, Sergio
AU - Gonzàlez, Jordi
AU - Baró, Xavier
AU - Reyes, Miguel
AU - Lopes, Oscar
AU - Guyon, Isabelle
AU - Athitsos, Vassilis
AU - Escalante, Hugo
PY - 2013
Y1 - 2013
N2 - The recognition of continuous natural gestures is a complex and challenging problem due to the multi-modal nature of the visual cues involved (e.g., finger and lip movements, subtle facial expressions, body pose), as well as technical limitations such as spatial and temporal resolution and unreliable depth cues. To promote research in this field, we organized a challenge on multi-modal gesture recognition. We made available a large video database of 13,858 gestures from a lexicon of 20 Italian gesture categories, recorded with a Kinect camera and providing audio, a skeletal model, a user mask, and RGB and depth images. The focus of the challenge was on user-independent multiple-gesture learning. There are no resting positions, and the gestures are performed in continuous sequences lasting 1-2 minutes, each containing between 8 and 20 gesture instances. As a result, the dataset contains around 1,720,800 frames. In addition to the 20 main gesture categories, "distracter" gestures are included, meaning that additional audio and gestures outside the vocabulary appear in the recordings. The final evaluation of the challenge was defined in terms of the Levenshtein edit distance, where the goal was to indicate the true order of gestures within each sequence. 54 international teams participated in the challenge, and outstanding results were obtained by the top-ranked participants.
AB - The recognition of continuous natural gestures is a complex and challenging problem due to the multi-modal nature of the visual cues involved (e.g., finger and lip movements, subtle facial expressions, body pose), as well as technical limitations such as spatial and temporal resolution and unreliable depth cues. To promote research in this field, we organized a challenge on multi-modal gesture recognition. We made available a large video database of 13,858 gestures from a lexicon of 20 Italian gesture categories, recorded with a Kinect camera and providing audio, a skeletal model, a user mask, and RGB and depth images. The focus of the challenge was on user-independent multiple-gesture learning. There are no resting positions, and the gestures are performed in continuous sequences lasting 1-2 minutes, each containing between 8 and 20 gesture instances. As a result, the dataset contains around 1,720,800 frames. In addition to the 20 main gesture categories, "distracter" gestures are included, meaning that additional audio and gestures outside the vocabulary appear in the recordings. The final evaluation of the challenge was defined in terms of the Levenshtein edit distance, where the goal was to indicate the true order of gestures within each sequence. 54 international teams participated in the challenge, and outstanding results were obtained by the top-ranked participants.
KW - computer vision
KW - gesture recognition
KW - multi-modal data analysis
UR - http://www.scopus.com/inward/record.url?scp=84892583619&partnerID=8YFLogxK
U2 - 10.1145/2522848.2532595
DO - 10.1145/2522848.2532595
M3 - Article in proceedings
AN - SCOPUS:84892583619
SN - 9781450321297
T3 - ICMI 2013 - Proceedings of the 2013 ACM International Conference on Multimodal Interaction
SP - 445
EP - 452
BT - ICMI 2013 - Proceedings of the 2013 ACM International Conference on Multimodal Interaction
Y2 - 9 December 2013 through 13 December 2013
ER -