Improving Robustness against Environmental Sounds for Directing Attention of Social Robots

Nicolai Bæk Thomsen, Zheng-Hua Tan, Børge Lindberg, Søren Holdt Jensen

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

2 Citations (Scopus)

Abstract

This paper presents a multi-modal system for determining where to direct the attention of a social robot in a dialog scenario; the system is robust against environmental sounds (door slamming, phone ringing, etc.) and short speech segments. The method combines voice activity detection (VAD) and sound source localization (SSL), and furthermore applies post-processing to the SSL output to filter out short sounds. The system is tested against a baseline system in four different real-world experiments with different interfering sounds. The results are promising and show a clear improvement.
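As a rough illustration of the idea summarized above — gating SSL direction estimates with VAD and a minimum-duration filter so that short non-speech bursts do not redirect the robot — here is a minimal sketch. This is not the authors' implementation; the function name, the frame representation `(is_speech, doa_degrees)`, and the `min_speech_frames` threshold are all illustrative assumptions.

```python
# Hypothetical sketch (not the paper's implementation): only update the
# robot's attention direction when a speech run, as flagged by VAD, lasts
# long enough; short sounds never reach the duration threshold.

def attention_direction(frames, min_speech_frames=5):
    """frames: iterable of (is_speech, doa_degrees) per audio frame.

    Returns the direction (degrees) of the last speech run that met the
    minimum duration, or None if no run qualified."""
    direction = None  # current attention direction; None = nothing confirmed
    run = []          # DOA estimates accumulated over the ongoing speech run
    for is_speech, doa in frames:
        if is_speech:
            run.append(doa)
            if len(run) >= min_speech_frames:
                # average the DOA estimates once the run is long enough
                direction = sum(run) / len(run)
        else:
            run = []  # non-speech resets the run; short bursts are discarded
    return direction
```

With a threshold of 5 frames, a 3-frame burst (e.g. a door slam localized at 90°) is ignored, while a 6-frame speech segment at 10° updates the direction to 10°.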
Original language: English
Title of host publication: Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction
Number of pages: 10
Publisher: Springer Publishing Company
Publication date: 2015
Pages: 25-34
ISBN (Print): 978-3-319-15556-2
ISBN (Electronic): 978-3-319-15557-9
DOIs
Publication status: Published - 2015
Event: 2014 2nd Multimodal Analyses Enabling Artificial Agents in Human-Machine Interaction - Singapore, Singapore
Duration: 14 Sept 2014 → 14 Sept 2014

Conference

Conference: 2014 2nd Multimodal Analyses Enabling Artificial Agents in Human-Machine Interaction
Country/Territory: Singapore
City: Singapore
Period: 14/09/2014 → 14/09/2014
Series: Lecture Notes in Computer Science
ISSN: 0302-9743

Keywords

  • Multi-modal tracking
  • human-computer interaction
  • sound-source localization
