Improving Robustness against Environmental Sounds for Directing Attention of Social Robots

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

2 Citations (Scopus)

Abstract

This paper presents a multi-modal system for determining where to direct the attention of a social robot in a dialog scenario. The system is robust against environmental sounds (door slamming, phone ringing, etc.) and short speech segments. The method combines voice activity detection (VAD) with sound source localization (SSL) and applies post-processing to the SSL output to filter out short sounds. The system is tested against a baseline system in four different real-world experiments, in which different sounds are used as interference. The results are promising and show a clear improvement over the baseline.
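The abstract does not give the details of the post-processing, so the following is only a minimal sketch of the general idea, not the authors' method: an SSL direction estimate is kept only while the VAD reports speech, and only if that speech run lasts long enough. The function name `gate_ssl_with_vad`, the per-frame inputs, and the `min_frames` threshold are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def gate_ssl_with_vad(
    vad_flags: Sequence[bool],
    ssl_angles: Sequence[float],
    min_frames: int = 3,
) -> List[Optional[float]]:
    """Keep an SSL angle only inside speech runs of at least min_frames frames.

    Hypothetical helper: vad_flags[i] is the VAD decision for frame i and
    ssl_angles[i] is the SSL direction estimate (degrees) for the same frame.
    Frames outside a sufficiently long speech run are returned as None.
    """
    if len(vad_flags) != len(ssl_angles):
        raise ValueError("vad_flags and ssl_angles must have the same length")

    out: List[Optional[float]] = [None] * len(vad_flags)
    run_start: Optional[int] = None

    # A trailing False acts as a sentinel so the last run is closed in the loop.
    for i, active in enumerate(list(vad_flags) + [False]):
        if active and run_start is None:
            run_start = i  # a new speech run begins
        elif not active and run_start is not None:
            if i - run_start >= min_frames:
                # Run is long enough: accept its direction estimates.
                for j in range(run_start, i):
                    out[j] = ssl_angles[j]
            run_start = None  # short runs are discarded

    return out


if __name__ == "__main__":
    vad = [False, True, False, True, True, True, False]
    angles = [10.0, 20.0, 30.0, 40.0, 41.0, 42.0, 50.0]
    print(gate_ssl_with_vad(vad, angles, min_frames=3))
```

This sketch works on a complete recording, so it decides about a run only after the run has ended; a robot reacting in real time would need a causal variant that waits `min_frames` frames before turning toward a source.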
Original language: English
Title of host publication: Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction
Number of pages: 10
Publisher: Springer Publishing Company
Publication date: 2015
Pages: 25-34
ISBN (Print): 978-3-319-15556-2
ISBN (Electronic): 978-3-319-15557-9
DOI: https://doi.org/10.1007/978-3-319-15557-9_3
Publication status: Published - 2015
Event: 2014 2nd Multimodal Analyses Enabling Artificial Agents in Human-Machine Interaction - Singapore, Singapore
Duration: 14 Sep 2014 → 14 Sep 2014

Conference

Conference: 2014 2nd Multimodal Analyses Enabling Artificial Agents in Human-Machine Interaction
Country: Singapore
City: Singapore
Period: 14/09/2014 → 14/09/2014
Series: Lecture Notes in Computer Science
ISSN: 0302-9743

Fingerprint

Acoustic waves
Robots
Processing
Experiments

Keywords

  • Multi-modal tracking
  • human-computer interaction
  • sound-source localization

Cite this

Thomsen, N. B., Tan, Z-H., Lindberg, B., & Jensen, S. H. (2015). Improving Robustness against Environmental Sounds for Directing Attention of Social Robots. In Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction (pp. 25-34). Springer Publishing Company. Lecture Notes in Computer Science. https://doi.org/10.1007/978-3-319-15557-9_3
Thomsen, Nicolai Bæk ; Tan, Zheng-Hua ; Lindberg, Børge ; Jensen, Søren Holdt. / Improving Robustness against Environmental Sounds for Directing Attention of Social Robots. Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction. Springer Publishing Company, 2015. pp. 25-34 (Lecture Notes in Computer Science).
@inproceedings{e268c125d424423a9bda1b1c6c3ce50d,
title = "Improving Robustness against Environmental Sounds for Directing Attention of Social Robots",
abstract = "This paper presents a multi-modal system for finding out where to direct the attention of a social robot in a dialog scenario, which is robust against environmental sounds (door slamming, phone ringing etc.) and short speech segments. The method is based on combining voice activity detection (VAD) and sound source localization (SSL) and furthermore apply post-processing to SSL to filter out short sounds. The system is tested against a baseline system in four different real-world experiments, where different sounds are used as interfering sounds. The results are promising and show a clear improvement.",
keywords = "Multi-modal tracking, human-computer interaction, sound-source localization",
author = "Thomsen, {Nicolai B{\ae}k} and Zheng-Hua Tan and B{\o}rge Lindberg and Jensen, {S{\o}ren Holdt}",
year = "2015",
doi = "10.1007/978-3-319-15557-9_3",
language = "English",
isbn = "978-3-319-15556-2",
series = "Lecture Notes in Computer Science",
publisher = "Springer Publishing Company",
pages = "25--34",
booktitle = "Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction",
address = "United States",
}

Thomsen, NB, Tan, Z-H, Lindberg, B & Jensen, SH 2015, Improving Robustness against Environmental Sounds for Directing Attention of Social Robots. in Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction. Springer Publishing Company, Lecture Notes in Computer Science, pp. 25-34, 2014 2nd Multimodal Analyses Enabling Artificial Agents in Human-Machine Interaction, Singapore, Singapore, 14/09/2014. https://doi.org/10.1007/978-3-319-15557-9_3

Improving Robustness against Environmental Sounds for Directing Attention of Social Robots. / Thomsen, Nicolai Bæk; Tan, Zheng-Hua; Lindberg, Børge; Jensen, Søren Holdt.

Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction. Springer Publishing Company, 2015. p. 25-34 (Lecture Notes in Computer Science).

TY - GEN
T1 - Improving Robustness against Environmental Sounds for Directing Attention of Social Robots
AU - Thomsen, Nicolai Bæk
AU - Tan, Zheng-Hua
AU - Lindberg, Børge
AU - Jensen, Søren Holdt
PY - 2015
Y1 - 2015
AB - This paper presents a multi-modal system for finding out where to direct the attention of a social robot in a dialog scenario, which is robust against environmental sounds (door slamming, phone ringing etc.) and short speech segments. The method is based on combining voice activity detection (VAD) and sound source localization (SSL) and furthermore apply post-processing to SSL to filter out short sounds. The system is tested against a baseline system in four different real-world experiments, where different sounds are used as interfering sounds. The results are promising and show a clear improvement.
KW - Multi-modal tracking
KW - human-computer interaction
KW - sound-source localization
U2 - 10.1007/978-3-319-15557-9_3
DO - 10.1007/978-3-319-15557-9_3
M3 - Article in proceeding
SN - 978-3-319-15556-2
T3 - Lecture Notes in Computer Science
SP - 25
EP - 34
BT - Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction
PB - Springer Publishing Company
ER -

Thomsen NB, Tan Z-H, Lindberg B, Jensen SH. Improving Robustness against Environmental Sounds for Directing Attention of Social Robots. In Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction. Springer Publishing Company. 2015. p. 25-34. (Lecture Notes in Computer Science). https://doi.org/10.1007/978-3-319-15557-9_3