Improving Robustness against Environmental Sounds for Directing Attention of Social Robots

Nicolai Bæk Thomsen, Zheng-Hua Tan, Børge Lindberg, Søren Holdt Jensen

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

2 Citations (Scopus)

Abstract

This paper presents a multi-modal system for determining where to direct the attention of a social robot in a dialog scenario; the system is robust against environmental sounds (door slamming, phone ringing, etc.) and short speech segments. The method combines voice activity detection (VAD) and sound source localization (SSL), and furthermore applies post-processing to the SSL output to filter out short sounds. The system is tested against a baseline system in four different real-world experiments with different interfering sounds. The results are promising and show a clear improvement.
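As a rough illustration of the idea summarized above — gating SSL direction estimates with VAD and a minimum-duration filter so that short non-speech bursts do not redirect the robot — here is a minimal sketch. This is not the authors' implementation; the function name, the frame representation `(is_speech, doa_degrees)`, and the `min_speech_frames` threshold are all illustrative assumptions.

```python
# Hypothetical sketch (not the paper's implementation): only update the
# robot's attention direction when a speech run, as flagged by VAD, lasts
# long enough; short sounds never reach the duration threshold.

def attention_direction(frames, min_speech_frames=5):
    """frames: iterable of (is_speech, doa_degrees) per audio frame.

    Returns the direction (degrees) of the last speech run that met the
    minimum duration, or None if no run qualified."""
    direction = None  # current attention direction; None = nothing confirmed
    run = []          # DOA estimates accumulated over the ongoing speech run
    for is_speech, doa in frames:
        if is_speech:
            run.append(doa)
            if len(run) >= min_speech_frames:
                # average the DOA estimates once the run is long enough
                direction = sum(run) / len(run)
        else:
            run = []  # non-speech resets the run; short bursts are discarded
    return direction
```

With a threshold of 5 frames, a 3-frame burst (e.g. a door slam localized at 90°) is ignored, while a 6-frame speech segment at 10° updates the direction to 10°.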
Original language: English
Title of host publication: Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction
Number of pages: 10
Publisher: Springer Publishing Company
Publication date: 2015
Pages: 25-34
ISBN (Print): 978-3-319-15556-2
ISBN (Electronic): 978-3-319-15557-9
DOIs
Publication status: Published - 2015
Event: 2014 2nd Multimodal Analyses Enabling Artificial Agents in Human-Machine Interaction - Singapore, Singapore
Duration: 14 Sept 2014 → 14 Sept 2014

Conference

Conference: 2014 2nd Multimodal Analyses Enabling Artificial Agents in Human-Machine Interaction
Country/Territory: Singapore
City: Singapore
Period: 14/09/2014 → 14/09/2014
Series: Lecture Notes in Computer Science
ISSN: 0302-9743

Keywords

  • Multi-modal tracking
  • human-computer interaction
  • sound-source localization
