Envelope Based Deep Source Separation and EEG Auditory Attention Decoding for Speech and Music

M. Asjid Tanveer*, Jesper Jensen*, Zheng Hua Tan*, Jan Østergaard*

*Corresponding author for this work

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

2 Citations (Scopus)

Abstract

This study focuses on auditory attention decoding (AAD) for speech and music. We propose an envelope-based deep source separation strategy for a single-microphone system, in which the envelope of the mixed audio signal is used to train a deep neural network (DNN) to separate the mixture into target and distractor envelopes. This reduces computational complexity and memory requirements, since the envelopes are downsampled to 64 Hz, whereas the original audio is sampled at 48 kHz. The separated envelopes, in combination with the EEG signals, are then used to train another DNN for AAD. Additionally, we convolve head-related transfer functions (HRTFs) with the original audio signals to mimic room reverberation, in order to test how effective source envelope separation followed by AAD is in a more realistic scenario. Our data set consists of 64-channel EEG recorded from subjects in 4 different scenarios: target speech/distractor music, target speech/distractor speech, target music/distractor speech, and target music/distractor music. We compare the models’ performances across 3, 5, 10, and 20-second time windows. For DNN AAD across both variants of our data (original vs. HRTF), we achieve an accuracy (ACC) of 54.1/51.5, 57.0/54.4, 66.5/63.1, and 80.1/77.5 (%) for the respective time windows, and thereby demonstrate the models’ robustness to the effects of HRTFs.
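
The abstract does not say how the envelopes are computed or how the HRTFs are applied. The following is a minimal Python sketch of both preprocessing steps, not the authors' implementation. It assumes a broadband Hilbert envelope and time-domain impulse responses; the function names `extract_envelope` and `apply_hrir` and the impulse-response arguments are hypothetical. Only the 48 kHz and 64 Hz rates come from the abstract.

```python
import numpy as np
from scipy.signal import fftconvolve, hilbert, resample_poly

FS_AUDIO = 48_000  # audio sampling rate stated in the abstract (Hz)
FS_ENV = 64        # envelope sampling rate stated in the abstract (Hz)


def extract_envelope(audio, fs_in=FS_AUDIO, fs_out=FS_ENV):
    """Return the envelope of a mono signal, downsampled to fs_out.

    Assumption: the envelope is the magnitude of the analytic signal
    (Hilbert transform); the paper may use a different definition.
    """
    envelope = np.abs(hilbert(audio))
    # resample_poly low-pass filters before decimating, which avoids aliasing.
    return resample_poly(envelope, up=fs_out, down=fs_in)


def apply_hrir(audio, hrir_left, hrir_right):
    """Convolve a mono signal with a left/right impulse-response pair.

    Assumption: the HRTFs are available as time-domain impulse responses
    (hypothetical inputs here). Returns an array of shape (2, n_samples).
    """
    left = fftconvolve(audio, hrir_left, mode="full")
    right = fftconvolve(audio, hrir_right, mode="full")
    return np.stack([left, right])


if __name__ == "__main__":
    # Synthetic 5-second test tone: 440 Hz carrier with a 2 Hz amplitude modulation.
    t = np.arange(5 * FS_AUDIO) / FS_AUDIO
    audio = (1.0 + 0.5 * np.sin(2 * np.pi * 2 * t)) * np.sin(2 * np.pi * 440 * t)
    env = extract_envelope(audio)
    print(audio.size, "audio samples ->", env.size, "envelope samples")
```

Under these assumptions, a 5-second window shrinks from 240,000 audio samples to 320 envelope samples, a factor of 48,000 / 64 = 750. That reduction is the source of the computational and memory savings the abstract describes.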

Original language: English
Title of host publication: 2024 32nd European Signal Processing Conference (EUSIPCO)
Number of pages: 5
Publisher: European Signal Processing Conference, EUSIPCO
Publication date: 2024
Pages: 872-876
Article number: 10715250
ISBN (Electronic): 9789464593617
DOIs
Publication status: Published - 2024
Event: 32nd European Signal Processing Conference, EUSIPCO 2024 - Lyon, France
Duration: 26 Aug 2024 → 30 Aug 2024

Conference

Conference: 32nd European Signal Processing Conference, EUSIPCO 2024
Country/Territory: France
City: Lyon
Period: 26/08/2024 → 30/08/2024
Series: European Signal Processing Conference
ISSN: 2219-5491

Bibliographical note

Publisher Copyright:
© 2024 European Signal Processing Conference, EUSIPCO. All rights reserved.

Keywords

  • auditory attention
  • deep learning
  • EEG
  • head related transfer functions
  • source separation
  • stimulus reconstruction
