Abstract
This study focuses on auditory attention decoding (AAD) for speech and music. We propose an envelope-based deep source separation strategy for a single-microphone system, in which the envelope of the mixed audio signal is used to train a deep neural network (DNN) to separate the mixture into target and distractor envelopes. This reduces computational complexity and memory requirements, since the envelopes are downsampled to 64 Hz, whereas the original audio is sampled at 48 kHz. The separated envelopes, in combination with the EEG signals, are then used to train another DNN for AAD. Additionally, we convolve head-related transfer functions (HRTFs) with the original audio signals to mimic room reverberation, in order to test how effective source envelope separation followed by AAD is in a more real-world scenario. Our data set consists of 64-channel EEG recorded from subjects, covering 4 different scenarios: target speech/distractor music, target speech/distractor speech, target music/distractor speech, and target music/distractor music. We compare the models’ performance across 3-, 5-, 10-, and 20-second time windows. For DNN-based AAD on both variants of our data (original vs. HRTF), we achieve accuracies (ACC) of 54.1/51.5, 57.0/54.4, 66.5/63.1, and 80.1/77.5 (%) for the respective time windows, thereby demonstrating the models’ robustness to the effects of HRTFs.
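
To make the data reduction concrete, the following is a minimal sketch of envelope extraction with downsampling from 48 kHz to 64 Hz, a factor of 750. The abstract does not state which envelope method the authors use, so the rectify-and-average approach here is an assumption chosen for simplicity, and the function name `envelope_downsample` and the synthetic test tone are hypothetical.

```python
import math


def envelope_downsample(audio, fs_in=48_000, fs_out=64):
    """Rectify-and-average envelope (an assumed method, not the paper's).

    Produces one envelope sample per block of fs_in / fs_out input samples.
    """
    if fs_in % fs_out != 0:
        raise ValueError("fs_in must be an integer multiple of fs_out")
    block = fs_in // fs_out          # 750 samples per envelope value
    n_blocks = len(audio) // block   # any trailing partial block is dropped
    env = []
    for i in range(n_blocks):
        segment = audio[i * block:(i + 1) * block]
        # Full-wave rectification followed by a block mean
        env.append(sum(abs(x) for x in segment) / block)
    return env


# Synthetic stand-in for audio: a 1-second 440 Hz tone with a linear fade-in
fs = 48_000
tone = [(n / fs) * math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]
env = envelope_downsample(tone)
```

One second of audio shrinks from 48,000 samples to 64 envelope values, which illustrates where the savings in computation and memory come from. A Hilbert-transform envelope with a proper anti-aliasing filter would be a common alternative to the block mean used here.
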
Original language | English |
---|---|
Title of host publication | 2024 32nd European Signal Processing Conference (EUSIPCO) |
Number of pages | 5 |
Publisher | European Signal Processing Conference, EUSIPCO |
Publication date | 2024 |
Pages | 872-876 |
Article number | 10715250 |
ISBN (Electronic) | 9789464593617 |
Publication status | Published - 2024 |
Event | 32nd European Signal Processing Conference, EUSIPCO 2024 - Lyon, France; Duration: 26 Aug 2024 → 30 Aug 2024 |
Conference
Conference | 32nd European Signal Processing Conference, EUSIPCO 2024 |
---|---|
Country/Territory | France |
City | Lyon |
Period | 26/08/2024 → 30/08/2024 |
Series | European Signal Processing Conference |
---|---|
ISSN | 2219-5491 |
Bibliographical note
Publisher Copyright: © 2024 European Signal Processing Conference, EUSIPCO. All rights reserved.
Keywords
- auditory attention
- deep learning
- EEG
- head related transfer functions
- source separation
- stimulus reconstruction