A Vision-Assisted Hearing Aid System Based on Deep Learning

Daniel Michelsanti*, Zheng Hua Tan, Sergi Rotger-Griful, Jesper Jensen

*Corresponding author for this work

Research output: Contribution to book/anthology/report/conference proceeding > Article in proceeding > Research > peer-review

Abstract

Audio-visual speech enhancement (SE) is the task of reducing the acoustic background noise in a degraded speech signal using both acoustic and visual information. In this work, we study how to incorporate visual information to enhance a speech signal using acoustic beamformers in hearing aids (HAs). Specifically, we first train a deep learning model to estimate a time-frequency mask from audio-visual data. Then, we apply this mask to estimate the inter-microphone power spectral densities (PSDs) of the clean and noise signals. Finally, we use the estimated PSDs to build acoustic beamformers. Assuming that an HA user wears an add-on device comprising a camera pointing at the target speaker, we show that our method can be beneficial for HA systems, especially at low signal-to-noise ratios (SNRs).
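The three steps in the abstract (mask, inter-microphone PSDs, beamformer) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say which beamformer is built, so an MVDR beamformer in its reference-microphone form is assumed here as one common choice. The function names `masked_cpsd`, `mvdr_weights`, and `apply_beamformer` are hypothetical, the noise mask is assumed to be one minus the speech mask, and a random mask and random multi-microphone spectrogram stand in for the output of the audio-visual model and for real HA recordings.

```python
import numpy as np


def masked_cpsd(stft, mask):
    """Mask-weighted inter-microphone (cross-)PSD matrices.

    stft: complex array, shape (mics, freqs, frames)
    mask: real array in [0, 1], shape (freqs, frames)
    returns: complex array, shape (freqs, mics, mics)
    """
    weighted = stft * mask[None, :, :]
    # Per frequency: sum over frames of mask * x x^H.
    cpsd = np.einsum("mft,nft->fmn", weighted, stft.conj())
    norm = np.maximum(mask.sum(axis=1), 1e-8)
    return cpsd / norm[:, None, None]


def mvdr_weights(phi_s, phi_n, ref_mic=0, diag_load=1e-6):
    """MVDR weights (assumed beamformer type), shape (freqs, mics).

    Uses w = (Phi_n^{-1} Phi_s) u / trace(Phi_n^{-1} Phi_s),
    where u selects the reference microphone.
    """
    n_mics = phi_s.shape[-1]
    # Diagonal loading keeps the noise matrix well conditioned.
    load = diag_load * np.trace(phi_n, axis1=-2, axis2=-1).real / n_mics
    phi_n = phi_n + load[:, None, None] * np.eye(n_mics)
    ratio = np.linalg.solve(phi_n, phi_s)  # Phi_n^{-1} Phi_s per frequency
    trace = np.trace(ratio, axis1=-2, axis2=-1)
    return ratio[..., ref_mic] / (trace[:, None] + 1e-12)


def apply_beamformer(weights, stft):
    """Compute w^H x for every time-frequency bin; returns (freqs, frames)."""
    return np.einsum("fm,mft->ft", weights.conj(), stft)


# Synthetic stand-ins: random spectrogram and random mask (assumptions).
rng = np.random.default_rng(0)
n_mics, n_freqs, n_frames = 4, 129, 200
shape = (n_mics, n_freqs, n_frames)
noisy = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
mask = rng.uniform(size=(n_freqs, n_frames))  # stand-in for the model output

phi_s = masked_cpsd(noisy, mask)        # estimate for the clean signal
phi_n = masked_cpsd(noisy, 1.0 - mask)  # estimate for the noise signal
weights = mvdr_weights(phi_s, phi_n)
enhanced = apply_beamformer(weights, noisy)
```

In this sketch the mask never touches the output signal directly. It only weights the statistics from which the spatial filter is computed, which is the role the abstract assigns to it.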

Original language: English
Title of host publication: ICASSPW 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing Workshops, Proceedings
Publisher: IEEE (Institute of Electrical and Electronics Engineers)
Publication date: 2023
Article number: 10193370
ISBN (Electronic): 9798350302615
Publication status: Published - 2023
Event: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing Workshops, ICASSPW 2023 - Rhodes Island, Greece
Duration: 4 Jun 2023 to 10 Jun 2023

Conference

Conference: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing Workshops, ICASSPW 2023
Country/Territory: Greece
City: Rhodes Island
Period: 04/06/2023 to 10/06/2023
Sponsor: IEEE, IEEE Signal Processing Society

Bibliographical note

Publisher Copyright:
© 2023 IEEE.

Keywords

  • Audio-visual
  • beamforming
  • deep learning
  • hearing aids
