Sound source localization and speech enhancement with sparse Bayesian learning beamforming

Angeliki Xenaki; Jesper Bünsow Boldt; Mads Græsbøll Christensen

doi:10.1121/1.5042222

Research output: Contribution to journal › Journal article › Research › peer-review

56 Citations (Scopus)

392 Downloads (Pure)

Abstract

Speech localization and enhancement involves sound source mapping and reconstruction from noisy recordings of speech mixtures with microphone arrays. Conventional beamforming methods suffer from low resolution, especially with a limited number of microphones. In practice, there are only a few sources compared to the possible directions-of-arrival (DOA). Hence, DOA estimation is formulated as a sparse signal reconstruction problem and solved with sparse Bayesian learning (SBL). SBL uses a hierarchical two-level Bayesian inference to reconstruct sparse estimates from a small set of observations. The first level derives the posterior probability of the complex source amplitudes from the data likelihood and the prior. The second level tunes the prior towards sparse solutions with hyperparameters which maximize the evidence, i.e., the data probability. The adaptive learning of the hyperparameters from the data auto-regularizes the inference problem towards sparse robust estimates. Simulations and experimental data demonstrate that SBL beamforming provides high-resolution DOA maps outperforming traditional methods especially for correlated or non-stationary signals. Specifically for speech signals, the high-resolution SBL reconstruction offers not only speech enhancement but effectively speech separation.
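
The two-level inference described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a narrowband uniform linear array with half-wavelength spacing, a known noise variance, sources that lie on the angular grid, and a fixed-point hyperparameter update of the kind commonly used in the SBL literature. All function names (`steering_matrix`, `sbl_doa`, `top_peaks`) and all parameter values are hypothetical and chosen for illustration only.

```python
import numpy as np

def steering_matrix(n_mics, angles_deg, spacing_wavelengths=0.5):
    """Plane-wave steering vectors of a uniform linear array, one column per angle."""
    m = np.arange(n_mics)[:, None]
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))[None, :]
    return np.exp(-2j * np.pi * spacing_wavelengths * m * np.sin(theta))

def sbl_doa(Y, A, noise_var, n_iter=200):
    """Sketch of SBL for Y = A X + N with prior x_n ~ CN(0, gamma_n).

    Y: (n_mics, n_snapshots) data, A: (n_mics, n_grid) dictionary.
    Returns the hyperparameters gamma (source power per grid direction)
    and the posterior mean of the complex source amplitudes.
    """
    n_mics, n_snap = Y.shape
    gamma = np.ones(A.shape[1])
    for _ in range(n_iter):
        # Evidence (data) covariance implied by the current prior.
        Sigma_y = noise_var * np.eye(n_mics) + (A * gamma) @ A.conj().T
        Si_A = np.linalg.solve(Sigma_y, A)
        Si_Y = np.linalg.solve(Sigma_y, Y)
        # Second level: fixed-point update that increases the evidence.
        num = np.sum(np.abs(A.conj().T @ Si_Y) ** 2, axis=1) / n_snap
        den = np.real(np.sum(A.conj() * Si_A, axis=0))
        gamma = gamma * num / den
    # First level: posterior mean of the amplitudes given the learned prior.
    Sigma_y = noise_var * np.eye(n_mics) + (A * gamma) @ A.conj().T
    X_mean = gamma[:, None] * (A.conj().T @ np.linalg.solve(Sigma_y, Y))
    return gamma, X_mean

def top_peaks(spectrum, k):
    """Indices of the k largest local maxima of a 1-D spectrum, sorted by position."""
    s = np.concatenate(([-np.inf], spectrum, [-np.inf]))
    is_peak = (s[1:-1] > s[:-2]) & (s[1:-1] >= s[2:])
    idx = np.flatnonzero(is_peak)
    return np.sort(idx[np.argsort(spectrum[idx])[-k:]])

# Synthetic example (assumed values): 8 microphones, 2 sources, 50 snapshots.
rng = np.random.default_rng(0)
n_mics, n_snap, noise_var = 8, 50, 0.01
grid = np.arange(-90, 90)          # 1-degree grid of candidate DOAs
true_doas = [-20, 30]
S = (rng.standard_normal((2, n_snap)) + 1j * rng.standard_normal((2, n_snap))) / np.sqrt(2)
N = np.sqrt(noise_var / 2) * (rng.standard_normal((n_mics, n_snap))
                              + 1j * rng.standard_normal((n_mics, n_snap)))
Y = steering_matrix(n_mics, true_doas) @ S + N

gamma, X_mean = sbl_doa(Y, steering_matrix(n_mics, grid), noise_var)
est_doas = grid[top_peaks(gamma, 2)]
print("estimated DOAs (degrees):", est_doas)
```

In this sketch `gamma` plays the role of the DOA map, and the rows of `X_mean` at the peak directions are the reconstructed source signals, which is the sense in which a sparse reconstruction can separate sources. A practical implementation would also estimate the noise variance and process each frequency bin of a broadband speech signal.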

Original language	English
Journal	The Journal of the Acoustical Society of America
Volume	143
Issue number	6
Pages (from-to)	3912-3921
Number of pages	10
ISSN	0001-4966
DOIs	https://doi.org/10.1121/1.5042222
Publication status	Published - Jun 2018

Cite this

@article{a48e04a0c1154f68871fbfc033181441,

title = "Sound source localization and speech enhancement with sparse Bayesian learning beamforming",

abstract = "Speech localization and enhancement involves sound source mapping and reconstruction from noisy recordings of speech mixtures with microphone arrays. Conventional beamforming methods suffer from low resolution, especially with a limited number of microphones. In practice, there are only a few sources compared to the possible directions-of-arrival (DOA). Hence, DOA estimation is formulated as a sparse signal reconstruction problem and solved with sparse Bayesian learning (SBL). SBL uses a hierarchical two-level Bayesian inference to reconstruct sparse estimates from a small set of observations. The first level derives the posterior probability of the complex source amplitudes from the data likelihood and the prior. The second level tunes the prior towards sparse solutions with hyperparameters which maximize the evidence, i.e., the data probability. The adaptive learning of the hyperparameters from the data auto-regularizes the inference problem towards sparse robust estimates. Simulations and experimental data demonstrate that SBL beamforming provides high-resolution DOA maps outperforming traditional methods especially for correlated or non-stationary signals. Specifically for speech signals, the high-resolution SBL reconstruction offers not only speech enhancement but effectively speech separation.",

author = "Xenaki, Angeliki and Boldt, {Jesper B{\"u}nsow} and Christensen, {Mads Gr{\ae}sb{\o}ll}",

year = "2018",

month = jun,

doi = "10.1121/1.5042222",

language = "English",

volume = "143",

pages = "3912--3921",

journal = "The Journal of the Acoustical Society of America",

issn = "0001-4966",

publisher = "AIP Publishing LLC",

number = "6",

}

TY - JOUR

T1 - Sound source localization and speech enhancement with sparse Bayesian learning beamforming

AU - Xenaki, Angeliki

AU - Boldt, Jesper Bünsow

AU - Christensen, Mads Græsbøll

PY - 2018/6

Y1 - 2018/6

AB - Speech localization and enhancement involves sound source mapping and reconstruction from noisy recordings of speech mixtures with microphone arrays. Conventional beamforming methods suffer from low resolution, especially with a limited number of microphones. In practice, there are only a few sources compared to the possible directions-of-arrival (DOA). Hence, DOA estimation is formulated as a sparse signal reconstruction problem and solved with sparse Bayesian learning (SBL). SBL uses a hierarchical two-level Bayesian inference to reconstruct sparse estimates from a small set of observations. The first level derives the posterior probability of the complex source amplitudes from the data likelihood and the prior. The second level tunes the prior towards sparse solutions with hyperparameters which maximize the evidence, i.e., the data probability. The adaptive learning of the hyperparameters from the data auto-regularizes the inference problem towards sparse robust estimates. Simulations and experimental data demonstrate that SBL beamforming provides high-resolution DOA maps outperforming traditional methods especially for correlated or non-stationary signals. Specifically for speech signals, the high-resolution SBL reconstruction offers not only speech enhancement but effectively speech separation.

UR - http://www.scopus.com/inward/record.url?scp=85049388330&partnerID=8YFLogxK

U2 - 10.1121/1.5042222

DO - 10.1121/1.5042222

M3 - Journal article

SN - 0001-4966

VL - 143

SP - 3912

EP - 3921

JO - The Journal of the Acoustical Society of America

JF - The Journal of the Acoustical Society of America

IS - 6

ER -
