TY - JOUR
T1 - Broadband DOA Estimation using Learning-based Optimal Statistics Estimates
AU - Zhang, Qinzheng
AU - Wang, Haiyan
AU - Jensen, Jesper Rindom
AU - Zhu, Yingying
AU - Tao, Shuai
AU - Christensen, Mads Græsbøll
PY - 2024/12/3
Y1 - 2024/12/3
N2 - For accurate Direction of Arrival (DOA) estimation in challenging high reverberation and low signal-to-noise ratio (SNR) scenarios, various deep learning (DL) techniques have been developed to incorporate and enhance existing algorithms. However, addressing the challenges of enhancing network manipulability, constructing efficient learning features, and integrating algorithms in a rational manner remains a set of significant hurdles. In this paper, we use DL to obtain the Speech Presence Probability (SPP) to construct the optimal statistics estimates, which are then combined with traditional algorithms to achieve accurate DOA estimation. Specifically, we explore the application of the a posteriori SPP in DOA estimation, design a reverberation separation model for practical scenarios, and derive and validate a new computational equation for SPP under this model. Besides, we propose a frequency bin-wise network structure to improve network fitting efficiency and construct input features accordingly. Moreover, by adopting a combined structure, we avoid full-angle network feature training and instead train on partial angles under deliberate subset classification. We then evaluate the DOA estimation performance for the entire direction range with fine resolution using this approach. Simulation results demonstrate that the proposed method requires smaller data sets compared to end-to-end deep learning algorithms. Furthermore, the results validate that the proposed method outperforms both DL-based end-to-end approaches and traditional full-band approaches in terms of accuracy and error rate across various reverberation and signal-to-noise ratio conditions.
AB - For accurate Direction of Arrival (DOA) estimation in challenging high reverberation and low signal-to-noise ratio (SNR) scenarios, various deep learning (DL) techniques have been developed to incorporate and enhance existing algorithms. However, addressing the challenges of enhancing network manipulability, constructing efficient learning features, and integrating algorithms in a rational manner remains a set of significant hurdles. In this paper, we use DL to obtain the Speech Presence Probability (SPP) to construct the optimal statistics estimates, which are then combined with traditional algorithms to achieve accurate DOA estimation. Specifically, we explore the application of the a posteriori SPP in DOA estimation, design a reverberation separation model for practical scenarios, and derive and validate a new computational equation for SPP under this model. Besides, we propose a frequency bin-wise network structure to improve network fitting efficiency and construct input features accordingly. Moreover, by adopting a combined structure, we avoid full-angle network feature training and instead train on partial angles under deliberate subset classification. We then evaluate the DOA estimation performance for the entire direction range with fine resolution using this approach. Simulation results demonstrate that the proposed method requires smaller data sets compared to end-to-end deep learning algorithms. Furthermore, the results validate that the proposed method outperforms both DL-based end-to-end approaches and traditional full-band approaches in terms of accuracy and error rate across various reverberation and signal-to-noise ratio conditions.
KW - Accuracy
KW - Covariance matrices
KW - Deep learning
KW - Direction-of-arrival estimation
KW - Estimation
KW - Feature extraction
KW - Frequency estimation
KW - Reverberation
KW - Sensors
KW - Training
KW - a posteriori SPP
KW - broadband DOA estimation
KW - deep learning
KW - out-of-class task
KW - statistically optimal estimation
UR - http://www.scopus.com/inward/record.url?scp=85211618058&partnerID=8YFLogxK
U2 - 10.1109/JSEN.2024.3505677
DO - 10.1109/JSEN.2024.3505677
M3 - Journal article
SN - 2379-9153
JO - IEEE Sensors Journal
JF - IEEE Sensors Journal
M1 - 10776021
ER -