TY - GEN
T1 - Autoregressive Parameter Estimation with DNN-based Pre-processing
AU - Cui, Zihao
AU - Bao, Changchun
AU - Nielsen, Jesper Kjær
AU - Christensen, Mads Græsbøll
PY - 2020/5
Y1 - 2020/5
AB - In this paper, a method for estimating the autoregressive parameters from a signal segment is proposed. The method is based on a deep neural network (DNN) in combination with the classical Levinson-Durbin recursion (LDR). The DNN acts as a pre-processor for the LDR and can be trained on different metrics commonly encountered in speech processing using a generalized analysis-by-synthesis (GABS) structure where the LDR acts as the encoder. Unlike end-to-end data-driven approaches, this structure ensures that the DNN is easy to train and initialize since the DNN only has to learn a simple mapping. The results confirm this and show that the proposed method produces an AR-spectrum that efficiently represents the speech spectrum in terms of the Itakura-Saito divergence, Kullback-Leibler divergence, log-spectral distortion, and speech distortion.
KW - Auto-regressive model
KW - DNN
KW - Levinson-Durbin recursion
KW - generalized analysis-by-synthesis
UR - http://www.scopus.com/inward/record.url?scp=85089227463&partnerID=8YFLogxK
U2 - 10.1109/ICASSP40776.2020.9053755
DO - 10.1109/ICASSP40776.2020.9053755
M3 - Article in proceeding
SN - 978-1-5090-6632-2
T3 - Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing
SP - 6759
EP - 6763
BT - Proceedings of the International Conference on Acoustics, Speech, and Signal Processing
PB - IEEE
T2 - ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Y2 - 4 May 2020 through 8 May 2020
ER -