Autoregressive Parameter Estimation with DNN-based Pre-processing

Zihao Cui; Changchun Bao; Jesper Kjær Nielsen; Mads Græsbøll Christensen

doi:10.1109/ICASSP40776.2020.9053755

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

3 Citations (Scopus)

232 Downloads (Pure)

Abstract

In this paper, a method for estimating the autoregressive parameters from a signal segment is proposed. The method is based on a deep neural network (DNN) in combination with the classical Levinson-Durbin recursion (LDR). The DNN acts as a pre-processor for the LDR and can be trained on different metrics commonly encountered in speech processing using a generalized analysis-by-synthesis (GABS) structure where the LDR acts as the encoder. Unlike end-to-end data-driven approaches, this structure ensures that the DNN is easy to train and initialize since the DNN only has to learn a simple mapping. The results confirm this and show that the proposed method produces an AR-spectrum that efficiently represents the speech spectrum in terms of the Itakura-Saito divergence, Kullback-Leibler divergence, log-spectral distortion, and speech distortion.
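The classical Levinson-Durbin recursion named in the abstract can be sketched as follows. This is a minimal illustration in plain Python of the textbook recursion only, not the authors' implementation: the DNN pre-processor and the GABS training structure are not shown, and the function names `autocorrelation` and `levinson_durbin`, as well as the choice to feed the recursion a biased autocorrelation estimate, are assumptions made for this example.

```python
def autocorrelation(x, max_lag):
    """Biased autocorrelation estimate r[0..max_lag] of a signal segment x."""
    n = len(x)
    return [sum(x[t] * x[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations with the Levinson-Durbin recursion.

    r     : autocorrelation sequence with at least order + 1 values
    order : AR model order
    Returns (a, err), where a = [1, a_1, ..., a_order] holds the coefficients
    of the prediction-error filter A(z) = 1 + a_1 z^-1 + ... + a_order z^-order
    and err is the final prediction-error variance.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # Correlation between the current residual and the lag-i sample.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for stage i
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k  # error variance shrinks at each stage
    return a, err


# Illustrative input (assumed for this sketch): the autocorrelation of an
# AR(1) process with pole 0.5, i.e. r[k] = 0.5 ** k.
coeffs, variance = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

For this input the recursion recovers a first-order model: the second coefficient is zero because the lag-2 correlation is already explained by the lag-1 term. In practice `autocorrelation(segment, order)` would supply `r` for a measured signal segment.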

Original language	English
Title of host publication	Proceedings of the International Conference on Acoustics, Speech, and Signal Processing
Number of pages	5
Publisher	IEEE
Publication date	May 2020
Pages	6759-6763
Article number	9053755
ISBN (Print)	978-1-5090-6632-2
ISBN (Electronic)	978-1-5090-6631-5
DOIs	https://doi.org/10.1109/ICASSP40776.2020.9053755
Publication status	Published - May 2020
Event	ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) - Barcelona, Spain Duration: 4 May 2020 → 8 May 2020

Conference

Conference	ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Country/Territory	Spain
City	Barcelona
Period	04/05/2020 → 08/05/2020

Series	Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing
ISSN	1520-6149

Keywords

Auto-regressive model
DNN
Levinson-Durbin recursion
generalized analysis-by-synthesis

Access to Document

10.1109/ICASSP40776.2020.9053755

DNN_based_Auto_regressive_estimator: Accepted author manuscript, 243 KB. Licence: Unspecified

Cite this

@inproceedings{9f6296a949b44caf843b424a9bc0b281,

title = "Autoregressive Parameter Estimation with DNN-based Pre-processing",

abstract = "In this paper, a method for estimating the autoregressive parameters from a signal segment is proposed. The method is based on a deep neural network (DNN) in combination with the classical Levinson-Durbin recursion (LDR). The DNN acts as a pre-processor for the LDR and can be trained on different metrics commonly encountered in speech processing using a generalized analysis-by-synthesis (GABS) structure where the LDR acts as the encoder. Unlike end-to-end data-driven approaches, this structure ensures that the DNN is easy to train and initialize since the DNN only has to learn a simple mapping. The results confirm this and show that the proposed method produces an AR-spectrum that efficiently represents the speech spectrum in terms of the Itakura-Saito divergence, Kullback-Leibler divergence, log-spectral distortion, and speech distortion.",

keywords = "Auto-regressive model, DNN, Levinson-Durbin recursion, generalized analysis-by-synthesis",

author = "Zihao Cui and Changchun Bao and Nielsen, {Jesper Kj{\ae}r} and Christensen, {Mads Gr{\ae}sb{\o}ll}",

year = "2020",

month = may,

doi = "10.1109/ICASSP40776.2020.9053755",

language = "English",

isbn = "978-1-5090-6632-2",

series = "Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing",

publisher = "IEEE",

pages = "6759--6763",

booktitle = "Proceedings of the International Conference on Acoustics, Speech, and Signal Processing",

address = "United States",

note = "ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ; Conference date: 04-05-2020 Through 08-05-2020",

}

Cui, Z, Bao, C, Nielsen, JK & Christensen, MG 2020, Autoregressive Parameter Estimation with DNN-based Pre-processing. in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing., 9053755, IEEE, Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 6759-6763, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 04/05/2020. https://doi.org/10.1109/ICASSP40776.2020.9053755

Autoregressive Parameter Estimation with DNN-based Pre-processing. / Cui, Zihao; Bao, Changchun; Nielsen, Jesper Kjær et al.
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing. IEEE, 2020. p. 6759-6763 9053755 (Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing).

TY - GEN

T1 - Autoregressive Parameter Estimation with DNN-based Pre-processing

AU - Cui, Zihao

AU - Bao, Changchun

AU - Nielsen, Jesper Kjær

AU - Christensen, Mads Græsbøll

PY - 2020/5

Y1 - 2020/5

N2 - In this paper, a method for estimating the autoregressive parameters from a signal segment is proposed. The method is based on a deep neural network (DNN) in combination with the classical Levinson-Durbin recursion (LDR). The DNN acts as a pre-processor for the LDR and can be trained on different metrics commonly encountered in speech processing using a generalized analysis-by-synthesis (GABS) structure where the LDR acts as the encoder. Unlike end-to-end data-driven approaches, this structure ensures that the DNN is easy to train and initialize since the DNN only has to learn a simple mapping. The results confirm this and show that the proposed method produces an AR-spectrum that efficiently represents the speech spectrum in terms of the Itakura-Saito divergence, Kullback-Leibler divergence, log-spectral distortion, and speech distortion.

AB - In this paper, a method for estimating the autoregressive parameters from a signal segment is proposed. The method is based on a deep neural network (DNN) in combination with the classical Levinson-Durbin recursion (LDR). The DNN acts as a pre-processor for the LDR and can be trained on different metrics commonly encountered in speech processing using a generalized analysis-by-synthesis (GABS) structure where the LDR acts as the encoder. Unlike end-to-end data-driven approaches, this structure ensures that the DNN is easy to train and initialize since the DNN only has to learn a simple mapping. The results confirm this and show that the proposed method produces an AR-spectrum that efficiently represents the speech spectrum in terms of the Itakura-Saito divergence, Kullback-Leibler divergence, log-spectral distortion, and speech distortion.

KW - Auto-regressive model

KW - DNN

KW - Levinson-Durbin recursion

KW - generalized analysis-by-synthesis

UR - http://www.scopus.com/inward/record.url?scp=85089227463&partnerID=8YFLogxK

U2 - 10.1109/ICASSP40776.2020.9053755

DO - 10.1109/ICASSP40776.2020.9053755

M3 - Article in proceeding

SN - 978-1-5090-6632-2

T3 - Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing

SP - 6759

EP - 6763

BT - Proceedings of the International Conference on Acoustics, Speech, and Signal Processing

PB - IEEE

T2 - ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Y2 - 4 May 2020 through 8 May 2020

ER -
