Least 1-Norm Pole-Zero Modeling with Sparse Deconvolution for Speech Analysis

Publication: Contribution to book/anthology/report/conference proceeding; Conference article in proceeding; Research; peer-reviewed

3 Citations (Scopus)
83 Downloads (Pure)

Abstract

In this paper, we present a speech analysis method based on sparse pole-zero modeling of speech. Instead of using the all-pole model to approximate the speech production filter, a pole-zero model is used for the combined effect of the vocal tract, radiation at the lips, and the glottal pulse shape. Moreover, to account for the spiky excitation of the pulse train during voiced speech, the modeling parameters and sparse residuals are estimated iteratively using a least 1-norm pole-zero algorithm with sparse deconvolution. Compared with the conventional two-stage least squares pole-zero, linear prediction, and sparse linear prediction methods, experimental results show that the proposed speech analysis method has lower spectral distortion, higher reconstruction SNR, and sparser residuals.
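The contrast between a least squares and a least 1-norm criterion can be illustrated with a minimal sketch. It is not the authors' algorithm: it fits only an all-pole (linear prediction) model, it uses iteratively reweighted least squares (IRLS) as an assumed solver for the 1-norm problem, and the synthetic pulse-train signal, the function names, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def lp_matrices(x, order):
    """Set up x[n] ~ sum_k a[k] * x[n-k] as a regression y ~ X @ a."""
    n = len(x)
    X = np.column_stack([x[order - k:n - k] for k in range(1, order + 1)])
    return X, x[order:]


def least_squares_lp(x, order):
    """Conventional linear prediction: minimize the 2-norm of the residual."""
    X, y = lp_matrices(x, order)
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a


def least_1norm_lp(x, order, iters=50, eps=1e-8):
    """Approximate least 1-norm linear prediction with IRLS (assumed solver)."""
    X, y = lp_matrices(x, order)
    a = least_squares_lp(x, order)  # start from the least squares fit
    best_a, best_cost = a, np.sum(np.abs(y - X @ a))
    for _ in range(iters):
        r = y - X @ a
        # Row scaling s gives weights s**2 = 1/|r|, so the weighted squared
        # error sum(r_new**2 / |r_old|) approximates sum(|r_new|).
        s = 1.0 / np.sqrt(np.maximum(np.abs(r), eps))
        a, *_ = np.linalg.lstsq(X * s[:, None], y * s, rcond=None)
        cost = np.sum(np.abs(y - X @ a))
        if cost < best_cost:  # keep the iterate with the smallest 1-norm
            best_a, best_cost = a, cost
    return best_a


def synth_voiced(n=800, period=80, seed=0):
    """Hypothetical test signal: a noisy pulse train through an all-pole filter."""
    rng = np.random.default_rng(seed)
    e = 0.01 * rng.standard_normal(n)
    e[::period] += 1.0  # spiky excitation, one pulse per period
    a_true = np.array([1.3, -0.8])  # stable: pole magnitude is sqrt(0.8)
    x = np.zeros(n)
    for i in range(n):
        x[i] = e[i]
        for k, ak in enumerate(a_true, start=1):
            if i - k >= 0:
                x[i] += ak * x[i - k]
    return x, e, a_true


if __name__ == "__main__":
    x, e, a_true = synth_voiced()
    X, y = lp_matrices(x, 2)
    for name, a in [("least squares", least_squares_lp(x, 2)),
                    ("least 1-norm", least_1norm_lp(x, 2))]:
        print(name, a, "residual 1-norm:", np.sum(np.abs(y - X @ a)))
```

Running the script prints the two coefficient estimates and the 1-norm of each prediction residual. Because the 1-norm fit starts from the least squares solution and keeps the best iterate, its residual 1-norm is never larger than that of the least squares fit on the same data.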
Original language: English
Title of host publication: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017
Number of pages: 5
Publisher: IEEE
Publication date: 19 Jun 2017
Pages: 731-735
ISBN (Electronic): 978-1-5090-4117-6
DOI: https://doi.org/10.1109/ICASSP.2017.7952252
Status: Published - 19 Jun 2017
Event: The 42nd IEEE International Conference on Acoustics, Speech and Signal Processing: The Internet of Signals - New Orleans, USA
Duration: 5 Mar 2017 to 9 Mar 2017
http://www.ieee-icassp2017.org/

Conference

Conference: The 42nd IEEE International Conference on Acoustics, Speech and Signal Processing
Country: USA
City: New Orleans
Period: 05/03/2017 to 09/03/2017
Internet address: http://www.ieee-icassp2017.org/
Name: I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings
ISSN: 1520-6149

Fingerprint

Speech analysis
Deconvolution
Poles
Radiation

Cite this

Shi, L., Jensen, J. R., & Christensen, M. G. (2017). Least 1-Norm Pole-Zero Modeling with Sparse Deconvolution for Speech Analysis. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017 (pp. 731-735). IEEE. I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings https://doi.org/10.1109/ICASSP.2017.7952252
@inproceedings{538fb7a44f354528a0964715be16242e,
title = "Least 1-Norm Pole-Zero Modeling with Sparse Deconvolution for Speech Analysis",
abstract = "In this paper, we present a speech analysis method based on sparse pole-zero modeling of speech. Instead of using the all-pole model to approximate the speech production filter, a pole-zero model is used for the combined effect of the vocal tract, radiation at the lips, and the glottal pulse shape. Moreover, to account for the spiky excitation of the pulse train during voiced speech, the modeling parameters and sparse residuals are estimated iteratively using a least 1-norm pole-zero algorithm with sparse deconvolution. Compared with the conventional two-stage least squares pole-zero, linear prediction, and sparse linear prediction methods, experimental results show that the proposed speech analysis method has lower spectral distortion, higher reconstruction SNR, and sparser residuals.",
keywords = "Pole-zero model, least 1-norm cost function, sparse deconvolution, speech analysis",
author = "Liming Shi and Jensen, {Jesper Rindom} and Christensen, {Mads Gr{\ae}sb{\o}ll}",
year = "2017",
month = "6",
day = "19",
doi = "10.1109/ICASSP.2017.7952252",
language = "English",
pages = "731--735",
booktitle = "IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017",
publisher = "IEEE",
address = "United States",

}
