Sparsity in Linear Predictive Coding of Speech

Daniele Giacobello

Research output: PhD thesis

Abstract

This thesis develops improved techniques for speech coding based on recent developments in sparse signal representation. In particular, this work is motivated by the need to address some of the limitations of the well-known linear prediction (LP) model currently applied in many modern speech coders.
In the first part of the thesis, we provide an overview of Sparse Linear Prediction, a set of speech processing tools created by introducing sparsity constraints into the LP framework. This approach defines predictors that look for a sparse residual rather than a minimum-variance one. This has direct applications to coding and is also consistent with the production model of voiced speech, where the excitation of the all-pole filter can be modeled as an impulse train, i.e., a sparse sequence. Introducing sparsity into the LP framework also leads to the concept of high-order sparse predictors. These predictors, by efficiently modeling the spectral envelope and the harmonic components with very few coefficients, have direct applications in speech processing, enabling a joint estimation of short-term and long-term predictors. We also give preliminary results on the effectiveness of their application in audio processing.
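
To make the contrast between a minimum-variance residual and a sparse residual concrete, the following is a minimal sketch, not the thesis's own algorithm. It assumes that sparsity is promoted by minimizing the 1-norm of the prediction residual (a common convex surrogate; the abstract does not state which criterion is used) and solves that problem with iteratively reweighted least squares. The function names, the synthetic test signal, and all parameter values are hypothetical.

```python
import numpy as np

def lp_matrices(x, order):
    """Return the target vector x[order:] and the matrix of past samples."""
    n = len(x)
    # Column k-1 holds x[i-k] for every predicted sample i = order..n-1.
    X = np.column_stack([x[order - k : n - k] for k in range(1, order + 1)])
    return x[order:], X

def sparse_lp(x, order, n_iter=50, eps=1e-6):
    """1-norm linear prediction via IRLS (an assumed solver choice).

    Returns (a_l1, a_ls): the reweighted estimate and the ordinary
    least-squares (minimum-variance) estimate used as the starting point.
    """
    target, X = lp_matrices(x, order)
    a_ls, *_ = np.linalg.lstsq(X, target, rcond=None)
    a = a_ls.copy()
    for _ in range(n_iter):
        r = target - X @ a
        # Square root of the IRLS weights 1 / max(|r|, eps).
        w = 1.0 / np.sqrt(np.maximum(np.abs(r), eps))
        a, *_ = np.linalg.lstsq(X * w[:, None], target * w, rcond=None)
    return a, a_ls

def synth_voiced(n=400, period=40, a_true=(1.058, -0.81)):
    """Hypothetical test signal: an impulse train through an all-pole filter."""
    e = np.zeros(n)
    e[::period] = 1.0
    x = np.zeros(n)
    for i in range(n):
        past = sum(a_true[k] * x[i - 1 - k]
                   for k in range(len(a_true)) if i - 1 - k >= 0)
        x[i] = e[i] + past
    return x, e

if __name__ == "__main__":
    x, _ = synth_voiced()
    a_l1, a_ls = sparse_lp(x, order=2)
    target, X = lp_matrices(x, 2)
    print("1-norm of residual, least squares:", np.abs(target - X @ a_ls).sum())
    print("1-norm of residual, reweighted   :", np.abs(target - X @ a_l1).sum())
```

The least-squares solution serves as the starting point, so each reweighting pass can only move the estimate toward a residual with a smaller (smoothed) 1-norm. A dedicated convex solver would be a reasonable alternative to the reweighting loop.
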
The second part of the thesis deals with introducing sparsity directly in the linear prediction analysis-by-synthesis (LPAS) speech coding paradigm. We first propose a novel near-optimal method to find a sparse approximate excitation using a compressed sensing formulation. Furthermore, we define a novel re-estimation procedure to adapt the predictor coefficients to the given sparse excitation, balancing the two representations in the context of speech coding. Finally, the advantages of the compact parametric representation of a segment of speech, given by the sparse linear predictors and the use of the re-estimation procedure, are analyzed in the context of frame-independent coding for speech communications over packet networks.
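
The idea of searching for a sparse excitation in an analysis-by-synthesis setting can be illustrated with a small sketch. It does not reproduce the compressed sensing formulation proposed in the thesis; as a plainly named stand-in, it uses greedy orthogonal matching pursuit to pick a few pulses whose filtered version approximates the speech segment. The function names, the predictor convention (x[n] = sum_k a[k] x[n-k] + e[n]), and the number of pulses are assumptions made for illustration.

```python
import numpy as np

def synthesis_matrix(a, n):
    """Lower-triangular Toeplitz matrix of the impulse response of 1/A(z)."""
    h = np.zeros(n)
    for i in range(n):
        past = sum(a[k] * h[i - 1 - k] for k in range(len(a)) if i - 1 - k >= 0)
        h[i] = (1.0 if i == 0 else 0.0) + past
    H = np.zeros((n, n))
    for j in range(n):
        H[j:, j] = h[: n - j]  # column j is the response to a pulse at sample j
    return H

def omp_excitation(x, H, n_pulses):
    """Greedy search for an excitation r with at most n_pulses nonzeros, H @ r ~ x."""
    residual = np.asarray(x, dtype=float).copy()
    support = []
    gains = np.zeros(0)
    norms = np.linalg.norm(H, axis=0)
    for _ in range(n_pulses):
        corr = np.abs(H.T @ residual) / norms
        if support:
            corr[support] = 0.0  # do not pick the same position twice
        support.append(int(np.argmax(corr)))
        # Jointly re-fit all pulse gains on the current support.
        gains, *_ = np.linalg.lstsq(H[:, support], x, rcond=None)
        residual = x - H[:, support] @ gains
    r = np.zeros(H.shape[1])
    r[support] = gains
    return r

if __name__ == "__main__":
    a = [1.058, -0.81]              # hypothetical predictor coefficients
    H = synthesis_matrix(a, 80)
    r_true = np.zeros(80)
    r_true[[5, 45]] = [1.0, 0.8]    # hypothetical two-pulse excitation
    x = H @ r_true
    r_hat = omp_excitation(x, H, n_pulses=4)
    print("selected positions:", np.flatnonzero(r_hat))
    print("reconstruction error:", np.linalg.norm(x - H @ r_hat))
```

Because the gains are re-fit by least squares after each new pulse, the reconstruction error cannot increase from one iteration to the next. The greedy search is not guaranteed to find the best pulse positions.
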
Original language: English
Place of Publication: Aalborg
Publisher
Print ISBNs: 978-87-9232837-3
Publication status: Published - 17 Sept 2010
