Abstract
Speech enhancement and separation algorithms sometimes employ a two-stage processing scheme in which the signal is first mapped to an intermediate, low-dimensional parametric description, after which the parameters are mapped, via a vector quantizer, to vectors in codebooks trained on, for example, individual noise-free sources. To obtain accurate parameters, one must employ a good estimator, such as a maximum likelihood estimator, in finding the parameters of the intermediate representation. This leaves some questions unanswered, however, such as which metrics to use in the subsequent vector quantization process and how to derive them systematically. This paper aims at answering these questions. Suitable metrics are presented and derived, and their use is exemplified on a number of different signal models by deriving closed-form expressions. In essence, the metrics account, during vector quantization, for the fact that some parameters may have been estimated more accurately than others and that there may be dependencies between the estimation errors.
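The idea sketched in the abstract, weighting deviations between the estimated parameters and the codebook vectors by how accurately each parameter was estimated, amounts to a Mahalanobis-type distance in the codebook search. A minimal sketch in Python, assuming the estimation-error covariance `err_cov` is available (e.g. from the Cramér-Rao bound); the function name `quantize_weighted` and the toy data are illustrative and not taken from the paper:

```python
import numpy as np

def quantize_weighted(theta_hat, codebook, err_cov):
    """Select the codebook row closest to the estimate theta_hat under a
    Mahalanobis-type metric weighted by the inverse estimation-error
    covariance (illustrative; not the paper's exact derivation)."""
    W = np.linalg.inv(err_cov)      # accurate parameters get larger weights
    diffs = codebook - theta_hat    # broadcast over codebook rows
    dists = np.einsum("ij,jk,ik->i", diffs, W, diffs)
    return int(np.argmin(dists))

# Toy example: two codebook entries, estimate at the origin.
codebook = np.array([[1.0, 0.0],
                     [0.0, 2.0]])
theta_hat = np.zeros(2)

# A plain Euclidean metric (identity covariance) prefers the first entry ...
idx_plain = quantize_weighted(theta_hat, codebook, np.eye(2))
# ... but if the second parameter is estimated far less accurately, its
# deviation is down-weighted and the second entry is chosen instead.
idx_weighted = quantize_weighted(theta_hat, codebook, np.diag([1.0, 100.0]))
```

The toy example shows the effect the abstract describes: the same estimate maps to different codebook vectors depending on the reliability of each estimated parameter.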
| Original language | English |
|---|---|
| Journal | The Journal of the Acoustical Society of America |
| Volume | 133 |
| Issue number | 5 |
| Pages (from-to) | 3062-3071 |
| Number of pages | 10 |
| ISSN | 0001-4966 |
| DOI | |
| Status | Published - 2013 |