A MAP Criterion for Detecting the Number of Speakers at frame level in Model-based Single-Channel Speech Separation

Pejman Mowlaee; Mads Græsbøll Christensen; Zheng-Hua Tan; Søren Holdt Jensen

doi:10.1109/ACSSC.2010.5757617

Publication: Contribution to journal › Conference article in journal › Research › peer-reviewed

7 Citations (Scopus)

341 Downloads (Pure)

Abstract

The problem of detecting the number of speakers for a particular segment occurs in many different
speech applications. In single-channel speech separation, for example, this information is
often used to simplify the separation process, as the signal has to be treated differently depending
on the number of speakers. Inspired by the asymptotic maximum a posteriori rule proposed for
model selection, we pose the problem as a model selection problem. More specifically, we derive
a multiple hypotheses test for determining the number of speakers at a frame level in an observed
signal based on underlying parametric speaker models, trained a priori. The experimental results
indicate that the suggested method improves the quality of the separated signals in a single-channel
speech separation scenario at different signal-to-signal ratio levels.

Original language	English
Journal	Asilomar Conference on Signals, Systems and Computers. Conference Record
Pages (from-to)	538-541
ISSN	1058-6393
DOI	https://doi.org/10.1109/ACSSC.2010.5757617
Status	Published - 2010
Event	44th Asilomar Conference on Signals, Systems and Computers - Pacific Grove, USA. Duration: 7 Nov 2010 → 10 Nov 2010

Conference

Conference	44th Asilomar Conference on Signals, Systems and Computers
Country/Territory	USA
City	Pacific Grove
Period	07/11/2010 → 10/11/2010

Access to document

10.1109/ACSSC.2010.5757617

Asilomar2010a: Accepted manuscript, 232 KB

http://imi.aau.dk/~mgc/publications/asilomar2010a.pdf

Citation formats

@inproceedings{a3ad4a12ce8349fc940df6e8e7dd68a3,

title = "A MAP Criterion for Detecting the Number of Speakers at frame level in Model-based Single-Channel Speech Separation",

abstract = "The problem of detecting the number of speakers for a particular segment occurs in many different speech applications. In single-channel speech separation, for example, this information is often used to simplify the separation process, as the signal has to be treated differently depending on the number of speakers. Inspired by the asymptotic maximum a posteriori rule proposed for model selection, we pose the problem as a model selection problem. More specifically, we derive a multiple hypotheses test for determining the number of speakers at a frame level in an observed signal based on underlying parametric speaker models, trained a priori. The experimental results indicate that the suggested method improves the quality of the separated signals in a single-channel speech separation scenario at different signal-to-signal ratio levels.",

author = "Pejman Mowlaee and Christensen, {Mads Gr{\ae}sb{\o}ll} and Zheng-Hua Tan and Jensen, {S{\o}ren Holdt}",

year = "2010",

doi = "10.1109/ACSSC.2010.5757617",

language = "English",

pages = "538 -- 541",

journal = "Asilomar Conference on Signals, Systems and Computers. Conference Record",

issn = "1058-6393",

publisher = "IEEE Computer Society",

note = "44th Asilomar Conference on Signals, Systems and Computers ; Conference date: 07-11-2010 Through 10-11-2010",

}

A MAP Criterion for Detecting the Number of Speakers at frame level in Model-based Single-Channel Speech Separation. / Mowlaee, Pejman; Christensen, Mads Græsbøll; Tan, Zheng-Hua et al.
In: Asilomar Conference on Signals, Systems and Computers. Conference Record, 2010, pp. 538-541.

TY - GEN

T1 - A MAP Criterion for Detecting the Number of Speakers at frame level in Model-based Single-Channel Speech Separation

AU - Mowlaee, Pejman

AU - Christensen, Mads Græsbøll

AU - Tan, Zheng-Hua

AU - Jensen, Søren Holdt

PY - 2010

Y1 - 2010

N2 - The problem of detecting the number of speakers for a particular segment occurs in many different speech applications. In single-channel speech separation, for example, this information is often used to simplify the separation process, as the signal has to be treated differently depending on the number of speakers. Inspired by the asymptotic maximum a posteriori rule proposed for model selection, we pose the problem as a model selection problem. More specifically, we derive a multiple hypotheses test for determining the number of speakers at a frame level in an observed signal based on underlying parametric speaker models, trained a priori. The experimental results indicate that the suggested method improves the quality of the separated signals in a single-channel speech separation scenario at different signal-to-signal ratio levels.

AB - The problem of detecting the number of speakers for a particular segment occurs in many different speech applications. In single-channel speech separation, for example, this information is often used to simplify the separation process, as the signal has to be treated differently depending on the number of speakers. Inspired by the asymptotic maximum a posteriori rule proposed for model selection, we pose the problem as a model selection problem. More specifically, we derive a multiple hypotheses test for determining the number of speakers at a frame level in an observed signal based on underlying parametric speaker models, trained a priori. The experimental results indicate that the suggested method improves the quality of the separated signals in a single-channel speech separation scenario at different signal-to-signal ratio levels.

U2 - 10.1109/ACSSC.2010.5757617

DO - 10.1109/ACSSC.2010.5757617

M3 - Conference article in Journal

SN - 1058-6393

SP - 538

EP - 541

JO - Asilomar Conference on Signals, Systems and Computers. Conference Record

JF - Asilomar Conference on Signals, Systems and Computers. Conference Record

T2 - 44th Asilomar Conference on Signals, Systems and Computers

Y2 - 7 November 2010 through 10 November 2010

ER -
