A MAP Criterion for Detecting the Number of Speakers at frame level in Model-based Single-Channel Speech Separation

Research output: Contribution to journal, Conference article in Journal, Research, peer-review

7 Citations (Scopus)
194 Downloads (Pure)

Abstract

The problem of detecting the number of speakers for a particular segment occurs in many different
speech applications. In single channel speech separation, for example, this information is
often used to simplify the separation process, as the signal has to be treated differently depending
on the number of speakers. Inspired by the asymptotic maximum a posteriori rule proposed for
model selection, we pose the problem as a model selection problem. More specifically, we derive
a multiple hypotheses test for determining the number of speakers at a frame level in an observed
signal based on underlying parametric speaker models, trained a priori. The experimental results
indicate that the suggested method improves the quality of the separated signals in a single-channel
speech separation scenario at different signal-to-signal ratio levels.
Original language: English
Journal: Asilomar Conference on Signals, Systems and Computers. Conference Record
Pages (from-to): 538 - 541
ISSN: 1058-6393
DOIs: 10.1109/ACSSC.2010.5757617
Publication status: Published - 2010
Event: 44th Asilomar Conference on Signals, Systems and Computers - Pacific Grove, United States
Duration: 7 Nov 2010 - 10 Nov 2010

Conference

Conference: 44th Asilomar Conference on Signals, Systems and Computers
Country: United States
City: Pacific Grove
Period: 07/11/2010 - 10/11/2010

Cite this

@inproceedings{a3ad4a12ce8349fc940df6e8e7dd68a3,
title = "A MAP Criterion for Detecting the Number of Speakers at frame level in Model-based Single-Channel Speech Separation",
abstract = "The problem of detecting the number of speakers for a particular segment occurs in many different speech applications. In single channel speech separation, for example, this information is often used to simplify the separation process, as the signal has to be treated differently depending on the number of speakers. Inspired by the asymptotic maximum a posteriori rule proposed for model selection, we pose the problem as a model selection problem. More specifically, we derive a multiple hypotheses test for determining the number of speakers at a frame level in an observed signal based on underlying parametric speaker models, trained a priori. The experimental results indicate that the suggested method improves the quality of the separated signals in a single-channel speech separation scenario at different signal-to-signal ratio levels.",
author = "Pejman Mowlaee and Christensen, {Mads Gr{\ae}sb{\o}ll} and Zheng-Hua Tan and Jensen, {S{\o}ren Holdt}",
year = "2010",
doi = "10.1109/ACSSC.2010.5757617",
language = "English",
pages = "538--541",
booktitle = "Asilomar Conference on Signals, Systems and Computers. Conference Record",
issn = "1058-6393",
publisher = "IEEE Computer Society",

}

TY - GEN

T1 - A MAP Criterion for Detecting the Number of Speakers at frame level in Model-based Single-Channel Speech Separation

AU - Mowlaee, Pejman

AU - Christensen, Mads Græsbøll

AU - Tan, Zheng-Hua

AU - Jensen, Søren Holdt

PY - 2010

Y1 - 2010

N2 - The problem of detecting the number of speakers for a particular segment occurs in many different speech applications. In single channel speech separation, for example, this information is often used to simplify the separation process, as the signal has to be treated differently depending on the number of speakers. Inspired by the asymptotic maximum a posteriori rule proposed for model selection, we pose the problem as a model selection problem. More specifically, we derive a multiple hypotheses test for determining the number of speakers at a frame level in an observed signal based on underlying parametric speaker models, trained a priori. The experimental results indicate that the suggested method improves the quality of the separated signals in a single-channel speech separation scenario at different signal-to-signal ratio levels.

AB - The problem of detecting the number of speakers for a particular segment occurs in many different speech applications. In single channel speech separation, for example, this information is often used to simplify the separation process, as the signal has to be treated differently depending on the number of speakers. Inspired by the asymptotic maximum a posteriori rule proposed for model selection, we pose the problem as a model selection problem. More specifically, we derive a multiple hypotheses test for determining the number of speakers at a frame level in an observed signal based on underlying parametric speaker models, trained a priori. The experimental results indicate that the suggested method improves the quality of the separated signals in a single-channel speech separation scenario at different signal-to-signal ratio levels.

U2 - 10.1109/ACSSC.2010.5757617

DO - 10.1109/ACSSC.2010.5757617

M3 - Conference article in Journal

SP - 538

EP - 541

JO - Asilomar Conference on Signals, Systems and Computers. Conference Record

JF - Asilomar Conference on Signals, Systems and Computers. Conference Record

SN - 1058-6393

ER -