Improving Monaural Speaker Identification by Double-Talk Detection

Rahim Saeidi; Pejman Mowlaee; Tomi Kinnunen; Zheng-Hua Tan; Mads Græsbøll Christensen; Søren Holdt Jensen; Pasi Fränti

Research output: Contribution to journal › Conference article in Journal › Research › peer-review

Abstract

This paper describes a novel approach to improve monaural
speaker identification where two speakers are present in a
single-microphone recording. The goal is to identify both of
the underlying speakers in the given mixture. The proposed
approach is composed of a double-talk detector (DTD) as a preprocessor
and a speaker identification back-end. We demonstrate
that including the double-talk detector improves the speaker
identification accuracy. Experiments on the GRID corpus show that
including the DTD improves average recognition accuracy from
96.53% to 97.43%.

Original language	English
Journal	Proceedings of the International Conference on Spoken Language Processing
Pages (from-to)	1069-1072
ISSN	1990-9772
Publication status	Published - 26 Sept 2010
Event	Interspeech 2010 - Makuhari, Japan Duration: 26 Sept 2010 → 30 Sept 2010

Conference

Conference	Interspeech 2010
Country/Territory	Japan
City	Makuhari
Period	26/09/2010 → 30/09/2010

Access to Document

Interspeech2010: Accepted author manuscript, 332 KB

http://cs.joensuu.fi/pages/saeidi/Interspeech2010_1.pdf

Cite this

@inproceedings{9d30e627a4d84026a58386e5c3046d47,

title = "Improving Monaural Speaker Identification by Double-Talk Detection",

abstract = "This paper describes a novel approach to improve monaural speaker identification where two speakers are present in a single-microphone recording. The goal is to identify both of the underlying speakers in the given mixture. The proposed approach is composed of a double-talk detector (DTD) as a preprocessor and a speaker identification back-end. We demonstrate that including the double-talk detector improves the speaker identification accuracy. Experiments on the GRID corpus show that including the DTD improves average recognition accuracy from 96.53% to 97.43%.",

author = "Rahim Saeidi and Pejman Mowlaee and Tomi Kinnunen and Zheng-Hua Tan and Christensen, {Mads Gr{\ae}sb{\o}ll} and Jensen, {S{\o}ren Holdt} and Pasi Fr{\"a}nti",

year = "2010",

month = sep,

day = "26",

language = "English",

pages = "1069--1072",

booktitle = "Proceedings of the International Conference on Spoken Language Processing",

issn = "1990-9772",

publisher = "International Speech Communication Association",

note = "Interspeech 2010 ; Conference date: 26-09-2010 Through 30-09-2010",

}

TY - GEN

T1 - Improving Monaural Speaker Identification by Double-Talk Detection

AU - Saeidi, Rahim

AU - Mowlaee, Pejman

AU - Kinnunen, Tomi

AU - Tan, Zheng-Hua

AU - Christensen, Mads Græsbøll

AU - Jensen, Søren Holdt

AU - Fränti, Pasi

PY - 2010/9/26

Y1 - 2010/9/26

N2 - This paper describes a novel approach to improve monaural speaker identification where two speakers are present in a single-microphone recording. The goal is to identify both of the underlying speakers in the given mixture. The proposed approach is composed of a double-talk detector (DTD) as a preprocessor and a speaker identification back-end. We demonstrate that including the double-talk detector improves the speaker identification accuracy. Experiments on the GRID corpus show that including the DTD improves average recognition accuracy from 96.53% to 97.43%.

AB - This paper describes a novel approach to improve monaural speaker identification where two speakers are present in a single-microphone recording. The goal is to identify both of the underlying speakers in the given mixture. The proposed approach is composed of a double-talk detector (DTD) as a preprocessor and a speaker identification back-end. We demonstrate that including the double-talk detector improves the speaker identification accuracy. Experiments on the GRID corpus show that including the DTD improves average recognition accuracy from 96.53% to 97.43%.

M3 - Conference article in Journal

SN - 1990-9772

SP - 1069

EP - 1072

JO - Proceedings of the International Conference on Spoken Language Processing

JF - Proceedings of the International Conference on Spoken Language Processing

T2 - Interspeech 2010

Y2 - 26 September 2010 through 30 September 2010

ER -
