A Novel Approach to Speaker Weight Estimation Using a Fusion of the i-vector and NFA Frameworks

Amir Hossein Poorjam; Mohamad Hasan Bahari; Hugo Van hamme

doi:10.22067/ess.v3i1.52091

Publication: Contribution to journal › Journal article › Research › peer-review

Abstract

This paper proposes a novel approach to automatic speaker weight estimation from spontaneous telephone speech signals. In this method, each utterance is modeled using two frameworks: the i-vector framework, which is based on factor analysis of Gaussian Mixture Model (GMM) mean supervectors, and the Non-negative Factor Analysis (NFA) framework, which is based on a constrained factor analysis of GMM weight supervectors. The information available in both the Gaussian means and the Gaussian weights is then exploited through a feature-level fusion of the i-vectors and the NFA vectors. Finally, least-squares support vector regression is employed to estimate the weight of each speaker from the given utterances.
The proposed approach is evaluated on spontaneous telephone speech signals from the National Institute of Standards and Technology (NIST) 2008 and 2010 Speaker Recognition Evaluation corpora. To investigate its effectiveness, the method is compared with i-vector-based speaker weight estimation and with an alternative fusion scheme, namely score-level fusion. Experimental results over 2339 utterances show that the correlation coefficients between the actual and the estimated weights of female and male speakers are 0.49 and 0.56, respectively, which indicates the effectiveness of the proposed method for speaker weight estimation.
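
The pipeline outlined in the abstract (feature-level fusion, least-squares support vector regression, correlation-based evaluation) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' setup: the vector dimensions, the RBF kernel, the hyperparameters gamma and sigma, and the function names are hypothetical, and the data are synthetic random placeholders, not i-vectors or NFA vectors extracted from the NIST corpora. Feature-level fusion is shown as plain concatenation, and the regressor uses the standard LS-SVR dual linear system.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def fuse_features(ivectors, nfa_vectors):
    # Feature-level fusion sketched as concatenation of the two vectors
    # that describe the same utterance.
    return np.hstack([ivectors, nfa_vectors])

def lssvr_fit(X, y, gamma=10.0, sigma=4.0):
    # LS-SVR dual problem: solve
    #   [ 0   1^T         ] [ b     ]   [ 0 ]
    #   [ 1   K + I/gamma ] [ alpha ] = [ y ]
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    solution = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return solution[0], solution[1:]  # bias b, dual weights alpha

def lssvr_predict(X_train, b, alpha, X_new, sigma=4.0):
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

def pearson(a, b):
    # Pearson correlation coefficient between two 1-D arrays.
    return float(np.corrcoef(a, b)[0, 1])

# ASSUMPTION: synthetic stand-ins for per-utterance vectors and weights (kg).
rng = np.random.default_rng(0)
n_utterances = 200
ivectors = rng.normal(size=(n_utterances, 8))     # hypothetical dimension
nfa_vectors = rng.normal(size=(n_utterances, 4))  # hypothetical dimension
weights_kg = (70.0 + 5.0 * ivectors[:, 0] + 3.0 * nfa_vectors[:, 0]
              + rng.normal(size=n_utterances))

fused = fuse_features(ivectors, nfa_vectors)
train, test = slice(0, 150), slice(150, 200)

b, alpha = lssvr_fit(fused[train], weights_kg[train])
predicted = lssvr_predict(fused[train], b, alpha, fused[test])
r = pearson(weights_kg[test], predicted)
print(f"correlation between actual and estimated weights: {r:.2f}")
```

The printed correlation describes only the synthetic data above and says nothing about the results reported in the abstract. Score-level fusion, the alternative scheme mentioned there, would instead train one regressor per vector type and combine their outputs.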

Original language	English
Journal	Journal of Electrical Systems and Signals
Volume	3
Issue number	1
Pages (from-to)	47-55
Number of pages	8
ISSN	2322-5483
DOI	https://doi.org/10.22067/ess.v3i1.52091
Status	Published - Feb. 2017

Access to the document

10.22067/ess.v3i1.52091

A_Novel_Approach_to_Speaker_Weight_Estim (accepted manuscript, 355 KB)

Citation formats

@article{c822f36c93b44ff19d1bdeffa67dc4c9,

title = "A Novel Approach to Speaker Weight Estimation Using a Fusion of the i-vector and NFA Frameworks",

abstract = "This paper proposes a novel approach for automatic speaker weight estimation from spontaneous telephone speech signals. In this method, each utterance is modeled using the i-vector framework which is based on the factor analysis on Gaussian Mixture Model (GMM) mean supervectors, and the Non-negative Factor Analysis (NFA) framework which is based on a constrained factor analysis on GMM weight supervectors. Then, the available information in both Gaussian means and Gaussian weights is exploited through a feature-level fusion of the i-vectors and the NFA vectors. Finally, a least-squares support vector regression is employed to estimate the weight of speakers from the given utterances. The proposed approach is evaluated on spontaneous telephone speech signals of National Institute of Standards and Technology 2008 and 2010 Speaker Recognition Evaluation corpora. To investigate the effectiveness of the proposed approach, this method is compared to the i-vector-based speaker weight estimation and an alternative fusion scheme, namely the score-level fusion. Experimental results over 2339 utterances show that the correlation coefficients between the actual and the estimated weights of female and male speakers are 0.49 and 0.56, respectively, which indicate the effectiveness of the proposed method in speaker weight estimation.",

keywords = "i-vector, Non-negative Factor Analysis, Speaker Body Weight Estimation, Least-squares Support Vector Regression, MFCC",

author = "Poorjam, {Amir Hossein} and Bahari, {Mohamad Hasan} and {Van hamme}, Hugo",

year = "2017",

month = feb,

doi = "10.22067/ess.v3i1.52091",

language = "English",

volume = "3",

pages = "47--55",

journal = "Journal of Electrical Systems and Signals",

issn = "2322-5483",

number = "1",

}

TY - JOUR

T1 - A Novel Approach to Speaker Weight Estimation Using a Fusion of the i-vector and NFA Frameworks

AU - Poorjam, Amir Hossein

AU - Bahari, Mohamad Hasan

AU - Van hamme, Hugo

PY - 2017/2

Y1 - 2017/2

AB - This paper proposes a novel approach for automatic speaker weight estimation from spontaneous telephone speech signals. In this method, each utterance is modeled using the i-vector framework which is based on the factor analysis on Gaussian Mixture Model (GMM) mean supervectors, and the Non-negative Factor Analysis (NFA) framework which is based on a constrained factor analysis on GMM weight supervectors. Then, the available information in both Gaussian means and Gaussian weights is exploited through a feature-level fusion of the i-vectors and the NFA vectors. Finally, a least-squares support vector regression is employed to estimate the weight of speakers from the given utterances. The proposed approach is evaluated on spontaneous telephone speech signals of National Institute of Standards and Technology 2008 and 2010 Speaker Recognition Evaluation corpora. To investigate the effectiveness of the proposed approach, this method is compared to the i-vector-based speaker weight estimation and an alternative fusion scheme, namely the score-level fusion. Experimental results over 2339 utterances show that the correlation coefficients between the actual and the estimated weights of female and male speakers are 0.49 and 0.56, respectively, which indicate the effectiveness of the proposed method in speaker weight estimation.

KW - i-vector

KW - Non-negative Factor Analysis

KW - Speaker Body Weight Estimation

KW - Least-squares Support Vector Regression

KW - MFCC

U2 - 10.22067/ess.v3i1.52091

DO - 10.22067/ess.v3i1.52091

M3 - Journal article

SN - 2322-5483

VL - 3

SP - 47

EP - 55

JO - Journal of Electrical Systems and Signals

JF - Journal of Electrical Systems and Signals

IS - 1

ER -
