A Novel Approach to Speaker Weight Estimation Using a Fusion of the i-vector and NFA Frameworks

Amir Hossein Poorjam, Mohamad Hasan Bahari, Hugo Van hamme

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

This paper proposes a novel approach for automatic speaker weight estimation from spontaneous telephone speech signals. In this method, each utterance is modeled using two frameworks: the i-vector framework, which is based on factor analysis of Gaussian Mixture Model (GMM) mean supervectors, and the Non-negative Factor Analysis (NFA) framework, which is based on constrained factor analysis of GMM weight supervectors. The information available in both the Gaussian means and the Gaussian weights is then exploited through a feature-level fusion of the i-vectors and the NFA vectors. Finally, least-squares support vector regression is employed to estimate the weight of speakers from the given utterances.
The proposed approach is evaluated on spontaneous telephone speech signals from the National Institute of Standards and Technology (NIST) 2008 and 2010 Speaker Recognition Evaluation corpora. To investigate its effectiveness, the method is compared with i-vector-based speaker weight estimation and with an alternative fusion scheme, namely score-level fusion. Experimental results over 2339 utterances show that the correlation coefficients between the actual and the estimated weights of female and male speakers are 0.49 and 0.56, respectively, indicating the effectiveness of the proposed method in speaker weight estimation.
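
The pipeline described in the abstract (feature-level fusion, least-squares support vector regression, correlation-based evaluation) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes NumPy, random synthetic vectors standing in for real i-vectors and NFA vectors, a linear kernel, and an arbitrary regularization value gamma = 10. The function names, dimensions, and data are all hypothetical.

```python
import numpy as np


def fuse_features(ivectors, nfa_vectors):
    """Feature-level fusion: concatenate the two vectors of each utterance."""
    return np.hstack([ivectors, nfa_vectors])


def lssvr_fit(X, y, gamma=10.0):
    """Fit LS-SVR (linear kernel assumed) by solving its linear system:

        [ 0   1^T         ] [ b     ]   [ 0 ]
        [ 1   K + I/gamma ] [ alpha ] = [ y ]
    """
    n = X.shape[0]
    K = X @ X.T                       # linear kernel matrix (assumption)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    solution = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return solution[0], solution[1:]  # bias b, dual coefficients alpha


def lssvr_predict(X_train, bias, alphas, X_new):
    """Predict targets for X_new from the fitted dual coefficients."""
    return (X_new @ X_train.T) @ alphas + bias


# Synthetic stand-ins for per-utterance vectors and speaker weights (kg).
rng = np.random.default_rng(0)
ivectors = rng.normal(size=(60, 5))     # hypothetical 5-dim i-vectors
nfa_vectors = rng.normal(size=(60, 3))  # hypothetical 3-dim NFA vectors
weights = (70.0 + 5.0 * ivectors[:, 0] + 3.0 * nfa_vectors[:, 1]
           + rng.normal(size=60))

fused = fuse_features(ivectors, nfa_vectors)

# Train on the first 40 utterances, evaluate on the remaining 20.
bias, alphas = lssvr_fit(fused[:40], weights[:40])
predicted = lssvr_predict(fused[:40], bias, alphas, fused[40:])

# Pearson correlation between actual and estimated weights.
r = float(np.corrcoef(weights[40:], predicted)[0, 1])
print(f"Correlation on held-out synthetic data: {r:.2f}")
```

Score-level fusion, the alternative scheme mentioned above, would typically train a separate regressor on each vector type and combine the resulting estimates, rather than concatenating the vectors before regression.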
Original language: English
Journal: Journal of Electrical Systems and Signals
Volume: 3
Issue number: 1
Pages (from-to): 47-55
Number of pages: 8
ISSN: 2322-5483
Publication status: Published - Feb 2017

Keywords

  • i-vector
  • Non-negative Factor Analysis
  • Speaker Body Weight Estimation
  • Least-squares Support Vector Regression
  • MFCC
