Improved single-channel speech separation using sinusoidal modeling

Publication: Contribution to journal, Conference article in journal, Research, Peer-reviewed

14 Citations (Scopus)
210 Downloads (Pure)

Abstract

We present a novel single-channel separation approach to improve the separation performance while recovering the signals from a mixture. The key idea in this research is to employ a mixture estimator based on unconstrained modified sinusoidal parameters. Compared to the mixmax (binary mask) and Wiener filter (soft mask) approaches, the proposed approach works independently of pitch estimates. Furthermore, it is observed that it can achieve acceptable perceptual speech quality with less cross-talk at different signal-to-signal ratios while bringing down the complexity by replacing STFT with sinusoidal parameters. Improvements made by the proposed approach are demonstrated by employing PESQ as our objective measure and MUSHRA listening test as our subjective evaluation.
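For orientation, the two baseline families named in the abstract can be sketched as follows. This is a minimal illustration of a generic binary mask and a generic Wiener-style soft mask over per-bin magnitudes, not the paper's sinusoidal mixture estimator, which this record does not describe. The function names and the toy magnitudes are hypothetical, and the sketch assumes the per-source magnitudes are already available (in practice they would be estimates).

```python
def binary_mask(mag_a, mag_b):
    """Binary mask for source A: 1.0 where A is at least as strong as B, else 0.0."""
    return [1.0 if a >= b else 0.0 for a, b in zip(mag_a, mag_b)]


def wiener_soft_mask(mag_a, mag_b, eps=1e-12):
    """Soft mask for source A: power ratio |A|^2 / (|A|^2 + |B|^2) per bin.

    eps guards against division by zero when both sources are silent in a bin.
    """
    return [a * a / (a * a + b * b + eps) for a, b in zip(mag_a, mag_b)]


def apply_mask(mask, mixture_mag):
    """Scale each mixture bin by its mask value to estimate source A's magnitude."""
    return [m * x for m, x in zip(mask, mixture_mag)]


# Toy magnitudes for one frame (hypothetical values, four frequency bins).
mag_a = [4.0, 1.0, 0.0, 3.0]
mag_b = [1.0, 2.0, 5.0, 3.0]

# Assumption for this toy example: the mixture magnitude in each bin is
# approximated by the stronger of the two sources.
mixture = [max(a, b) for a, b in zip(mag_a, mag_b)]

hard_estimate = apply_mask(binary_mask(mag_a, mag_b), mixture)
soft_estimate = apply_mask(wiener_soft_mask(mag_a, mag_b), mixture)
```

The binary mask keeps or discards each bin outright, whereas the soft mask attenuates bins in proportion to the estimated power ratio, which is why the two are commonly contrasted as hard and soft decisions.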
Original language: English
Journal: I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings
Volume: 2010
Pages (from-to): 21-24
ISSN: 1520-6149
DOI: 10.1109/ICASSP.2010.5496263
Status: Published - 14 Mar 2010
Event: 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing - Dallas, USA
Duration: 14 Mar 2010 to 17 Mar 2010

Conference

Conference: 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing
Country: USA
City: Dallas
Period: 14/03/2010 to 17/03/2010

Cite this

@inproceedings{1e0a66c6e75e4c158b9212aa91a618d6,
title = "Improved single-channel speech separation using sinusoidal modeling",
abstract = "We present a novel single-channel separation approach to improve the separation performance while recovering the signals from a mixture. The key idea in this research is to employ a mixture estimator based on unconstrained modified sinusoidal parameters. Compared to the mixmax (binary mask) and Wiener filter (soft mask) approaches, the proposed approach works independently of pitch estimates. Furthermore, it is observed that it can achieve acceptable perceptual speech quality with less cross-talk at different signal-to-signal ratios while bringing down the complexity by replacing STFT with sinusoidal parameters. Improvements made by the proposed approach are demonstrated by employing PESQ as our objective measure and MUSHRA listening test as our subjective evaluation.",
keywords = "Mixture estimation, single-channel speech separation, mask-based methods, speaker codebook",
author = "Pejman Mowlaee and Christensen, {Mads Gr{\ae}sb{\o}ll} and Jensen, {S{\o}ren Holdt}",
year = "2010",
month = "3",
day = "14",
doi = "10.1109/ICASSP.2010.5496263",
language = "English",
volume = "2010",
pages = "21--24",
journal = "I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings",
issn = "1520-6149",
publisher = "IEEE Signal Processing Society",

}

Improved single-channel speech separation using sinusoidal modeling. / Mowlaee, Pejman; Christensen, Mads Græsbøll; Jensen, Søren Holdt.

In: I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings, Vol. 2010, 14.03.2010, p. 21-24.

TY - GEN

T1 - Improved single-channel speech separation using sinusoidal modeling

AU - Mowlaee, Pejman

AU - Christensen, Mads Græsbøll

AU - Jensen, Søren Holdt

PY - 2010/3/14

Y1 - 2010/3/14

N2 - We present a novel single-channel separation approach to improve the separation performance while recovering the signals from a mixture. The key idea in this research is to employ a mixture estimator based on unconstrained modified sinusoidal parameters. Compared to the mixmax (binary mask) and Wiener filter (soft mask) approaches, the proposed approach works independently of pitch estimates. Furthermore, it is observed that it can achieve acceptable perceptual speech quality with less cross-talk at different signal-to-signal ratios while bringing down the complexity by replacing STFT with sinusoidal parameters. Improvements made by the proposed approach are demonstrated by employing PESQ as our objective measure and MUSHRA listening test as our subjective evaluation.

AB - We present a novel single-channel separation approach to improve the separation performance while recovering the signals from a mixture. The key idea in this research is to employ a mixture estimator based on unconstrained modified sinusoidal parameters. Compared to the mixmax (binary mask) and Wiener filter (soft mask) approaches, the proposed approach works independently of pitch estimates. Furthermore, it is observed that it can achieve acceptable perceptual speech quality with less cross-talk at different signal-to-signal ratios while bringing down the complexity by replacing STFT with sinusoidal parameters. Improvements made by the proposed approach are demonstrated by employing PESQ as our objective measure and MUSHRA listening test as our subjective evaluation.

KW - Mixture estimation

KW - single-channel speech separation

KW - mask-based methods

KW - speaker codebook

U2 - 10.1109/ICASSP.2010.5496263

DO - 10.1109/ICASSP.2010.5496263

M3 - Conference article in Journal

VL - 2010

SP - 21

EP - 24

JO - I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings

JF - I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings

SN - 1520-6149

ER -