Sinusoidal masks for single channel speech separation

Publication: Contribution to journal, Conference article in journal, Research, peer-reviewed

7 Citations (Scopus)
218 Downloads (Pure)

Abstract

This paper presents a new approach to the binary and soft masks
used in single-channel speech separation. The proposed approach,
called the sinusoidal mask (a binary mask and a Wiener filter),
operates in a sinusoidal space. A theoretical analysis of the proposed
method is presented, showing that it is able to minimize
the target speech distortion while suppressing the crosstalk to
a predetermined threshold. Compared with STFT-based
masks, the proposed sinusoidal masks improve the separation
performance in terms of objective measures (SSNR and PESQ) and
are mostly preferred by listeners.
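
For orientation, the two mask types named in the abstract can be sketched in their conventional per-bin form. This is a minimal illustration under stated assumptions, not the paper's sinusoidal-domain formulation: the function names, the 0 dB default threshold, and the example power values are hypothetical, and the power of both sources in each bin is assumed to be known.

```python
import math

EPS = 1e-12  # guards against division by zero and log of zero


def binary_mask(target_power, interferer_power, threshold_db=0.0):
    """Return 1.0 in bins where the local SNR (dB) exceeds the threshold, else 0.0."""
    mask = []
    for t, i in zip(target_power, interferer_power):
        snr_db = 10.0 * math.log10((t + EPS) / (i + EPS))
        mask.append(1.0 if snr_db > threshold_db else 0.0)
    return mask


def wiener_mask(target_power, interferer_power):
    """Return the soft (Wiener-type) gain t / (t + i) for each bin."""
    return [t / (t + i + EPS) for t, i in zip(target_power, interferer_power)]


# Hypothetical per-bin powers for a target and an interfering speaker.
target = [4.0, 1.0, 0.25]
interferer = [1.0, 1.0, 1.0]

print(binary_mask(target, interferer))
print(wiener_mask(target, interferer))
```

The binary mask makes a hard keep-or-discard decision per bin, while the Wiener-type mask scales each bin by the estimated share of target power; multiplying either mask with the mixture spectrum gives an estimate of the target.
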
Original language: English
Journal: I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings
Pages (from-to): 4262-4265
ISSN: 1520-6149
DOI: 10.1109/ICASSP.2010.5495679
Status: Published - 14 Mar 2010
Event: 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing - Dallas, USA
Duration: 14 Mar 2010 to 17 Mar 2010

Conference

Conference: 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing
Country: USA
City: Dallas
Period: 14/03/2010 to 17/03/2010

Fingerprint

Masks
Crosstalk

Cite this

@inproceedings{a0d76a6489ae4f54af7cfdafa8efb4f4,
title = "Sinusoidal masks for single channel speech separation",
abstract = "In this paper we present a new approach for binary and soft masks used in single-channel speech separation. We present a novel approach called the sinusoidal mask (binary mask and Wiener filter) in a sinusoidal space. Theoretical analysis is presented for the proposed method, and we show that the proposed method is able to minimize the target speech distortion while suppressing the crosstalk to a predetermined threshold. It is observed that compared to the STFT-based masks, the proposed sinusoidal masks improve the separation performance in terms of objective measures (SSNR and PESQ) and are mostly preferred by listeners.",
keywords = "Mask-based method, mixture estimator, sinusoidal mask, single-channel speech separation",
author = "Pejman Mowlaee and Christensen, {Mads Gr{\ae}sb{\o}ll} and Jensen, {S{\o}ren Holdt}",
year = "2010",
month = "3",
day = "14",
doi = "10.1109/ICASSP.2010.5495679",
language = "English",
pages = "4262--4265",
journal = "I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings",
issn = "1520-6149",
publisher = "IEEE Signal Processing Society",

}

Sinusoidal masks for single channel speech separation. / Mowlaee, Pejman; Christensen, Mads Græsbøll; Jensen, Søren Holdt.

In: I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings, 14.03.2010, p. 4262-4265.

TY - GEN

T1 - Sinusoidal masks for single channel speech separation

AU - Mowlaee, Pejman

AU - Christensen, Mads Græsbøll

AU - Jensen, Søren Holdt

PY - 2010/3/14

Y1 - 2010/3/14

N2 - In this paper we present a new approach for binary and soft masks used in single-channel speech separation. We present a novel approach called the sinusoidal mask (binary mask and Wiener filter) in a sinusoidal space. Theoretical analysis is presented for the proposed method, and we show that the proposed method is able to minimize the target speech distortion while suppressing the crosstalk to a predetermined threshold. It is observed that compared to the STFT-based masks, the proposed sinusoidal masks improve the separation performance in terms of objective measures (SSNR and PESQ) and are mostly preferred by listeners.

AB - In this paper we present a new approach for binary and soft masks used in single-channel speech separation. We present a novel approach called the sinusoidal mask (binary mask and Wiener filter) in a sinusoidal space. Theoretical analysis is presented for the proposed method, and we show that the proposed method is able to minimize the target speech distortion while suppressing the crosstalk to a predetermined threshold. It is observed that compared to the STFT-based masks, the proposed sinusoidal masks improve the separation performance in terms of objective measures (SSNR and PESQ) and are mostly preferred by listeners.

KW - Mask-based method

KW - mixture estimator

KW - sinusoidal mask

KW - single-channel speech separation

U2 - 10.1109/ICASSP.2010.5495679

DO - 10.1109/ICASSP.2010.5495679

M3 - Conference article in Journal

SP - 4262

EP - 4265

JO - I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings

JF - I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings

SN - 1520-6149

ER -