Sinusoidal masks for single channel speech separation

Pejman Mowlaee; Mads Græsbøll Christensen; Søren Holdt Jensen

doi:10.1109/ICASSP.2010.5495679

Publication: Contribution to journal › Conference article in journal › Research › peer-reviewed

8 Citations (Scopus)

482 Downloads (Pure)

Abstract

In this paper we present a new approach for binary and soft masks
used in single-channel speech separation. We present a novel approach
called the sinusoidal mask (binary mask and Wiener filter)
in a sinusoidal space. Theoretical analysis is presented for the proposed
method, and we show that the proposed method is able to minimize
the target speech distortion while suppressing the crosstalk to
a predetermined threshold. It is observed that compared to the STFT-based
masks, the proposed sinusoidal masks improve the separation
performance in terms of objective measures (SSNR and PESQ) and
are mostly preferred by listeners.
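
As a point of reference for the two mask types named in the abstract, the following is a minimal sketch of a binary mask and a Wiener-type soft mask applied to magnitude spectra. It illustrates generic time-frequency masking only and is not the authors' sinusoidal-domain method, whose details are not given on this page. The function names, the toy spectra, and the additive-magnitude mixture are all assumptions made for illustration.

```python
import numpy as np

def binary_mask(target_mag, interferer_mag):
    """Binary mask: 1 where the target magnitude exceeds the interferer, else 0."""
    return (target_mag > interferer_mag).astype(float)

def wiener_mask(target_mag, interferer_mag, eps=1e-12):
    """Wiener-type soft mask: target power divided by total power, in [0, 1]."""
    target_pow = target_mag ** 2
    interferer_pow = interferer_mag ** 2
    # eps avoids division by zero in bins where both sources are silent.
    return target_pow / (target_pow + interferer_pow + eps)

# Hypothetical magnitude spectra for one frame with four frequency bins.
target = np.array([1.0, 0.2, 0.5, 0.0])
interferer = np.array([0.1, 0.8, 0.5, 0.3])
# Assumption: magnitudes add in the mixture (phase interactions ignored).
mixture = target + interferer

bm = binary_mask(target, interferer)
wm = wiener_mask(target, interferer)

# Apply each mask to the mixture to estimate the target magnitudes.
est_binary = bm * mixture
est_wiener = wm * mixture
```

The binary mask keeps or discards each bin outright, while the soft mask scales each bin by the estimated share of target power. Both sketches assume access to the individual source spectra, which in practice must be estimated.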

Original language	English
Journal	I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings
Pages (from-to)	4262-4265
ISSN	1520-6149
DOI	https://doi.org/10.1109/ICASSP.2010.5495679
Status	Published - 14 Mar 2010
Event	2010 IEEE International Conference on Acoustics, Speech, and Signal Processing - Dallas, USA. Duration: 14 Mar 2010 → 17 Mar 2010

Conference

Conference	2010 IEEE International Conference on Acoustics, Speech, and Signal Processing
Country/Territory	USA
City	Dallas
Period	14/03/2010 → 17/03/2010

Access to document

10.1109/ICASSP.2010.5495679

Icassp2010b: Accepted manuscript, 177 KB

Citation formats

@inproceedings{a0d76a6489ae4f54af7cfdafa8efb4f4,
title = "Sinusoidal masks for single channel speech separation",
abstract = "In this paper we present a new approach for binary and soft masks used in single-channel speech separation. We present a novel approach called the sinusoidal mask (binary mask and Wiener filter) in a sinusoidal space. Theoretical analysis is presented for the proposed method, and we show that the proposed method is able to minimize the target speech distortion while suppressing the crosstalk to a predetermined threshold. It is observed that compared to the STFT-based masks, the proposed sinusoidal masks improve the separation performance in terms of objective measures (SSNR and PESQ) and are mostly preferred by listeners.",
keywords = "Mask-based method, mixture estimator, sinusoidal mask, single-channel speech separation",
author = "Pejman Mowlaee and Christensen, {Mads Gr{\ae}sb{\o}ll} and Jensen, {S{\o}ren Holdt}",
year = "2010",
month = mar,
day = "14",
doi = "10.1109/ICASSP.2010.5495679",
language = "English",
pages = "4262--4265",
journal = "I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings",
issn = "1520-6149",
publisher = "IEEE Signal Processing Society",
note = "2010 IEEE International Conference on Acoustics, Speech, and Signal Processing ; Conference date: 14-03-2010 Through 17-03-2010",
}

TY - GEN
T1 - Sinusoidal masks for single channel speech separation
AU - Mowlaee, Pejman
AU - Christensen, Mads Græsbøll
AU - Jensen, Søren Holdt
PY - 2010/3/14
Y1 - 2010/3/14
N2 - In this paper we present a new approach for binary and soft masks used in single-channel speech separation. We present a novel approach called the sinusoidal mask (binary mask and Wiener filter) in a sinusoidal space. Theoretical analysis is presented for the proposed method, and we show that the proposed method is able to minimize the target speech distortion while suppressing the crosstalk to a predetermined threshold. It is observed that compared to the STFT-based masks, the proposed sinusoidal masks improve the separation performance in terms of objective measures (SSNR and PESQ) and are mostly preferred by listeners.
AB - In this paper we present a new approach for binary and soft masks used in single-channel speech separation. We present a novel approach called the sinusoidal mask (binary mask and Wiener filter) in a sinusoidal space. Theoretical analysis is presented for the proposed method, and we show that the proposed method is able to minimize the target speech distortion while suppressing the crosstalk to a predetermined threshold. It is observed that compared to the STFT-based masks, the proposed sinusoidal masks improve the separation performance in terms of objective measures (SSNR and PESQ) and are mostly preferred by listeners.
KW - Mask-based method
KW - mixture estimator
KW - sinusoidal mask
KW - single-channel speech separation
U2 - 10.1109/ICASSP.2010.5495679
DO - 10.1109/ICASSP.2010.5495679
M3 - Conference article in Journal
SN - 1520-6149
SP - 4262
EP - 4265
JO - I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings
JF - I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings
T2 - 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing
Y2 - 14 March 2010 through 17 March 2010
ER -
