Abstract
This paper presents a new approach to the binary and soft masks
used in single-channel speech separation. The approach is
called the sinusoidal mask (a binary mask and a Wiener filter)
and is defined in a sinusoidal space. A theoretical analysis of the proposed
method is presented, showing that it is able to minimize
the target speech distortion while suppressing the crosstalk to
a predetermined threshold. Compared to STFT-based
masks, the proposed sinusoidal masks improve the separation
performance in terms of objective measures (SSNR and PESQ) and
are mostly preferred by listeners.
Original language | English |
---|---|
Journal | I E E E International Conference on Acoustics, Speech and Signal Processing. Proceedings |
Pages (from-to) | 4262-4265 |
ISSN | 1520-6149 |
DOI | |
Status | Published - 14 Mar 2010 |
Event | 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing - Dallas, USA. Duration: 14 Mar 2010 → 17 Mar 2010 |
Conference
Conference | 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing |
---|---|
Country/Territory | USA |
City | Dallas |
Period | 14/03/2010 → 17/03/2010 |