Abstract
In this paper, we present a novel supervised Non-negative Matrix
Factorization (NMF) speech enhancement method based on a Hidden
Markov Model (HMM) and the Kullback-Leibler (KL) divergence
(NMF-HMM). The algorithm uses the HMM to capture timing information,
so that, unlike traditional NMF-based speech enhancement methods, it
accounts for the temporal dynamics of the speech signal. More
specifically, a sum of Poisson distributions, which leads to the KL
divergence measure, is used as the observation model for each state
of the HMM. This ensures that the parameter update rule of the
proposed algorithm is identical to the multiplicative update rule,
which is fast and efficient. In the training stage, this update rule
is applied to train the NMF-HMM model. In the online enhancement
stage, a novel minimum mean-square error (MMSE) estimator that
incorporates the NMF-HMM is proposed to perform speech enhancement.
The performance of the proposed algorithm is evaluated using
perceptual evaluation of speech quality (PESQ) and short-time
objective intelligibility (STOI). The experimental results indicate
that the proposed method improves the STOI score by 7% over current
state-of-the-art NMF-based speech enhancement methods.
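
For readers unfamiliar with the multiplicative update rule mentioned in the abstract, the following is a minimal sketch of the standard KL-divergence NMF updates on a non-negative matrix. It is not the NMF-HMM algorithm of the paper: it has no HMM states and no MMSE estimator, and the function names `kl_nmf` and `kl_divergence`, the `rank` and `n_iter` parameters, and the random initialization are all assumptions made for illustration.

```python
import numpy as np


def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence D(V || WH) between non-negative matrices."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))


def kl_nmf(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Plain KL-NMF with multiplicative updates (illustrative sketch only).

    V    : non-negative matrix, e.g. a magnitude spectrogram (freq x frames)
    rank : number of basis vectors (hypothetical parameter name)
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random non-negative initialization (an assumption, not from the paper).
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Update activations H with the basis W held fixed.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        # Update basis W with the activations H held fixed.
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H


if __name__ == "__main__":
    # Synthetic non-negative data standing in for a spectrogram.
    V = np.random.default_rng(1).random((16, 30))
    W, H = kl_nmf(V, rank=4)
    print("KL divergence after fitting:", kl_divergence(V, W @ H))
```

Because each update only multiplies the current factors by non-negative ratios, the factors stay non-negative without any explicit projection step, which is one reason this rule is considered simple and efficient.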
Original language | English |
---|---|
Title | Interspeech |
Number of pages | 5 |
Publication date | 22 Oct. 2020 |
Pages | 2667-2671 |
Status | Published - 22 Oct. 2020 |
Event | Interspeech 2020 - Shanghai, China. Duration: 25 Oct. 2020 → 29 Oct. 2020 |
Conference
Conference | Interspeech 2020 |
---|---|
Country | China |
City | Shanghai |
Period | 25/10/2020 → 29/10/2020 |