Enhancement of Single-Channel Periodic Signals in the Time-Domain

Research output: Contribution to journal › Journal article › Research › peer-review

35 Citations (Scopus)
336 Downloads (Pure)

Abstract

Most state-of-the-art filtering methods for speech enhancement require an estimate of the noise statistics, but the noise statistics are difficult to estimate in practice when speech is present. Thus, nonstationary noise will have a detrimental impact on the performance of most speech enhancement filters. The impact of such noise can be reduced by using the signal statistics rather than the noise statistics in the filter design. For example, this is possible by assuming a harmonic model for the desired signal; while this model fits well for voiced speech, it will not be appropriate for unvoiced speech. That is, signal-dependent methods based on the signal statistics will introduce undesired distortion for some parts of speech compared to signal-independent methods based on the noise statistics. Since both the signal-independent and signal-dependent approaches to speech enhancement have advantages, it is relevant to combine them to reduce the impact of their individual disadvantages. In this paper, we give theoretical insights into the relationship between these different approaches, and these reveal a close relationship between the two approaches. This justifies joint use of such filtering methods, which can be beneficial from a practical point of view. Our experimental results confirm that both signal-independent and signal-dependent approaches have advantages and that they are closely related. Moreover, as a part of our experiments, we illustrate the practical usefulness of combining signal-independent and signal-dependent enhancement methods by applying such methods jointly on real-life speech.
Original language: English
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 20
Issue number: 7
Pages (from-to): 1948-1963
Number of pages: 16
ISSN: 1558-7916
DOI: 10.1109/TASL.2012.2191957
Publication status: Published - 2012

Fingerprint

Statistics
Speech enhancement
augmentation
statistics
filters
estimates
Experiments
harmonics

Cite this

@article{f014f260f4834b508f3856ab52660711,
title = "Enhancement of Single-Channel Periodic Signals in the Time-Domain",
abstract = "Most state-of-the-art filtering methods for speech enhancement require an estimate of the noise statistics, but the noise statistics are difficult to estimate in practice when speech is present. Thus, nonstationary noise will have a detrimental impact on the performance of most speech enhancement filters. The impact of such noise can be reduced by using the signal statistics rather than the noise statistics in the filter design. For example, this is possible by assuming a harmonic model for the desired signal; while this model fits well for voiced speech, it will not be appropriate for unvoiced speech. That is, signal-dependent methods based on the signal statistics will introduce undesired distortion for some parts of speech compared to signal-independent methods based on the noise statistics. Since both the signal-independent and signal-dependent approaches to speech enhancement have advantages, it is relevant to combine them to reduce the impact of their individual disadvantages. In this paper, we give theoretical insights into the relationship between these different approaches, and these reveal a close relationship between the two approaches. This justifies joint use of such filtering methods which can be beneficial from a practical point of view. Our experimental results confirm that both signal-independent and signal-dependent approaches have advantages and that they are closely-related. Moreover, as a part of our experiments, we illustrate the practical usefulness of combining signal-independent and signal-dependent enhancement methods by applying such methods jointly on real-life speech.",
author = "Jensen, {Jesper Rindom} and Jacob Benesty and Christensen, {Mads Gr{\ae}sb{\o}ll} and Jensen, {S{\o}ren Holdt}",
year = "2012",
doi = "10.1109/TASL.2012.2191957",
language = "English",
volume = "20",
pages = "1948--1963",
journal = "IEEE/ACM Transactions on Audio, Speech, and Language Processing",
issn = "2329-9290",
publisher = "IEEE Signal Processing Society",
number = "7",

}

Enhancement of Single-Channel Periodic Signals in the Time-Domain. / Jensen, Jesper Rindom; Benesty, Jacob; Christensen, Mads Græsbøll; Jensen, Søren Holdt.

In: IEEE Transactions on Audio, Speech and Language Processing, Vol. 20, No. 7, 2012, p. 1948-1963.

TY - JOUR

T1 - Enhancement of Single-Channel Periodic Signals in the Time-Domain

AU - Jensen, Jesper Rindom

AU - Benesty, Jacob

AU - Christensen, Mads Græsbøll

AU - Jensen, Søren Holdt

PY - 2012

Y1 - 2012

AB - Most state-of-the-art filtering methods for speech enhancement require an estimate of the noise statistics, but the noise statistics are difficult to estimate in practice when speech is present. Thus, nonstationary noise will have a detrimental impact on the performance of most speech enhancement filters. The impact of such noise can be reduced by using the signal statistics rather than the noise statistics in the filter design. For example, this is possible by assuming a harmonic model for the desired signal; while this model fits well for voiced speech, it will not be appropriate for unvoiced speech. That is, signal-dependent methods based on the signal statistics will introduce undesired distortion for some parts of speech compared to signal-independent methods based on the noise statistics. Since both the signal-independent and signal-dependent approaches to speech enhancement have advantages, it is relevant to combine them to reduce the impact of their individual disadvantages. In this paper, we give theoretical insights into the relationship between these different approaches, and these reveal a close relationship between the two approaches. This justifies joint use of such filtering methods which can be beneficial from a practical point of view. Our experimental results confirm that both signal-independent and signal-dependent approaches have advantages and that they are closely-related. Moreover, as a part of our experiments, we illustrate the practical usefulness of combining signal-independent and signal-dependent enhancement methods by applying such methods jointly on real-life speech.

UR - http://www.scopus.com/inward/record.url?scp=84861157341&partnerID=8YFLogxK

U2 - 10.1109/TASL.2012.2191957

DO - 10.1109/TASL.2012.2191957

M3 - Journal article

VL - 20

SP - 1948

EP - 1963

JO - IEEE/ACM Transactions on Audio, Speech, and Language Processing

JF - IEEE/ACM Transactions on Audio, Speech, and Language Processing

SN - 2329-9290

IS - 7

ER -