The Minimum Overlap-Gap Algorithm for Speech Enhancement

Poul Hoang; Zheng Hua Tan; Jan Mark De Haan; Jesper Jensen

doi:10.1109/ACCESS.2022.3147514

Poul Hoang*, Zheng Hua Tan, Jan Mark De Haan, Jesper Jensen

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review

3 Citations (Scopus)

77 Downloads (Pure)

Abstract

In this paper, we propose a novel speech enhancement paradigm which can effectively solve the problem of retrieving a desired speech signal in a multi-talker environment. The proposed speech enhancement paradigm involves a three-step procedure consisting of separation, ranking, and enhancement. First, a speech separation system – which could be a conventional spatial filter bank or more advanced separation systems – separates mixtures of speech signals captured by microphones into speech signals from candidate speakers. Next, novel ranking algorithms – proposed in this paper – are applied to determine the talker-of-interest amongst the separated speech signals. Finally, the speech signal of the talker-of-interest is estimated as a linear combination of the separated signals, whose weights are determined by the ranking algorithms. We propose ranking algorithms, which exploit turn-taking patterns between conversational partners in order to determine the talker-of-interest amongst competing speakers. Unlike some existing solutions, our ranking algorithms do not require access to additional sensors, e.g., EEG electrodes, cameras, etc., but only rely on microphone signals. Specifically, the proposed algorithms rank the separated speech signals based on the probability of speech overlaps and gaps with the user’s own voice. The speech signal with highest ranking is the talker with minimum probability of speech overlap and gap with the user’s own voice. The proposed ranking algorithms are shown highly effective at determining the talker-of-interest, since conversational partners, i.e., the user and the talker-of-interest, behaviorally avoid speech overlaps and gaps. We evaluate the proposed speech enhancement paradigm in two practical hearing aid related applications, where the objective is to enhance a speech signal of a conversational partner in a multi-talker environment. 
The results of the evaluation demonstrate that the proposed speech enhancement systems in both applications significantly outperform conventional speech enhancement systems.
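
The ranking idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: it assumes binary per-frame voice activity decisions are already available for the user's own voice and for each separated candidate, it approximates the "probability of speech overlap and gap" by the fraction of frames in which both or neither are active, and it uses one-hot weights as the simplest possible linear combination. All function and variable names are hypothetical.

```python
from typing import List, Sequence


def overlap_gap_rate(own_vad: Sequence[int], cand_vad: Sequence[int]) -> float:
    """Fraction of frames that are an overlap (both active) or a gap (both silent).

    Assumption: inputs are binary voice activity flags, one per frame.
    """
    if len(own_vad) != len(cand_vad) or len(own_vad) == 0:
        raise ValueError("VAD sequences must be non-empty and of equal length")
    # With binary flags, overlap or gap means the two flags are equal.
    hits = sum(1 for o, c in zip(own_vad, cand_vad) if o == c)
    return hits / len(own_vad)


def one_hot_weights(rates: Sequence[float]) -> List[float]:
    """Give weight 1 to the candidate with the lowest overlap-gap rate.

    Assumption: a hard selection; the abstract only says that the weights
    are determined by the ranking.
    """
    best = min(range(len(rates)), key=lambda i: rates[i])
    return [1.0 if i == best else 0.0 for i in range(len(rates))]


def combine(separated: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Weighted sum of the separated signals, sample by sample."""
    n = len(separated[0])
    return [sum(w * s[t] for w, s in zip(weights, separated)) for t in range(n)]


# Toy data: the user alternates turns with the second candidate,
# while the first candidate talks independently of the user.
own_vad = [1, 1, 0, 0, 1, 1, 0, 0]
candidate_vads = [
    [1, 0, 1, 0, 1, 0, 1, 0],  # interfering talker
    [0, 0, 1, 1, 0, 0, 1, 1],  # conversational partner
]
rates = [overlap_gap_rate(own_vad, c) for c in candidate_vads]
weights = one_hot_weights(rates)
```

Here the second candidate never overlaps or leaves a gap with the user, so it receives the lowest rate and all of the weight. A practical system would estimate these quantities from noisy signals over time rather than from clean binary flags.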

Original language	English
Journal	IEEE Access
Volume	10
Pages (from-to)	14698-14716
Number of pages	19
ISSN	2169-3536
DOIs	https://doi.org/10.1109/ACCESS.2022.3147514
Publication status	Published - 2022

Bibliographical note

Publisher Copyright:
© 2013 IEEE

Keywords

Direction-of-arrival estimation
Estimation
Licenses
Microphones
Noise measurement
Sensors
Speech enhancement

Access to Document

10.1109/ACCESS.2022.3147514 (Licence: CC BY 4.0)

Open Access article: Final published version, 1.47 MB (Licence: CC BY 4.0)

Cite this

@article{5d3eeb2f01ca4d9fb493a98bf9a4d5a3,
title = "The Minimum Overlap-Gap Algorithm for Speech Enhancement",
abstract = "In this paper, we propose a novel speech enhancement paradigm which can effectively solve the problem of retrieving a desired speech signal in a multi-talker environment. The proposed speech enhancement paradigm involves a three-step procedure consisting of separation, ranking, and enhancement. First, a speech separation system – which could be a conventional spatial filter bank or more advanced separation systems – separates mixtures of speech signals captured by microphones into speech signals from candidate speakers. Next, novel ranking algorithms – proposed in this paper – are applied to determine the talker-of-interest amongst the separated speech signals. Finally, the speech signal of the talker-of-interest is estimated as a linear combination of the separated signals, whose weights are determined by the ranking algorithms. We propose ranking algorithms, which exploit turn-taking patterns between conversational partners in order to determine the talker-of-interest amongst competing speakers. Unlike some existing solutions, our ranking algorithms do not require access to additional sensors, e.g., EEG electrodes, cameras, etc., but only rely on microphone signals. Specifically, the proposed algorithms rank the separated speech signals based on the probability of speech overlaps and gaps with the user{\textquoteright}s own voice. The speech signal with highest ranking is the talker with minimum probability of speech overlap and gap with the user{\textquoteright}s own voice. The proposed ranking algorithms are shown highly effective at determining the talker-of-interest, since conversational partners, i.e., the user and the talker-of-interest, behaviorally avoid speech overlaps and gaps. We evaluate the proposed speech enhancement paradigm in two practical hearing aid related applications, where the objective is to enhance a speech signal of a conversational partner in a multi-talker environment. 
The results of the evaluation demonstrate that the proposed speech enhancement systems in both applications significantly outperform conventional speech enhancement systems.",

keywords = "Direction-of-arrival estimation, Estimation, Licenses, Microphones, Noise measurement, Sensors, Speech enhancement",
author = "Poul Hoang and Tan, {Zheng Hua} and {De Haan}, {Jan Mark} and Jesper Jensen",
note = "Publisher Copyright: {\textcopyright} 2013 IEEE",
year = "2022",
doi = "10.1109/ACCESS.2022.3147514",
language = "English",
volume = "10",
pages = "14698--14716",
journal = "IEEE Access",
issn = "2169-3536",
publisher = "IEEE",
}

TY - JOUR
T1 - The Minimum Overlap-Gap Algorithm for Speech Enhancement
AU - Hoang, Poul
AU - Tan, Zheng Hua
AU - De Haan, Jan Mark
AU - Jensen, Jesper
PY - 2022
Y1 - 2022

N2 - In this paper, we propose a novel speech enhancement paradigm which can effectively solve the problem of retrieving a desired speech signal in a multi-talker environment. The proposed speech enhancement paradigm involves a three-step procedure consisting of separation, ranking, and enhancement. First, a speech separation system – which could be a conventional spatial filter bank or more advanced separation systems – separates mixtures of speech signals captured by microphones into speech signals from candidate speakers. Next, novel ranking algorithms – proposed in this paper – are applied to determine the talker-of-interest amongst the separated speech signals. Finally, the speech signal of the talker-of-interest is estimated as a linear combination of the separated signals, whose weights are determined by the ranking algorithms. We propose ranking algorithms, which exploit turn-taking patterns between conversational partners in order to determine the talker-of-interest amongst competing speakers. Unlike some existing solutions, our ranking algorithms do not require access to additional sensors, e.g., EEG electrodes, cameras, etc., but only rely on microphone signals. Specifically, the proposed algorithms rank the separated speech signals based on the probability of speech overlaps and gaps with the user’s own voice. The speech signal with highest ranking is the talker with minimum probability of speech overlap and gap with the user’s own voice. The proposed ranking algorithms are shown highly effective at determining the talker-of-interest, since conversational partners, i.e., the user and the talker-of-interest, behaviorally avoid speech overlaps and gaps. We evaluate the proposed speech enhancement paradigm in two practical hearing aid related applications, where the objective is to enhance a speech signal of a conversational partner in a multi-talker environment. 
The results of the evaluation demonstrate that the proposed speech enhancement systems in both applications significantly outperform conventional speech enhancement systems.

AB - In this paper, we propose a novel speech enhancement paradigm which can effectively solve the problem of retrieving a desired speech signal in a multi-talker environment. The proposed speech enhancement paradigm involves a three-step procedure consisting of separation, ranking, and enhancement. First, a speech separation system – which could be a conventional spatial filter bank or more advanced separation systems – separates mixtures of speech signals captured by microphones into speech signals from candidate speakers. Next, novel ranking algorithms – proposed in this paper – are applied to determine the talker-of-interest amongst the separated speech signals. Finally, the speech signal of the talker-of-interest is estimated as a linear combination of the separated signals, whose weights are determined by the ranking algorithms. We propose ranking algorithms, which exploit turn-taking patterns between conversational partners in order to determine the talker-of-interest amongst competing speakers. Unlike some existing solutions, our ranking algorithms do not require access to additional sensors, e.g., EEG electrodes, cameras, etc., but only rely on microphone signals. Specifically, the proposed algorithms rank the separated speech signals based on the probability of speech overlaps and gaps with the user’s own voice. The speech signal with highest ranking is the talker with minimum probability of speech overlap and gap with the user’s own voice. The proposed ranking algorithms are shown highly effective at determining the talker-of-interest, since conversational partners, i.e., the user and the talker-of-interest, behaviorally avoid speech overlaps and gaps. We evaluate the proposed speech enhancement paradigm in two practical hearing aid related applications, where the objective is to enhance a speech signal of a conversational partner in a multi-talker environment. 
The results of the evaluation demonstrate that the proposed speech enhancement systems in both applications significantly outperform conventional speech enhancement systems.

KW - Direction-of-arrival estimation
KW - Estimation
KW - Licenses
KW - Microphones
KW - Noise measurement
KW - Sensors
KW - Speech enhancement
UR - http://www.scopus.com/inward/record.url?scp=85124099108&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2022.3147514
DO - 10.1109/ACCESS.2022.3147514
M3 - Journal article
AN - SCOPUS:85124099108
SN - 2169-3536
VL - 10
SP - 14698
EP - 14716
JO - IEEE Access
JF - IEEE Access
ER -
