A Phase Vocoder Based on Nonstationary Gabor Frames

Emil Solsbæk Ottosen; Monika Dörfler

doi:10.1109/TASLP.2017.2750767

Research output: Contribution to journal › Journal article › Research › peer-review

10 Citations (Scopus)

Abstract

We propose a new algorithm for time stretching music signals based on the theory of nonstationary Gabor frames (NSGFs). The algorithm extends the techniques of the classical phase vocoder (PV) by incorporating adaptive time-frequency (TF) representations and adaptive phase locking. The adaptive TF representations imply good time resolution for the onsets of attack transients and good frequency resolution for the sinusoidal components. We estimate the phase values only at peak channels and the remaining phases are then locked to the values of the peaks in an adaptive manner. During attack transients we keep the stretch factor equal to one and we propose a new strategy for determining which channels are relevant for reinitializing the corresponding phase values. In contrast to previously published algorithms we use a non-uniform NSGF to obtain a low redundancy of the corresponding TF representation. We show that with just three times as many TF coefficients as signal samples, artifacts such as phasiness and transient smearing can be greatly reduced compared to the classical PV. The proposed algorithm is tested on both synthetic and real world signals and compared with state of the art algorithms in a reproducible manner.

Original language	English
Journal	IEEE Transactions on Audio, Speech and Language Processing
Volume	25
Issue number	11
Pages (from-to)	2199-2208
Number of pages	10
ISSN	1558-7916
DOIs	https://doi.org/10.1109/TASLP.2017.2750767
Publication status	Published - Sept 2017

Keywords

Phase vocoder
nonstationary Gabor frames
Time-frequency analysis
Gabor theory
Time stretching

Cite this

@article{104edec4d4174433a61da4e357892761,

title = "A Phase Vocoder Based on Nonstationary Gabor Frames",

abstract = "We propose a new algorithm for time stretching music signals based on the theory of nonstationary Gabor frames (NSGFs). The algorithm extends the techniques of the classical phase vocoder (PV) by incorporating adaptive time-frequency (TF) representations and adaptive phase locking. The adaptive TF representations imply good time resolution for the onsets of attack transients and good frequency resolution for the sinusoidal components. We estimate the phase values only at peak channels and the remaining phases are then locked to the values of the peaks in an adaptive manner. During attack transients we keep the stretch factor equal to one and we propose a new strategy for determining which channels are relevant for reinitializing the corresponding phase values. In contrast to previously published algorithms we use a non-uniform NSGF to obtain a low redundancy of the corresponding TF representation. We show that with just three times as many TF coefficients as signal samples, artifacts such as phasiness and transient smearing can be greatly reduced compared to the classical PV. The proposed algorithm is tested on both synthetic and real world signals and compared with state of the art algorithms in a reproducible manner.",

keywords = "Phase vocoder, nonstationary Gabor frames, Time-frequency analysis, Gabor theory, Time stretching",

author = "Ottosen, {Emil Solsb{\ae}k} and D{\"o}rfler, Monika",

year = "2017",

month = sep,

doi = "10.1109/TASLP.2017.2750767",

language = "English",

volume = "25",

pages = "2199--2208",

journal = "IEEE Transactions on Audio, Speech and Language Processing",

issn = "1558-7916",

publisher = "IEEE Signal Processing Society",

number = "11",

}

TY - JOUR

T1 - A Phase Vocoder Based on Nonstationary Gabor Frames

AU - Ottosen, Emil Solsbæk

AU - Dörfler, Monika

PY - 2017/9

Y1 - 2017/9

N2 - We propose a new algorithm for time stretching music signals based on the theory of nonstationary Gabor frames (NSGFs). The algorithm extends the techniques of the classical phase vocoder (PV) by incorporating adaptive time-frequency (TF) representations and adaptive phase locking. The adaptive TF representations imply good time resolution for the onsets of attack transients and good frequency resolution for the sinusoidal components. We estimate the phase values only at peak channels and the remaining phases are then locked to the values of the peaks in an adaptive manner. During attack transients we keep the stretch factor equal to one and we propose a new strategy for determining which channels are relevant for reinitializing the corresponding phase values. In contrast to previously published algorithms we use a non-uniform NSGF to obtain a low redundancy of the corresponding TF representation. We show that with just three times as many TF coefficients as signal samples, artifacts such as phasiness and transient smearing can be greatly reduced compared to the classical PV. The proposed algorithm is tested on both synthetic and real world signals and compared with state of the art algorithms in a reproducible manner.

AB - We propose a new algorithm for time stretching music signals based on the theory of nonstationary Gabor frames (NSGFs). The algorithm extends the techniques of the classical phase vocoder (PV) by incorporating adaptive time-frequency (TF) representations and adaptive phase locking. The adaptive TF representations imply good time resolution for the onsets of attack transients and good frequency resolution for the sinusoidal components. We estimate the phase values only at peak channels and the remaining phases are then locked to the values of the peaks in an adaptive manner. During attack transients we keep the stretch factor equal to one and we propose a new strategy for determining which channels are relevant for reinitializing the corresponding phase values. In contrast to previously published algorithms we use a non-uniform NSGF to obtain a low redundancy of the corresponding TF representation. We show that with just three times as many TF coefficients as signal samples, artifacts such as phasiness and transient smearing can be greatly reduced compared to the classical PV. The proposed algorithm is tested on both synthetic and real world signals and compared with state of the art algorithms in a reproducible manner.

KW - Phase vocoder

KW - nonstationary Gabor frames

KW - Time-frequency analysis

KW - Gabor theory

KW - Time stretching

U2 - 10.1109/TASLP.2017.2750767

DO - 10.1109/TASLP.2017.2750767

M3 - Journal article

SN - 1558-7916

VL - 25

SP - 2199

EP - 2208

JO - IEEE Transactions on Audio, Speech and Language Processing

JF - IEEE Transactions on Audio, Speech and Language Processing

IS - 11

ER -
