Estimation of Fundamental Frequencies in Stereophonic Music Mixtures

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

In this paper, a method for multi-pitch estimation of stereophonic mixtures of multiple harmonic signals, e.g., instrument recordings, is presented. The method is based on a signal model which includes the panning parameters of the sources in a stereophonic mixture, such as those applied artificially in a recording studio. If the sources in a mixture have different panning parameters, this diversity can be used to simplify the pitch estimation problem. The mixing parameters of the sources might be shared, resulting in a multi-pitch estimation problem, which is solved using an approach based on an expectation-maximization algorithm for Gaussian sources, where the fundamental frequencies and model orders are estimated jointly. The fundamental frequencies may be related, resulting in overlapping harmonics, complicating the estimation of the parameters. A codebook of magnitude amplitude vectors is trained on recordings of instruments playing single notes, and used when estimating the complex amplitudes of the components in the mixture. The proposed method is evaluated using stereophonic mixtures of real signals, and compared to state-of-the-art transcription and multi-pitch estimation methods. Experiments show an increase in performance when knowledge about the panning parameters is taken into account. The proposed method provides a full parametrization of the components of the observed signal, and can be used, e.g., for instrument tuning, editing purposes, altering single harmonic components in a mixture, and for audio effects.
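As a rough illustration of the kind of signal the abstract describes, the sketch below synthesizes two harmonic sources, pans them to different stereo positions, and recovers the pan angle of a single source from its channel energies. It is not the paper's formulation: the constant-power panning law, the real-valued cosine harmonics, and the names harmonic_source, pan_stereo, and estimate_pan are all assumptions made for this example.

```python
import numpy as np

def harmonic_source(f0, amplitudes, fs, num_samples):
    """Real-valued sum of harmonics at integer multiples of f0 (in Hz)."""
    t = np.arange(num_samples) / fs
    return sum(a * np.cos(2 * np.pi * f0 * (l + 1) * t)
               for l, a in enumerate(amplitudes))

def pan_stereo(x, theta):
    """Assumed constant-power amplitude panning, theta in [0, pi/2]."""
    return np.cos(theta) * x, np.sin(theta) * x

def estimate_pan(left, right):
    """Pan angle of a single panned source, from the channel energies."""
    return np.arctan2(np.linalg.norm(right), np.linalg.norm(left))

fs, n = 8000, 1024  # hypothetical sampling rate and frame length

# Two sources with different fundamentals and different pan angles.
src_a = harmonic_source(220.0, [1.0, 0.5, 0.25], fs, n)
src_b = harmonic_source(330.0, [0.8, 0.4], fs, n)
left_a, right_a = pan_stereo(src_a, 0.3)
left_b, right_b = pan_stereo(src_b, 1.2)

# The observed stereophonic mixture is the channel-wise sum.
mix_left, mix_right = left_a + left_b, right_a + right_b

# For an isolated source the pan angle follows directly from the channels.
pan_a = estimate_pan(left_a, right_a)
```

In this toy setting each source contributes to the two channels with a different gain ratio, which is the diversity the abstract refers to. Sources that share a pan angle cannot be separated this way and lead back to a multi-pitch problem.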

Details

Original language: English
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 27
Issue number: 2
Pages (from-to): 296-310
ISSN: 2329-9290
DOI
Publication status: Accepted/In press - 2019
Publication category: Research
Peer-reviewed: Yes
ID: 287484672