A Simple Method to Determine if a Music Information Retrieval System is a "Horse"

Research output: Contribution to journal; Journal article; Research; peer-review

Abstract

We propose and demonstrate a simple method to determine if a music information retrieval (MIR) system is using factors irrelevant to the task for which it is designed. This is of critical importance to certain use cases, but cannot be accomplished using standard approaches to evaluation in MIR. Akin to the controlled experiments designed to test the intellect of the famous horse "Clever Hans", we perform two experiments to show how three state-of-the-art music genre recognition (MGR) and music emotion recognition (MER) systems are relying on factors confounded with the "ground truth" labels of a dataset. We make available a reproducible research package so that others can perform the same experiments with other MIR systems.

Details

Original language: English
Journal: IEEE Transactions on Multimedia
Volume: 16
Issue number: 6
Pages (from-to): 1636-1644
Number of pages: 9
ISSN: 1520-9210
DOI
State: Published - Oct 2014
Publication category: Research
Peer-reviewed: Yes
ID: 168292350