A Simple Method to Determine if a Music Information Retrieval System is a "Horse"

Bob L. Sturm

Research output: Contribution to journal › Journal article › Research › peer-review

31 Citations (Scopus)

Abstract

We propose and demonstrate a simple method to determine if
a music information retrieval (MIR) system is
using factors irrelevant to the task for which it is designed.
This is of critical importance to certain use cases,
but cannot be accomplished using standard approaches to evaluation in MIR.
Akin to the controlled experiments
designed to test the intellect of
the famous horse ``Clever Hans'',
we perform two experiments to show
how three state-of-the-art music genre recognition (MGR)
and music emotion recognition (MER) systems
are relying on factors confounded with the ``ground truth'' labels of a dataset.
We make available a reproducible research package
so that others can perform the same experiments
with other MIR systems.
Original language: English
Journal: IEEE Transactions on Multimedia
Volume: 16
Issue number: 6
Pages (from-to): 1636-1644
Number of pages: 9
ISSN: 1520-9210
DOI: 10.1109/TMM.2014.2330697
Publication status: Published - Oct 2014

Fingerprint

Computer music
Information retrieval systems
Information retrieval
Labels
Experiments

Cite this

@article{4425549d31d5432ba2cdd896022be8ad,
title = "A Simple Method to Determine if a Music Information Retrieval System is a {"}Horse{"}",
abstract = "We propose and demonstrate a simple method to determine if a music information retrieval (MIR) system is using factors irrelevant to the task for which it is designed. This is of critical importance to certain use cases, but cannot be accomplished using standard approaches to evaluation in MIR. Akin to the controlled experiments designed to test the intellect of the famous horse ``Clever Hans'', we perform two experiments to show how three state-of-the-art music genre recognition (MGR) and music emotion recognition (MER) systems are relying on factors confounded with the ``ground truth'' labels of a dataset. We make available a reproducible research package so that others can perform the same experiments with other MIR systems.",
author = "Sturm, {Bob L.}",
year = "2014",
month = "10",
doi = "10.1109/TMM.2014.2330697",
language = "English",
volume = "16",
pages = "1636--1644",
journal = "IEEE Transactions on Multimedia",
issn = "1520-9210",
publisher = "IEEE",
number = "6",

}

A Simple Method to Determine if a Music Information Retrieval System is a "Horse". / Sturm, Bob L.

In: IEEE Transactions on Multimedia, Vol. 16, No. 6, 10.2014, p. 1636-1644.

TY - JOUR

T1 - A Simple Method to Determine if a Music Information Retrieval System is a "Horse"

AU - Sturm, Bob L.

PY - 2014/10

Y1 - 2014/10

N2 - We propose and demonstrate a simple method to determine if a music information retrieval (MIR) system is using factors irrelevant to the task for which it is designed. This is of critical importance to certain use cases, but cannot be accomplished using standard approaches to evaluation in MIR. Akin to the controlled experiments designed to test the intellect of the famous horse ``Clever Hans'', we perform two experiments to show how three state-of-the-art music genre recognition (MGR) and music emotion recognition (MER) systems are relying on factors confounded with the ``ground truth'' labels of a dataset. We make available a reproducible research package so that others can perform the same experiments with other MIR systems.

AB - We propose and demonstrate a simple method to determine if a music information retrieval (MIR) system is using factors irrelevant to the task for which it is designed. This is of critical importance to certain use cases, but cannot be accomplished using standard approaches to evaluation in MIR. Akin to the controlled experiments designed to test the intellect of the famous horse ``Clever Hans'', we perform two experiments to show how three state-of-the-art music genre recognition (MGR) and music emotion recognition (MER) systems are relying on factors confounded with the ``ground truth'' labels of a dataset. We make available a reproducible research package so that others can perform the same experiments with other MIR systems.

U2 - 10.1109/TMM.2014.2330697

DO - 10.1109/TMM.2014.2330697

M3 - Journal article

VL - 16

SP - 1636

EP - 1644

JO - IEEE Transactions on Multimedia

JF - IEEE Transactions on Multimedia

SN - 1520-9210

IS - 6

ER -