Classification Accuracy Is Not Enough: On the Evaluation of Music Genre Recognition Systems

Bob L. Sturm

Research output: Contribution to journal › Journal article › Research › peer-review

37 Citations (Scopus)
902 Downloads (Pure)

Abstract

A recent review of the research literature evaluating music genre recognition (MGR) systems over the past two decades shows that most works (81%) measure the capacity of a system to recognize genre by its classification accuracy. We show here, by implementing and testing three categorically different state-of-the-art MGR systems, that classification accuracy does not necessarily reflect the capacity of a system to recognize genre in musical signals. We argue that a more comprehensive analysis of behavior at the level of the music is needed to address the problem of MGR, and that measuring classification accuracy obscures the aim of MGR: to select labels indistinguishable from those a person would choose.
Original language: English
Journal: Journal of Intelligent Information Systems
Volume: 41
Issue number: 3
Pages (from-to): 371-406
Number of pages: 36
ISSN: 0925-9902
DOI: 10.1007/s10844-013-0250-y
Publication status: Published - 2013


Cite this

@article{43faf99827754c67a7c17e81eab06696,
title = "Classification Accuracy Is Not Enough: On the Evaluation of Music Genre Recognition Systems",
abstract = "A recent review of the research literature evaluating music genre recognition (MGR) systems over the past two decades shows that most works (81\%) measure the capacity of a system to recognize genre by its classification accuracy. We show here, by implementing and testing three categorically different state-of-the-art MGR systems, that classification accuracy does not necessarily reflect the capacity of a system to recognize genre in musical signals. We argue that a more comprehensive analysis of behavior at the level of the music is needed to address the problem of MGR, and that measuring classification accuracy obscures the aim of MGR: to select labels indistinguishable from those a person would choose.",
author = "Sturm, {Bob L.}",
year = "2013",
doi = "10.1007/s10844-013-0250-y",
language = "English",
volume = "41",
pages = "371--406",
journal = "Journal of Intelligent Information Systems",
issn = "0925-9902",
publisher = "Springer",
number = "3",
}

Classification Accuracy Is Not Enough: On the Evaluation of Music Genre Recognition Systems. / Sturm, Bob L.

In: Journal of Intelligent Information Systems, Vol. 41, No. 3, 2013, p. 371-406.

TY - JOUR

T1 - Classification Accuracy Is Not Enough

T2 - On the Evaluation of Music Genre Recognition Systems

AU - Sturm, Bob L.

PY - 2013

Y1 - 2013

N2 - A recent review of the research literature evaluating music genre recognition (MGR) systems over the past two decades shows that most works (81%) measure the capacity of a system to recognize genre by its classification accuracy. We show here, by implementing and testing three categorically different state-of-the-art MGR systems, that classification accuracy does not necessarily reflect the capacity of a system to recognize genre in musical signals. We argue that a more comprehensive analysis of behavior at the level of the music is needed to address the problem of MGR, and that measuring classification accuracy obscures the aim of MGR: to select labels indistinguishable from those a person would choose.

AB - A recent review of the research literature evaluating music genre recognition (MGR) systems over the past two decades shows that most works (81%) measure the capacity of a system to recognize genre by its classification accuracy. We show here, by implementing and testing three categorically different state-of-the-art MGR systems, that classification accuracy does not necessarily reflect the capacity of a system to recognize genre in musical signals. We argue that a more comprehensive analysis of behavior at the level of the music is needed to address the problem of MGR, and that measuring classification accuracy obscures the aim of MGR: to select labels indistinguishable from those a person would choose.

U2 - 10.1007/s10844-013-0250-y

DO - 10.1007/s10844-013-0250-y

M3 - Journal article

VL - 41

SP - 371

EP - 406

JO - Journal of Intelligent Information Systems

JF - Journal of Intelligent Information Systems

SN - 0925-9902

IS - 3

ER -