The State of the Art Ten Years After a State of the Art: Future Research in Music Information Retrieval

Publication: Research - peer-reviewed › Journal article

Abstract

A decade has passed since
the first review of research on a "flagship application" of music information retrieval (MIR):
the problem of music genre recognition (MGR).
During this time, about 500 works addressing MGR have been published,
and at least 10 campaigns have been run to evaluate MGR systems,
which makes MGR one of the most researched areas of MIR.
So, where does MGR lie now?
We show that in spite of this massive amount of work,
MGR does not lie far from where it began,
and the paramount reason for this is that
most evaluation in MGR lacks validity.
We perform a case study of all published research
using the most-used benchmark dataset in MGR
during the past decade: GTZAN.
We show that none of the evaluations in these many works
is valid to produce conclusions with respect to recognizing genre,
i.e., that a system is using criteria relevant to recognize genre.
In fact, the problems of validity in evaluation also affect
research in music emotion recognition and autotagging.
We conclude by discussing the implications of our work
for MGR and MIR in the next ten years.

Details

Original language: English
Journal: Journal of New Music Research
Volume: 43
Issue number: 2
Pages (from-to): 147-172
ISSN: 0929-8215
DOI
Status: Published - 2014
Publication type: Research
Peer review: Yes

ID: 168292223