Are deep neural networks really learning relevant features?

Corey Kereliuk, Bob L. Sturm, Jan Larsen

Research output: Contribution to conference without publisher/journal; Conference abstract for conference; Research; peer-review

Abstract

In recent years, deep neural networks (DNNs) have become a popular choice for audio content analysis. This may be attributed to various factors including advancements in training algorithms, computational power, and the potential for DNNs to implicitly learn a set of feature detectors. We have recently re-examined two works \cite{sigtiaimproved}\cite{hamel2010learning} that consider DNNs for the task of music genre recognition (MGR). These papers conclude that frame-level features learned by DNNs offer an improvement over traditional, hand-crafted features such as Mel-frequency cepstral coefficients (MFCCs). However, these conclusions were drawn based on training/testing using the GTZAN dataset, which is now known to contain several flaws including replicated observations and artists \cite{sturm2012analysis}. We illustrate how considering these flaws dramatically changes the results, which leads one to question the degree to which the learned frame-level features are actually useful for MGR. We make available a reproducible software package allowing other researchers to completely duplicate our figures and results.
Original language: English
Publication date: 2015
Publication status: Published - 2015
Event: Digital Music Research Network 9 - Queen Mary University of London, London, United Kingdom
Duration: 16 Dec 2014 to 16 Dec 2014

Workshop

Workshop: Digital Music Research Network 9
Location: Queen Mary University of London
Country: United Kingdom
City: London
Period: 16/12/2014 to 16/12/2014

Cite this

Kereliuk, C., Sturm, B. L., & Larsen, J. (2015). Are deep neural networks really learning relevant features? Abstract from Digital Music Research Network 9, London, United Kingdom.
Kereliuk, Corey ; Sturm, Bob L. ; Larsen, Jan. / Are deep neural networks really learning relevant features? Abstract from Digital Music Research Network 9, London, United Kingdom.
@conference{d96b54366685463cae9dbc5458cad2e1,
title = "Are deep neural networks really learning relevant features?",
abstract = "In recent years, deep neural networks (DNNs) have become a popular choice for audio content analysis. This may be attributed to various factors including advancements in training algorithms, computational power, and the potential for DNNs to implicitly learn a set of feature detectors. We have recently re-examined two works \cite{sigtiaimproved}\cite{hamel2010learning} that consider DNNs for the task of music genre recognition (MGR). These papers conclude that frame-level features learned by DNNs offer an improvement over traditional, hand-crafted features such as Mel-frequency cepstral coefficients (MFCCs). However, these conclusions were drawn based on training/testing using the GTZAN dataset, which is now known to contain several flaws including replicated observations and artists \cite{sturm2012analysis}. We illustrate how considering these flaws dramatically changes the results, which leads one to question the degree to which the learned frame-level features are actually useful for MGR. We make available a reproducible software package allowing other researchers to completely duplicate our figures and results.",
author = "Corey Kereliuk and Sturm, {Bob L.} and Jan Larsen",
year = "2015",
language = "English",
note = "Digital Music Research Network 9; Conference date: 16-12-2014 Through 16-12-2014",
}

Kereliuk, C, Sturm, BL & Larsen, J 2015, 'Are deep neural networks really learning relevant features?' Digital Music Research Network 9, London, United Kingdom, 16/12/2014 - 16/12/2014.

Are deep neural networks really learning relevant features? / Kereliuk, Corey; Sturm, Bob L.; Larsen, Jan.

2015. Abstract from Digital Music Research Network 9, London, United Kingdom.

TY - ABST

T1 - Are deep neural networks really learning relevant features?

AU - Kereliuk, Corey

AU - Sturm, Bob L.

AU - Larsen, Jan

PY - 2015

Y1 - 2015

AB - In recent years, deep neural networks (DNNs) have become a popular choice for audio content analysis. This may be attributed to various factors including advancements in training algorithms, computational power, and the potential for DNNs to implicitly learn a set of feature detectors. We have recently re-examined two works \cite{sigtiaimproved}\cite{hamel2010learning} that consider DNNs for the task of music genre recognition (MGR). These papers conclude that frame-level features learned by DNNs offer an improvement over traditional, hand-crafted features such as Mel-frequency cepstral coefficients (MFCCs). However, these conclusions were drawn based on training/testing using the GTZAN dataset, which is now known to contain several flaws including replicated observations and artists \cite{sturm2012analysis}. We illustrate how considering these flaws dramatically changes the results, which leads one to question the degree to which the learned frame-level features are actually useful for MGR. We make available a reproducible software package allowing other researchers to completely duplicate our figures and results.

M3 - Conference abstract for conference

ER -

Kereliuk C, Sturm BL, Larsen J. Are deep neural networks really learning relevant features? 2015. Abstract from Digital Music Research Network 9, London, United Kingdom.