Are deep neural networks really learning relevant features?

Corey Kereliuk, Bob L. Sturm, Jan Larsen

Publication: Contribution to conference without publisher/journal; Conference abstract for conference; Research; Peer-reviewed


In recent years, deep neural networks (DNNs) have become a popular choice for audio content analysis. This may be attributed to various factors, including advancements in training algorithms, computational power, and the potential for DNNs to implicitly learn a set of feature detectors. We have recently re-examined two works \cite{sigtiaimproved}\cite{hamel2010learning} that consider DNNs for the task of music genre recognition (MGR). These papers conclude that frame-level features learned by DNNs offer an improvement over traditional, hand-crafted features such as Mel-frequency cepstral coefficients (MFCCs). However, these conclusions were drawn from training and testing on the GTZAN dataset, which is now known to contain several flaws, including replicated observations and artists \cite{sturm2012analysis}. We illustrate how accounting for these flaws dramatically changes the results, which calls into question the degree to which the learned frame-level features are actually useful for MGR. We make available a reproducible software package that allows other researchers to completely duplicate our figures and results.
Status: Published - 2015
Event: Digital Music Research Network 9 - Queen Mary University of London, London, United Kingdom
Duration: 16 Dec 2014 - 16 Dec 2014


Workshop: Digital Music Research Network 9
Location: Queen Mary University of London


  • CoSound

    Christensen, M. G., Tan, Z., Jensen, S. H. & Sturm, B. L.


    Projects: Project, Research