Convolution-based classification of audio and symbolic representations of music

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

We present a novel convolution-based method for classification of audio and symbolic representations of music, which we apply to classification of music by style. Pieces of music are first sampled to pitch–time representations (piano-rolls or spectrograms) and then convolved with a Gaussian filter, before being classified by a support vector machine or by k-nearest neighbours in an ensemble of classifiers. On the well-studied task of discriminating between string quartet movements by Haydn and Mozart, we obtain accuracies that equal the state of the art on two data-sets. However, in multi-class composer identification, methods specialised for classifying symbolic representations of music are more effective. We also performed experiments on symbolic representations, synthetic audio and two different recordings of The Well-Tempered Clavier by J. S. Bach to study the method’s capacity to distinguish preludes from fugues. Our experimental results show that our approach performs similarly on symbolic representations, synthetic audio and audio recordings, setting our method apart from most previous studies that have been designed for use with either audio or symbolic data, but not both.
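The pipeline described in the abstract (sample to a pitch–time grid, convolve with a Gaussian filter, classify) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the grid size, the kernel width, the `(pitch, onset, duration)` note format, the toy pieces and the labels "chordal" and "melodic" are all assumptions made for illustration, and only a single nearest-neighbour classifier is shown, without the support vector machine or the ensemble.

```python
import math

def piano_roll(notes, lowest=48, n_pitches=36, n_steps=16):
    """Sample (pitch, onset, duration) notes to a binary pitch-time grid (assumed format)."""
    roll = [[0.0] * n_steps for _ in range(n_pitches)]
    for pitch, onset, duration in notes:
        row = pitch - lowest
        if 0 <= row < n_pitches:
            for t in range(onset, min(onset + duration, n_steps)):
                roll[row][t] = 1.0
    return roll

def gaussian_kernel(sigma=1.0, radius=2):
    """Normalised 1-D Gaussian; applying it along both axes gives a 2-D Gaussian filter."""
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_rows(grid, kernel):
    """Convolve every row with the kernel, treating cells outside the grid as zero."""
    radius = len(kernel) // 2
    out = []
    for row in grid:
        n = len(row)
        out.append([
            sum(kernel[k] * row[j + k - radius]
                for k in range(len(kernel))
                if 0 <= j + k - radius < n)
            for j in range(n)
        ])
    return out

def transpose(grid):
    return [list(col) for col in zip(*grid)]

def smooth(grid, sigma=1.0, radius=2):
    """Separable Gaussian smoothing: first along time, then along pitch."""
    kernel = gaussian_kernel(sigma, radius)
    along_time = convolve_rows(grid, kernel)
    return transpose(convolve_rows(transpose(along_time), kernel))

def features(notes):
    """Flatten the smoothed pitch-time grid into one feature vector."""
    return [v for row in smooth(piano_roll(notes)) for v in row]

def knn_predict(train, query, k=1):
    """train is a list of (feature_vector, label); majority vote among the k nearest."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

# Toy data (hypothetical): one block-chord piece and one stepwise melody.
training_pieces = {
    "chordal": [(60, 0, 8), (64, 0, 8), (67, 0, 8),
                (60, 8, 8), (65, 8, 8), (69, 8, 8)],
    "melodic": [(60, 0, 2), (62, 2, 2), (64, 4, 2), (65, 6, 2),
                (67, 8, 2), (69, 10, 2), (71, 12, 2), (72, 14, 2)],
}
train = [(features(notes), label) for label, notes in training_pieces.items()]

# A query melody that differs from the training melody by two semitone shifts.
query_notes = [(60, 0, 2), (62, 2, 2), (64, 4, 2), (66, 6, 2),
               (67, 8, 2), (69, 10, 2), (71, 12, 2), (73, 14, 2)]
prediction = knn_predict(train, features(query_notes), k=1)
print(prediction)
```

Because the Gaussian filter spreads each note over neighbouring pitches and time steps, the two shifted notes in the query still overlap with the training melody, so small differences in pitch or timing change the distance gradually rather than abruptly. The same steps could in principle be applied to a spectrogram in place of the piano-roll.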

Details

Original language: English
Journal: Journal of New Music Research
Volume: 47
Issue number: 3
Pages (from-to): 191-205
Number of pages: 15
ISSN: 0929-8215
DOI
Publication status: E-pub ahead of print - 6 May 2018
Publication category: Research
Peer-reviewed: Yes

Research areas

  • Music analysis, machine learning, convolution, composer recognition, genre recognition

ID: 273540966