Using point-set compression to classify folk songs

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

Abstract

Thirteen different compression algorithms were used to calculate the normalized compression distances (NCDs) between pairs of tunes in the Annotated Corpus of 360 Dutch folk songs from the collection Onder de groene linde. These NCDs were then used in conjunction with the 1-nearest-neighbour algorithm and leave-one-out cross-validation to classify the 360 melodies into tune families. The classifications produced by the algorithms were compared with a ground-truth classification prepared by expert musicologists. Twelve of the thirteen compressors used in the experiment were based on the discovery of translational equivalence classes (TECs) of maximal translatable patterns (MTPs) in point-set representations of the melodies. The twelve algorithms consisted of four variants of each of three basic algorithms, COSIATEC, SIATECCompress and Forth’s algorithm. The main difference between these algorithms is that COSIATEC strictly partitions the input point set into TEC covered sets, whereas the TEC covered sets in the output of SIATECCompress and Forth’s algorithm may share points. The general-purpose compressor, bzip2, was used as a baseline against which the point-set compression algorithms were compared. The highest classification success rate of 77–84% was achieved by COSIATEC, followed by 60–64% for Forth’s algorithm and then 52–58% for SIATECCompress. When the NCDs were calculated using bzip2, the success rate was only 12.5%. The results demonstrate that the effectiveness of NCD for measuring similarity between folk songs for classification purposes is highly dependent upon the actual compressor chosen. Furthermore, it seems that compressors based on finding maximal repeated patterns in point-set representations of music show more promise for NCD-based music classification than general-purpose compressors designed for compressing text strings.
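The evaluation pipeline described in the abstract (pairwise NCDs, then 1-nearest-neighbour classification under leave-one-out cross-validation) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the standard NCD definition, (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), with Python's `bz2` module standing in for the bzip2 baseline. The function names, the choice to represent each melody as a byte string, and the tie-breaking rule are hypothetical and are not taken from the paper, whose melody encoding and point-set compressors (COSIATEC, SIATECCompress, Forth's algorithm) are not reproduced here.

```python
import bz2


def compressed_size(data: bytes) -> int:
    """Length in bytes of the bzip2-compressed form of `data`."""
    return len(bz2.compress(data))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Assumption: melodies have already been serialized to bytes;
    the serialization used in the paper is not specified here.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def loo_1nn_success_rate(items, labels, distance=ncd):
    """Leave-one-out 1-nearest-neighbour classification success rate.

    Each item is held out in turn and assigned the label of its
    nearest neighbour among the remaining items. Ties go to the
    neighbour with the lowest index (an arbitrary choice).
    """
    correct = 0
    for i, item in enumerate(items):
        others = (j for j in range(len(items)) if j != i)
        nearest = min(others, key=lambda j: distance(item, items[j]))
        correct += labels[nearest] == labels[i]
    return correct / len(items)
```

Passing a different `distance` function would allow another compressor to be swapped in for bzip2 without changing the classification loop, which mirrors the comparison of compressors that the abstract reports.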
Original language: English
Title of host publication: Proceedings of the Fourth International Workshop on Folk Music Analysis (FMA2014)
Editors: Andre Holzapfel
Number of pages: 7
Publisher: Computer Engineering Department, Boğaziçi University
Publication date: 2014
Pages: 29-35
Publication status: Published - 2014
Event: International Workshop on Folk Music Analysis - Bogazici University, Istanbul, Turkey
Duration: 12 Jun 2014 to 13 Jun 2014
Conference number: 4

Conference

Conference: International Workshop on Folk Music Analysis
Number: 4
Location: Bogazici University
Country/Territory: Turkey
City: Istanbul
Period: 12/06/2014 to 13/06/2014

Keywords

  • normalized compression distance
  • machine learning
  • music analysis
  • folk music
