Using point-set compression to classify folk songs

Publication: Contribution to book/anthology/report/conference proceeding > Conference article in proceeding > Research > peer-reviewed



Thirteen different compression algorithms were used to calculate the normalized compression distances (NCDs) between pairs of tunes in the Annotated Corpus of 360 Dutch folk songs from the collection Onder de groene linde. These NCDs were then used in conjunction with the 1-nearest-neighbour algorithm and leave-one-out cross-validation to classify the 360 melodies into tune families. The classifications produced by the algorithms were compared with a ground-truth classification prepared by expert musicologists. Twelve of the thirteen compressors used in the experiment were based on the discovery of translational equivalence classes (TECs) of maximal translatable patterns (MTPs) in point-set representations of the melodies. The twelve algorithms consisted of four variants of each of three basic algorithms, COSIATEC, SIATECCompress and Forth’s algorithm. The main difference between these algorithms is that COSIATEC strictly partitions the input point set into TEC covered sets, whereas the TEC covered sets in the output of SIATECCompress and Forth’s algorithm may share points. The general-purpose compressor, bzip2, was used as a baseline against which the point-set compression algorithms were compared. The highest classification success rate of 77–84% was achieved by COSIATEC, followed by 60–64% for Forth’s algorithm and then 52–58% for SIATECCompress. When the NCDs were calculated using bzip2, the success rate was only 12.5%. The results demonstrate that the effectiveness of NCD for measuring similarity between folk songs for classification purposes is highly dependent upon the actual compressor chosen. Furthermore, it seems that compressors based on finding maximal repeated patterns in point-set representations of music show more promise for NCD-based music classification than general-purpose compressors designed for compressing text strings.
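The evaluation pipeline described in the abstract (NCD between pairs of items, then 1-nearest-neighbour classification scored by leave-one-out cross-validation) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it uses Python's standard `bz2` module as the compressor, the standard NCD definition, NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), and hypothetical byte strings in place of real melody encodings. The function names (`compressed_size`, `ncd`, `loo_1nn_accuracy`) are made up for this sketch.

```python
import bz2


def compressed_size(data: bytes) -> int:
    """Length in bytes of the bzip2-compressed input (the C(.) in the NCD formula)."""
    return len(bz2.compress(data))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings, using bzip2."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation of both inputs
    return (cxy - min(cx, cy)) / max(cx, cy)


def loo_1nn_accuracy(items, labels, dist=ncd) -> float:
    """Leave-one-out 1-nearest-neighbour accuracy.

    Each item is classified with the label of its nearest *other* item
    under `dist`. Requires at least two items.
    """
    correct = 0
    for i, item in enumerate(items):
        others = (k for k in range(len(items)) if k != i)
        nearest = min(others, key=lambda k: dist(item, items[k]))
        if labels[nearest] == labels[i]:
            correct += 1
    return correct / len(items)


if __name__ == "__main__":
    # Hypothetical toy "melodies" encoded as bytes; real experiments would
    # use encodings of the actual songs and their tune-family labels.
    tunes = [b"CDEFGABC" * 20, b"CDEFGABD" * 20, b"GGAAGGEE" * 20, b"GGAAGGED" * 20]
    families = ["f1", "f1", "f2", "f2"]
    print(loo_1nn_accuracy(tunes, families))
```

Passing a different `dist` function is where a point-set compressor would be swapped in for bzip2 in this sketch; the classification and scoring logic stay the same.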
Title: Proceedings of the Fourth International Workshop on Folk Music Analysis (FMA2014)
Editors: Andre Holzapfel
Number of pages: 7
Publisher: Computer Engineering Department, Boğaziçi University
Status: Published - 2014
Event: International Workshop on Folk Music Analysis - Bogazici University, Istanbul, Turkey
Duration: 12 Jun 2014 to 13 Jun 2014
Conference number: 4


Conference: International Workshop on Folk Music Analysis
Location: Bogazici University

