Towards Statistical Unsupervised Online Learning for Music Listening with Hearing Devices

Hendrik Purwins, Marco Marchini, Richard Marxer

Research output: Contribution to conference (paper without publisher/journal), peer-reviewed


Abstract

The same piece of music may be experienced differently by different subjects, depending on various factors: the perceptual processing of the piece (e.g. 3500 inner hair cells vs. a few electrodes in a cochlear implant) may allow for representations with varying degrees of frequency and temporal resolution; the listener's cognitive capabilities, e.g. memory, voluntary direction of attention, abstraction, and inference (a normal-hearing listener vs. a person with dementia), will lead to different cognitive processing and musical experience; and familiarity with a particular musical style, instrument, or piece may affect its emotional impact and intellectual understanding.
It could be desirable for a hearing device (hearing aid or cochlear implant) to perform musical signal processing that transfers the musical experience a normal-hearing listener has of a piece into a similar musical experience of the same piece for a listener with severe hearing limitations. Such a transformation would require a formal representation of the music piece that is scalable in complexity. Such a scalable representation is provided in [1], where both the number of distinct sounds (e.g. treating open and closed hi-hats as one or two sound categories) and the temporal context horizon (e.g. storing up to 2-note or up to 10-note sequences) are adaptable. The framework in [1] is based on two cognitively plausible principles: unsupervised learning and statistical learning. In contrast to supervised learning in primary school children, where the teacher points at a written letter and articulates its phonetic pronunciation, infants perceptually organise the phonemes they are exposed to into groups, based on the phonemes' similarity and context (unsupervised learning). 8-month-old infants are able to learn statistical relationships between neighboring speech sounds [3] (statistical learning). In [1] (Figure 1), unsupervised learning is implemented as agglomerative clustering, informed by the Gestalt principle of regularity; statistical learning is performed with variable-length Markov chains. In [2], the grouping of sounds into phonetic/instrument categories and the learning of instrument event sequences are performed jointly, using a Hierarchical Dirichlet Process Hidden Markov Model.
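The idea of statistical learning over sound-event sequences with a variable context length can be illustrated with a toy sketch. This is a hypothetical minimal model, not the implementation from [1]: it stores transition counts for all contexts up to a maximum order and predicts with the longest context observed so far (simple back-off); the class name `VLMarkov` and the symbolic events K/S/H (kick, snare, hi-hat) are illustrative assumptions.

```python
from collections import defaultdict

class VLMarkov:
    """Minimal variable-length Markov chain over symbolic sound events.
    Counts successors of every context up to max_order; prediction
    backs off from the longest matching context to shorter ones."""

    def __init__(self, max_order=3):
        self.max_order = max_order
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, sequence):
        for i, sym in enumerate(sequence):
            for k in range(self.max_order + 1):
                if i - k < 0:
                    break
                context = tuple(sequence[i - k:i])
                self.counts[context][sym] += 1

    def predict(self, history):
        # Back off from the longest matching context to shorter ones.
        for k in range(min(self.max_order, len(history)), -1, -1):
            context = tuple(history[len(history) - k:])
            if context in self.counts:
                successors = self.counts[context]
                return max(successors, key=successors.get)
        return None

# Percussion-like event sequence: kick (K), hi-hat (H), snare (S).
model = VLMarkov(max_order=2)
model.train(list("KHSHKHSH"))
print(model.predict(list("KH")))  # context (K, H) was always followed by S
```

A larger `max_order` corresponds to the longer temporal context horizon described above (e.g. 10-note instead of 2-note sequences), at the cost of sparser counts.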
Whereas machines often learn by processing a large database and only subsequently updating the parameters of the algorithm, humans learn instantaneously, i.e. the mental representation is continuously updated after every exposure to small batches of sound events (online learning). In [2], online learning is implemented via the interplay of Cobweb clustering and a hierarchical n-gram, instantaneously updating the number of timbre groups and their respective transition counts. We propose to use online learning for the co-evolution of both the cochlear implant (CI) user and the machine in (re-)learning musical language.
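The interplay of incremental clustering and transition-count updates can be sketched as follows. This is a simplified stand-in, not the Cobweb/hierarchical-n-gram model of [2]: each incoming feature vector is assigned to its nearest centroid (spawning a new timbre group if it lies far from all of them), the centroid is nudged toward the event, and a bigram transition count is updated — all in one pass per event. The function name `online_step`, the distance threshold `new_cluster_dist`, and the 2-D toy features are assumptions for illustration.

```python
import math
from collections import defaultdict

def nearest(centroids, x):
    """Index and distance of the centroid closest to feature vector x."""
    dists = [math.dist(c, x) for c in centroids]
    i = min(range(len(dists)), key=dists.__getitem__)
    return i, dists[i]

def online_step(centroids, ncounts, transitions, prev_label, x,
                new_cluster_dist=1.0):
    """One online-learning step: assign event x to a timbre cluster
    (creating a new cluster if x is far from all centroids), update
    that centroid's incremental mean, and count the bigram transition
    from the previous event's label. Returns the label of x."""
    if not centroids:
        centroids.append(list(x)); ncounts.append(1)
        label = 0
    else:
        i, d = nearest(centroids, x)
        if d > new_cluster_dist:
            centroids.append(list(x)); ncounts.append(1)
            label = len(centroids) - 1
        else:
            ncounts[i] += 1
            # Incremental mean update of the centroid.
            centroids[i] = [c + (xj - c) / ncounts[i]
                            for c, xj in zip(centroids[i], x)]
            label = i
    if prev_label is not None:
        transitions[(prev_label, label)] += 1
    return label

# Stream of 2-D timbre features: three kick-like and one snare-like event.
centroids, ncounts = [], []
transitions = defaultdict(int)
prev = None
for x in [(0.0, 0.0), (0.1, 0.0), (2.0, 2.0), (0.05, 0.05)]:
    prev = online_step(centroids, ncounts, transitions, prev, x)
print(len(centroids), dict(transitions))  # two timbre groups emerge
```

Both the number of clusters and the transition counts change after every single event, which is the defining property of online learning contrasted with batch learning above.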
[1] M. Marchini and H. Purwins. Unsupervised analysis and generation of audio percussion sequences. In International Symposium on Computer Music Modeling and Retrieval, pages 205–218. Springer, 2011.
[2] R. Marxer and H. Purwins. Unsupervised incremental online learning and prediction of musical audio signals. IEEE Transactions on Audio, Speech, and Language Processing, 24(5):863–874, 2016.
[3] J. Saffran, R. Aslin, and E. Newport. Statistical learning by 8-month-old infants. Science, 274(5294):1926–1928, 1996.
Original language: English
Publication date: 2017
Publication status: Published - 2017
Event: The Music & Cochlear Implants Symposium, Eriksholm Research Centre, Snekkersten, Denmark
Duration: 13 Oct 2016 – 14 Oct 2016
http://implantsandmusic.net
http://www.implantsandmusic.net/?p=start


Keywords

  • Cochlear implant
  • Machine Learning
  • Neural Networks
