High-Level Analysis of Audio Features for Identifying Emotional Valence in Human Singing

Stuart Cunningham; Jonathan Weinel; Richard Picking

doi:10.1145/3243274.3243313

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

4 Citations (Scopus)

Abstract

Emotional analysis continues to receive much attention in the audio and music community. Linking human affective state to the emotional content or intention of musical audio has applications in areas such as improving the user experience of digital music libraries and music therapy. Less work has been directed at the emotional analysis of human a cappella singing. The recently released Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) includes emotionally validated human singing samples. In this work, we apply established audio analysis features to determine whether they can be used to detect the underlying emotional valence in human singing. Results indicate that the short-term audio features energy, spectral centroid (mean), spectral centroid (spread), spectral entropy, spectral flux, spectral rolloff, and fundamental frequency can be useful predictors of emotion, although their efficacy is not consistent across positive and negative emotions.

Original language	English
Title of host publication	AM'18 Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion
Number of pages	4
Publisher	Association for Computing Machinery
Publication date	2018
Article number	37
ISBN (Electronic)	978-1-4503-6609-0
DOIs	https://doi.org/10.1145/3243274.3243313
Publication status	Published - 2018
Event	Audio Mostly 2018: Sound in Immersion and Emotion, Wrexham Glyndwr University, Wrexham, United Kingdom. Duration: 12 Sept 2018 → 14 Sept 2018. http://audiomostly.com/

Conference

Conference	Audio Mostly 2018
Location	Wrexham Glyndwr University
Country/Territory	United Kingdom
City	Wrexham
Period	12/09/2018 → 14/09/2018
Internet address	http://audiomostly.com/

Cite this

@inproceedings{6bb73b95a559470dbcc655c60f35ced3,

title = "High-Level Analysis of Audio Features for Identifying Emotional Valence in Human Singing",

author = "Stuart Cunningham and Jonathan Weinel and Richard Picking",

year = "2018",

doi = "10.1145/3243274.3243313",

language = "English",

booktitle = "AM'18 Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion",

publisher = "Association for Computing Machinery",

address = "United States",

note = "Audio Mostly 2018 : Sound in Immersion and Emotion ; Conference date: 12-09-2018 Through 14-09-2018",

url = "http://audiomostly.com/",

}

High-Level Analysis of Audio Features for Identifying Emotional Valence in Human Singing. / Cunningham, Stuart; Weinel, Jonathan; Picking, Richard.
AM'18 Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion. Association for Computing Machinery, 2018. 37.

TY - GEN

T1 - High-Level Analysis of Audio Features for Identifying Emotional Valence in Human Singing

AU - Cunningham, Stuart

AU - Weinel, Jonathan

AU - Picking, Richard

PY - 2018

Y1 - 2018

U2 - 10.1145/3243274.3243313

DO - 10.1145/3243274.3243313

M3 - Article in proceeding

BT - AM'18 Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion

PB - Association for Computing Machinery

T2 - Audio Mostly 2018

Y2 - 12 September 2018 through 14 September 2018

ER -
