Multimodal Sentiment and Personality Perception Under Speech: A Comparison of Transformer-based Architectures

Ádám Fodor, Rachid R. Saboundji, Julio C.S. Jacques Junior, Sergio Escalera, David Gallardo, Andras Lorincz

Research output: Contribution to journal, Conference article in Journal, Research, peer-review

2 Citations (Scopus)

Abstract

Human-machine and human-robot interaction and collaboration appear in diverse fields, from homecare to Cyber-Physical Systems. Technological development is fast, whereas real-time methods for social communication analysis that can measure small changes in sentiment and personality states across the visual, acoustic, and language modalities are lagging, particularly when the goal is to build robust, appearance-invariant, and fair methods. We study and compare methods capable of fusing modalities while satisfying real-time and appearance-invariance conditions. We compare state-of-the-art transformer architectures in sentiment estimation and introduce them in the much less explored field of personality perception. We show that the architectures perform differently on automatic sentiment and personality perception, suggesting that each task may be better modeled by a particular method. Our work calls attention to the attractive properties of the linear versions of the transformer architectures. In particular, we show that the best results are achieved by fusing the different architectures' preprocessing methods. However, with quadratic transformers, these fused methods place extreme demands on computation power and energy consumption for real-time computation because of their memory requirements. In turn, linear transformers pave the way for quantifying small changes in sentiment estimation and personality perception in real-time social communication for machines and robots.
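
The memory contrast between quadratic and linear transformers mentioned in the abstract can be illustrated with a rough back-of-the-envelope sketch. It is not taken from the paper: the function names, sequence lengths, head dimension, and 4-byte floats are illustrative assumptions, and the linear variant is assumed to be a kernelized attention that keeps a d x d key-value summary plus a d-dimensional normalizer per head instead of an n x n score matrix.

```python
def quadratic_attention_floats(seq_len: int) -> int:
    """Entries in the n x n score matrix of one softmax-attention head."""
    return seq_len * seq_len


def linear_attention_floats(head_dim: int) -> int:
    """Entries kept by one kernelized (linear) attention head.

    Assumes a d x d key-value summary plus a d-dimensional normalizer,
    with the feature-map dimension equal to the head dimension.
    """
    return head_dim * head_dim + head_dim


def to_mib(num_floats: int, bytes_per_float: int = 4) -> float:
    """Convert a count of floats to mebibytes (float32 by default)."""
    return num_floats * bytes_per_float / (1024 ** 2)


if __name__ == "__main__":
    head_dim = 64  # hypothetical head dimension
    for seq_len in (512, 2048, 8192):  # hypothetical sequence lengths
        quad = to_mib(quadratic_attention_floats(seq_len))
        lin = to_mib(linear_attention_floats(head_dim))
        print(f"n={seq_len}: quadratic {quad:.2f} MiB, linear {lin:.4f} MiB per head")
```

Under these assumptions the quadratic term grows with the square of the sequence length, while the linear variant's per-head state stays constant, which is consistent with the abstract's point about real-time use.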

Original language: English
Book series: Proceedings of Machine Learning Research
Volume: 173
Pages (from-to): 218-241
Number of pages: 24
ISSN: 2640-3498
Publication status: Published - 2021
Event: ChaLearn LAP Challenge on Understanding Social Behavior in Dyadic and Small Group Interactions Workshop, DYAD 2021, held in conjunction with the International Conference on Computer Vision, ICCV 2021 - Virtual, Online
Duration: 16 Oct 2021 → …

Conference

Conference: ChaLearn LAP Challenge on Understanding Social Behavior in Dyadic and Small Group Interactions Workshop, DYAD 2021, held in conjunction with the International Conference on Computer Vision, ICCV 2021
City: Virtual, Online
Period: 16/10/2021 → …

Bibliographical note

Funding Information:
This work has been partially supported by EU H2020 project Humane AI Net (grant agreement No. 952026), by Spanish project PID2019-105093GB-I00 and by ICREA under the ICREA Academia programme.

Publisher Copyright:
© 2022 Á. Fodor, R.R. Saboundji, J.C.S.J. Junior, S. Escalera, D. Gallardo & A. Lorincz.

Keywords

  • Fairness
  • Linear Transformers
  • Multimodal information fusion
  • Personality perception
  • Sentiment analysis
