Changes in Facial Expression as Biometric: A Database and Benchmarks of Identification

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceedings › Research › peer-reviewed

Abstract

Facial dynamics can be considered unique signatures for discriminating between people. They have become an important topic since many devices can be unlocked using face recognition or verification. In this work, we evaluate the efficacy of the transition frames of an emotion video, as compared to the peak emotion frames, for identification. For the experiments with transition frames, we extract features from each frame of the video using a fine-tuned VGG-Face Convolutional Neural Network (CNN), together with geometric features from facial landmark points. To model the temporal context of the transition frames, we train a Long Short-Term Memory (LSTM) network on the geometric and the CNN features. Furthermore, we employ two fusion strategies: first, an early fusion, in which the geometric and the CNN features are stacked and fed to the LSTM; second, a late fusion, in which the predictions of the LSTMs, trained independently on the two feature types, are stacked and used with a Support Vector Machine (SVM). Experimental results show that the late fusion strategy gives the best results, and that the transition frames give better identification results than the peak emotion frames.
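The two fusion strategies in the abstract can be sketched in a few lines. The following is a minimal illustration only, using random placeholder vectors in place of the real VGG-Face CNN and landmark-geometry features, and a dummy scorer standing in for each trained LSTM; all dimensions and names are assumptions, not values from the paper.

```python
# Sketch of early vs. late fusion for per-frame features of one clip.
# All features, dimensions, and scorers are illustrative placeholders.
import random

random.seed(0)

T = 10            # number of transition frames in a clip (assumed)
CNN_DIM = 4       # stand-in for the VGG-Face CNN feature size
GEO_DIM = 3       # stand-in for the landmark-geometry feature size
N_SUBJECTS = 5    # identities to score (assumed)

# Per-frame features for one clip (placeholders for the real extractors).
cnn_feats = [[random.random() for _ in range(CNN_DIM)] for _ in range(T)]
geo_feats = [[random.random() for _ in range(GEO_DIM)] for _ in range(T)]

# Early fusion: concatenate the two feature vectors frame by frame,
# giving the single sequence that would be fed to one LSTM.
early_seq = [c + g for c, g in zip(cnn_feats, geo_feats)]
assert len(early_seq) == T and len(early_seq[0]) == CNN_DIM + GEO_DIM

def dummy_lstm_scores(seq, n_classes):
    """Stand-in for a trained LSTM: reduce the sequence to one
    per-subject score vector (here, a trivial function of the mean)."""
    mean = sum(sum(frame) for frame in seq) / len(seq)
    return [mean * (i + 1) for i in range(n_classes)]

# Late fusion: each stream's LSTM produces per-subject scores; the two
# score vectors are stacked and would then be classified by an SVM
# (the SVM itself is omitted from this sketch).
scores_cnn = dummy_lstm_scores(cnn_feats, N_SUBJECTS)
scores_geo = dummy_lstm_scores(geo_feats, N_SUBJECTS)
late_input = scores_cnn + scores_geo
assert len(late_input) == 2 * N_SUBJECTS
```

The sketch only shows where the stacking happens in each strategy: before the temporal model (early) or after it, at the score level (late).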

Details

Original language: English
Title of host publication: Proc. of the 13th IEEE Conf. on Automatic Face and Gesture Recognition Workshop
Publisher: IEEE
Publication date: 2 Mar 2018
Publication status: Accepted/In press - 2 Mar 2018
Publication category: Research
Peer-reviewed: Yes
Event: IEEE Conf. on Automatic Face and Gesture Recognition Workshops - Xi'an, China
Duration: 15 May 2018 - 19 May 2018
https://fg2018.cse.sc.edu

Conference

Conference: IEEE Conf. on Automatic Face and Gesture Recognition Workshops
Country: China
City: Xi'an
Period: 15/05/2018 - 19/05/2018
Internet address: https://fg2018.cse.sc.edu

Research areas

  • Facial expression, biometric, database, benchmark, Deep Learning, CNN, LSTM, Multimodal, Spatio-temporal, SVM
