Augmented Video Conferencing By Multimodal Emotion Recognition

Teachers: Denis Lalanne
Student: Samaneh Soleimani
Project status: Finished
Year: 2013

In this research, we describe a multimodal collaborative application that is augmented with affective signals so that it can react to the user’s behavior and emotional state in a video conferencing context. The system, called Emotiboard, supports the fusion of two affective modalities: speech signals and physiological changes, namely galvanic skin response. The application also aims to represent affective responses that relate to aesthetic and usability impressions. To this end, it provides several visualizations that correspond to different usages and functionalities. Additionally, we performed an early evaluation of the system, from both a technical perspective and a user experience point of view. Post-hoc questionnaire responses were generally consistent with the data from multimodal affective processing, and users rated the overall experience as positive and useful.
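The abstract does not say how the two modalities are fused. As an illustration only, the sketch below shows one common option, confidence-weighted late fusion, in which each modality produces its own estimate and the estimates are then averaged. The `ModalityEstimate` type, the `fuse` function, the [0, 1] arousal scale and the confidence values are all assumptions made for this example and are not taken from Emotiboard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModalityEstimate:
    """Hypothetical per-modality output (not Emotiboard's actual data model)."""
    arousal: float      # assumed scale: 0.0 (calm) to 1.0 (excited)
    confidence: float   # assumed scale: 0.0 (no trust) to 1.0 (full trust)


def fuse(speech: ModalityEstimate, gsr: ModalityEstimate) -> Optional[float]:
    """Confidence-weighted average of the two modality estimates.

    Returns None when neither modality reports any confidence.
    """
    total = speech.confidence + gsr.confidence
    if total == 0:
        return None
    weighted = speech.arousal * speech.confidence + gsr.arousal * gsr.confidence
    return weighted / total


# Usage: speech suggests high arousal, skin response suggests moderate arousal.
speech_estimate = ModalityEstimate(arousal=0.8, confidence=0.5)
gsr_estimate = ModalityEstimate(arousal=0.4, confidence=0.5)
fused_arousal = fuse(speech_estimate, gsr_estimate)
print(fused_arousal)
```

Fusing at the decision level keeps the two recognizers independent, so one modality can drop out (for example, when the user is silent) without blocking the other.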

From a usability point of view, more than half of the evaluators liked the system overall and believed that such a system is useful. Most of them found the system easy to use, requiring no in-depth knowledge or technical support. In contrast, they were not satisfied with the graphical representation of the emotions or with the consistency of the system's overall functionality.

Regarding the credibility and performance of the automatic emotion recognition, we conducted an experiment on the SEMAINE-SAL (Sustained Emotionally colored Machine-human Interaction using Nonverbal Expression-Sensitive Artificial Listening) corpus, which consists of emotional conversations. In this evaluation, an SVM trained with the SMO learning algorithm is used for classification, as it provides good generalization properties even for a large feature vector. The results of this evaluation are also provided; as a first experiment, it did not achieve good performance.
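To make the classification step more concrete, the following is a minimal sketch of a linear SVM trained with a simplified form of SMO (sequential minimal optimization), which repeatedly picks a pair of Lagrange multipliers and optimizes them analytically. It is not the project's implementation. The two-dimensional toy features, the +1/-1 labels, and the `train_smo` and `predict` helpers are assumptions made for illustration; the actual experiment used large feature vectors extracted from the SEMAINE-SAL recordings.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train_smo(X, y, C=1.0, tol=1e-3, max_passes=5):
    """Simplified SMO for a linear soft-margin SVM; labels must be +1 or -1."""
    n = len(X)
    K = [[dot(X[i], X[j]) for j in range(n)] for i in range(n)]  # linear kernel
    alpha = [0.0] * n
    b = 0.0

    def error(i):
        # Decision value for training point i minus its label.
        return sum(alpha[k] * y[k] * K[k][i] for k in range(n)) + b - y[i]

    passes = 0
    while passes < max_passes:
        changed = 0
        for i in range(n):
            Ei = error(i)
            violates_kkt = (y[i] * Ei < -tol and alpha[i] < C) or \
                           (y[i] * Ei > tol and alpha[i] > 0)
            if not violates_kkt:
                continue
            # Second multiplier: the point whose error differs most from Ei.
            j = max((k for k in range(n) if k != i),
                    key=lambda k: abs(Ei - error(k)))
            Ej = error(j)
            ai, aj = alpha[i], alpha[j]
            # Box constraints that keep both multipliers inside [0, C].
            if y[i] != y[j]:
                low, high = max(0.0, aj - ai), min(C, C + aj - ai)
            else:
                low, high = max(0.0, ai + aj - C), min(C, ai + aj)
            eta = 2 * K[i][j] - K[i][i] - K[j][j]
            if low == high or eta >= 0:
                continue
            aj_new = min(high, max(low, aj - y[j] * (Ei - Ej) / eta))
            if abs(aj_new - aj) < 1e-5:
                continue
            ai_new = ai + y[i] * y[j] * (aj - aj_new)
            # Update the bias so the KKT conditions hold for the changed pair.
            b1 = b - Ei - y[i] * (ai_new - ai) * K[i][i] - y[j] * (aj_new - aj) * K[i][j]
            b2 = b - Ej - y[i] * (ai_new - ai) * K[i][j] - y[j] * (aj_new - aj) * K[j][j]
            if 0 < ai_new < C:
                b = b1
            elif 0 < aj_new < C:
                b = b2
            else:
                b = (b1 + b2) / 2
            alpha[i], alpha[j] = ai_new, aj_new
            changed += 1
        passes = passes + 1 if changed == 0 else 0
    return alpha, b


def predict(X, y, alpha, b, x):
    score = sum(alpha[k] * y[k] * dot(X[k], x) for k in range(len(X))) + b
    return 1 if score >= 0 else -1


# Toy data: +1 and -1 stand in for two hypothetical emotion classes.
X_toy = [(2.0, 2.0), (3.0, 4.0), (-2.0, -2.0), (-4.0, -3.0)]
y_toy = [1, 1, -1, -1]
alpha, b = train_smo(X_toy, y_toy)
print([predict(X_toy, y_toy, alpha, b, x) for x in X_toy])
```

Only the training points with nonzero multipliers (the support vectors) contribute to the decision function, which is one reason SVMs can generalize reasonably well even when the feature vector is large.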