
Survey on audiovisual emotion recognition: databases, features, and data fusion strategies

Published online by Cambridge University Press:  11 November 2014

Chung-Hsien Wu*
Affiliation: Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan. Phone: +886 6 208 9349

Jen-Chun Lin
Affiliations: Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan; Institute of Information Science, Academia Sinica, Taipei, Taiwan

Wen-Li Wei
Affiliation: Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan

*Corresponding author: Chung-Hsien Wu, chunghsienwu@gmail.com

Abstract

Emotion recognition is the ability to identify what someone is feeling from moment to moment and to understand the connection between his/her feelings and expressions. In today's world, human–computer interaction (HCI) interfaces undoubtedly play an important role in our daily lives. Toward harmonious HCI, the automated analysis and recognition of human emotion has attracted increasing attention from researchers in multidisciplinary fields. This paper surveys theoretical and practical work, offering new and broad views of the latest research in emotion recognition from bimodal information, namely facial and vocal expressions. First, the currently available audiovisual emotion databases are described. Facial and vocal features and audiovisual bimodal data-fusion methods for emotion recognition are then surveyed and discussed. The survey also covers the recent emotion recognition challenges held at several conferences. The conclusions outline and address some open issues in emotion recognition.

Information

Type
Overview Paper
Creative Commons
Creative Commons License: CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © The Authors, 2014

Table 1. Audiovisual databases for emotion recognition task.


Fig. 1. Examples for the posed expression with four emotional states: (a) Neutral, (b) Happy, (c) Angry, and (d) Sad.


Fig. 2. Valence-activation 2D emotion plane [69, 70].


Fig. 3. The distributions of (a) energy and (b) pitch (Hz) for four emotional states in the posed MHMC database; N denotes the total number of frames.


Table 2. Literature review on facial–vocal expression-based emotion recognition.


Table 3. Correlations among prosodic features and emotions [80].


Table 4. An example of the 68 facial feature points extracted using AAM alignment and the related facial animation parameters.


Fig. 4. Illustration of feature-level fusion strategy for audiovisual emotion recognition.
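The idea behind feature-level fusion can be sketched in a few lines: the per-frame audio and visual feature vectors are concatenated into one joint vector before a single classifier is trained. The feature names and dimensions below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical synchronized streams: 100 frames with 12 audio features
# (e.g. prosodic/spectral) and 10 visual features (e.g. facial geometry).
# Dimensions are illustrative only.
rng = np.random.default_rng(0)
audio_feat = rng.random((100, 12))   # 100 frames x 12 audio features
visual_feat = rng.random((100, 10))  # 100 frames x 10 visual features

# Feature-level fusion: concatenate the two modality vectors frame by
# frame into a single joint feature vector for one classifier.
fused = np.concatenate([audio_feat, visual_feat], axis=1)
print(fused.shape)  # (100, 22)
```

A practical caveat this sketch glosses over: the two streams must first be aligned to a common frame rate, since audio is typically sampled far more densely than video.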


Fig. 5. Illustration of decision-level fusion strategy for audiovisual emotion recognition.
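Decision-level fusion, by contrast, runs a separate classifier per modality and only combines their outputs. A minimal sketch, assuming each unimodal classifier emits a posterior over four emotions and using a simple weighted-product combination rule (the posterior values and weights are illustrative, not from the paper):

```python
import numpy as np

EMOTIONS = ["Neutral", "Happy", "Angry", "Sad"]

# Hypothetical per-class posteriors from two independent unimodal
# classifiers (values are illustrative only).
audio_post = np.array([0.1, 0.6, 0.2, 0.1])
visual_post = np.array([0.2, 0.5, 0.2, 0.1])

# Decision-level fusion via a weighted product rule; equal weights here.
w_audio, w_visual = 0.5, 0.5
combined = (audio_post ** w_audio) * (visual_post ** w_visual)
combined /= combined.sum()  # renormalize to a distribution

emotion = EMOTIONS[int(np.argmax(combined))]
print(emotion)  # Happy
```

Weighted sums, voting, or a second-stage classifier over the unimodal scores are common alternatives to the product rule shown here.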


Fig. 6. Illustration of model-level fusion strategy for audiovisual emotion recognition.


Fig. 7. An example of the temporal phases of a happy facial expression, from the onset, through the apex, to the offset phase.


Fig. 8. An example illustrating model- and state-level alignments between audio and visual HMM sequences in the happy emotional state. The green and gray dotted lines represent the audio and visual HMM boundaries, respectively, and are used for model-level alignment estimation; the blue and red dotted lines represent the state boundaries under the audio and visual HMMs, respectively, and are used for state-level alignment estimation. The audio and image frames are represented by the numbered circles [50].

Supplementary material: Wu Supplementary Material (PDF, 167.9 KB)