
Subspace learning for facial expression recognition: an overview and a new perspective

Published online by Cambridge University Press:  14 January 2021

Cigdem Turan*
Affiliation:
Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Kowloon, Hong Kong; Department of Computer Science, Technical University of Darmstadt, Darmstadt, Germany
Rui Zhao
Affiliation:
Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Kowloon, Hong Kong
Kin-Man Lam
Affiliation:
Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Kowloon, Hong Kong
Xiangjian He
Affiliation:
Computer Science, School of Electrical and Data Engineering, University of Technology, Sydney, Australia
*Corresponding author: C. Turan. Email: cigdem.turan@connect.polyu.hk

Abstract

For image recognition, an extensive number of subspace-learning methods have been proposed to overcome the high dimensionality of the features being used. In this paper, we first give an overview of the most popular and state-of-the-art subspace-learning methods, and then present a novel manifold-learning method, named the soft locality preserving map (SLPM). SLPM aims to control the level of spread of the different classes, which is closely connected to the generalizability of the learned subspace. We also review the extension of manifold-learning methods to deep learning by formulating their loss functions for training, and further reformulate SLPM into a soft locality preserving (SLP) loss. These loss functions are applied as an additional regularization term in the training of deep neural networks. We evaluate these subspace-learning methods, as well as their deep-learning extensions, on facial expression recognition. Experiments on four commonly used databases show that SLPM effectively reduces the dimensionality of the feature vectors and enhances the discriminative power of the extracted features. Moreover, the experimental results also demonstrate that the deep features learned under SLP regularization exhibit better discriminability and generalizability for facial expression recognition.
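To make the shared idea behind the surveyed subspace-learning methods concrete, the sketch below implements a minimal Fisher-criterion projection: it maximizes between-class scatter relative to within-class scatter via a generalized eigenproblem. This is a generic illustration of the family of methods discussed, not the paper's SLPM (which additionally weights neighboring samples softly to control class spread); the function name, regularization constant, and synthetic data are assumptions for the example.

```python
import numpy as np

def fisher_subspace(X, y, dim=2, reg=1e-6):
    """Toy discriminative subspace learning: find projection W that
    maximizes between-class scatter Sb relative to within-class
    scatter Sw (the common core of many subspace-learning methods)."""
    classes = np.unique(y)
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)          # within-class scatter
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)        # between-class scatter
    # Regularize Sw so it is invertible, then solve inv(Sw) Sb w = lambda w
    evals, evecs = np.linalg.eig(np.linalg.inv(Sw + reg * np.eye(d)) @ Sb)
    order = np.argsort(-evals.real)
    W = evecs[:, order[:dim]].real
    return W  # low-dimensional embedding is Z = X @ W

# Synthetic two-class data in 10 dimensions
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 1.0, size=(50, 10)) for m in (0.0, 3.0)])
y = np.repeat([0, 1], 50)
W = fisher_subspace(X, y, dim=2)
Z = X @ W
```

Methods such as LPP, MFA, and SLPM differ mainly in how the scatter (or graph-Laplacian) matrices are built from neighborhood graphs; the eigendecomposition step remains essentially the same.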

Information

Type
Overview Paper
Creative Commons
Creative Commons License - CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © The Author(s), 2021. Published by Cambridge University Press.
Table 1. List of mathematical notations, acronyms, and their corresponding descriptions

Table 2. Comparison of the within-class graph and the between-class graph for different subspace-learning methods

Table 3. Comparison of the objective functions used by different subspace methods

Table 4. Comparison of the different deep subspace-learning regularizers

Fig. 1. Spread of the respective expression manifolds when the value of $\beta$ increases from 1 to 1000: (1) anger, (2) disgust, (3) fear, (4) happiness, (5) sadness, and (6) surprise.

Table 5. Network architecture for learning discriminative features supervised by the soft locality preserving loss

Fig. 2. Representation of the FVs of happiness (HA) on the CK+ database, after SLPM: (a) HA, i.e. high-intensity expression samples are applied to SLPM, (b) HA + low-intensity FVs with $\xi =0.9$, (c) HA + low-intensity FVs with $\xi =0.7$, (d) HA + generated FVs with $\theta _{ne}=0.9$, and (e) HA + generated FVs with $\theta _{ne}=0.7$.

Fig. 3. Subspace learned using SLPM, with the local descriptor "LP," on the CK+ dataset: (a) the mapped features extracted from high-intensity expression images and neutral-face images, (b) the mapped features extracted from high-intensity and low-intensity ($\xi =0.7$) images, and (c) the mapped features extracted from high-intensity and low-intensity ($\xi =\{0.9, 0.8, 0.7, 0.6, 0.5, 0.4\}$) images.

Fig. 4. Representation of the sample-generation process based on (a) FVs extracted from high-intensity images and neutral-face images, and (b) FVs extracted from high-intensity images.

Table 6. Comparison of the number of images for the different expression classes in the databases used in our experiments

Fig. 5. Recognition rates of the different subspace methods, with different local descriptors, on a combined dataset of BAUM-2, CK+, JAFFE, and TFEID.

Fig. 6. Recognition rates of our proposed method in terms of different dimensions.

Table 7. Comparison (%) of recognition rates obtained by using low-intensity images with different $\xi$ values on the CK+ database, using the LPQ feature

Table 8. Comparison (%) of subspace-learning methods on different datasets, with the LPQ descriptor used with the nearest-neighbor classifier

Table 9. Comparison (%) of subspace-learning methods on different datasets, with the LPQ descriptor used with the SVM classifier

Table 10. Comparison of the runtimes (in ms) required by the different subspace-learning methods (MFA, SDM, and SLPM) on different datasets, with the LPQ descriptor used

Fig. 7. Visualization of the deeply learned features extracted from the samples of one testing fold in CK+, based on the different methods: (a) Softmax, (b) Center, (c) Island, (d) LP, and (e) SLP.

Table 11. Comparison in terms of the recognition rates (%) of the deep subspace-learning methods on different datasets

Table 12. Generalization test (%) on the different deep subspace regularizers

Fig. 8. Sensitivity analysis on the hyperparameters in the proposed SLP loss: (a) effect of $k$, (b) effect of $\lambda$, and (c) effect of $\beta$.