Capturing AU-Aware Facial Features and Their Latent Relations for Emotion Recognition in the Wild

ABSTRACT
The Emotion Recognition in the Wild (EmotiW) Challenge has been held for three years. Previous winning teams primarily focused on designing specific deep neural networks or on fusing diverse hand-crafted and deep convolutional features, but they all neglected to explore the latent relations among the changing features resulting from facial muscle motions. In this paper, we study this recognition challenge from the perspective of explicitly analyzing the relations among expression-specific facial features. Our method has three key components. First, we propose a pair-wise learning strategy that automatically seeks a set of facial image patches which are important for discriminating between two particular emotion categories. We find that these learnt local patches are in part consistent with the locations of expression-specific Action Units (AUs); hence the features extracted from such patches are termed AU-aware facial features. Second, in each pair-wise task, we use an undirected graph structure, which takes the learnt facial patches as individual vertices, to encode the feature relations between any two learnt patches. Finally, a robust emotion representation is constructed by sequentially concatenating all task-specific graph-structured facial feature relations. Extensive experiments on the EmotiW 2015 Challenge demonstrate the efficacy of the proposed approach. Without using additional data, our final submissions achieved competitive results on both sub-challenges: on image-based static facial expression recognition we obtained 55.38% recognition accuracy, outperforming the 39.13% baseline by 16.25%; on audio-video based emotion recognition we obtained 53.80% recognition accuracy, outperforming the 39.33% baseline and the 2014 winning team's final result of 50.37% by 14.47% and 3.43%, respectively.
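To make the second and third components concrete, the sketch below illustrates one plausible reading of the graph-structured relation encoding: each pairwise emotion-discrimination task contributes a fully connected undirected graph over its learnt patches, with one relation descriptor per edge, and the final representation concatenates all task-specific encodings. The relation measure (elementwise absolute difference), the feature dimensions, and the function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def encode_patch_relations(patch_features):
    """Encode relations between every pair of learnt facial patches.

    patch_features: (P, D) array -- one D-dim feature vector per patch;
                    patches act as the vertices of an undirected graph.
    Returns a 1-D vector concatenating one relation descriptor per
    undirected edge (P choose 2 edges in total).
    """
    P, _ = patch_features.shape
    relations = []
    for i in range(P):
        for j in range(i + 1, P):  # visit each undirected edge (i, j) once
            # assumed relation measure: elementwise absolute difference
            relations.append(np.abs(patch_features[i] - patch_features[j]))
    return np.concatenate(relations)

def build_emotion_representation(task_patch_features):
    """Concatenate the graph-structured relations of all pairwise tasks.

    task_patch_features: list of (P_t, D) arrays, one per pairwise
    emotion-discrimination task (e.g. angry-vs-happy), each holding the
    features of the patches learnt for that task.
    """
    return np.concatenate([encode_patch_relations(f)
                           for f in task_patch_features])

# Toy example: 2 pairwise tasks, each with 4 patches of 8-dim features.
rng = np.random.default_rng(0)
feats = [rng.standard_normal((4, 8)) for _ in range(2)]
rep = build_emotion_representation(feats)
print(rep.shape)  # 2 tasks * C(4,2)=6 edges * 8 dims -> (96,)
```

The resulting vector could then be fed to a standard classifier (the paper's experiments use SVMs via LIBSVM); the quadratic growth in edges with patch count is why the pair-wise patch selection step matters.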