Dual integrated convolutional neural network for real-time facial expression recognition in the wild

  • Original article
  • Journal: The Visual Computer

Abstract

Automatic recognition of facial expressions in the wild is a challenging problem that has drawn considerable attention from the computer vision and pattern recognition community. Since their emergence, deep learning techniques have proved effective in facial expression recognition (FER) tasks. However, these techniques are parameter-intensive and therefore cannot readily be deployed on resource-constrained embedded platforms for real-world applications. To mitigate this limitation of deep learning-based FER systems, we present an efficient dual integrated convolutional neural network (DICNN) model for real-time recognition of facial expressions in the wild on an embedded platform. With just 1.08M parameters and a storage size of 5.40 MB, the DICNN model maintains a balance between recognition accuracy and computational efficiency. We evaluated the DICNN model on four FER benchmark datasets (FER2013, FERPlus, RAF-DB, and CKPlus) using four performance metrics: recognition accuracy, precision, recall, and F1-score. Finally, to provide a portable solution with high-throughput inference, we optimized the DICNN model using the TensorRT SDK and deployed it on an Nvidia Xavier embedded platform. A comparative analysis against other state-of-the-art methods showed the effectiveness of the designed FER system, which achieved competitive accuracy with a multi-fold improvement in execution speed.
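
The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `confusion_matrix` and `fer_metrics` are hypothetical, and macro-averaging over expression classes is an assumption, since the abstract does not say how the per-class precision, recall, and F1-score are aggregated.

```python
from typing import Dict, List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count predictions; rows index the true class, columns the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def fer_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                num_classes: int) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1-score.

    Macro-averaging is an assumption made for this sketch; the source
    abstract does not specify the averaging scheme.
    """
    cm = confusion_matrix(y_true, y_pred, num_classes)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(num_classes))

    precisions, recalls, f1s = [], [], []
    for k in range(num_classes):
        tp = cm[k][k]
        predicted_k = sum(cm[r][k] for r in range(num_classes))  # column sum
        actual_k = sum(cm[k])                                    # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / num_classes,
        "recall": sum(recalls) / num_classes,
        "f1": sum(f1s) / num_classes,
    }


if __name__ == "__main__":
    # Toy labels for three expression classes (illustrative data only).
    print(fer_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3))
```

On class-imbalanced datasets, macro-averaged scores weight every expression class equally, so they can differ noticeably from overall accuracy, which is one reason to report all four metrics together.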

Acknowledgements

The authors would like to thank the director, CSIR-CEERI, Pilani, for supporting the research activities at CSIR-CEERI, Pilani.

Author information

Corresponding author

Correspondence to Sumeet Saurav.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Cite this article

Saurav, S., Gidde, P., Saini, R. et al. Dual integrated convolutional neural network for real-time facial expression recognition in the wild. Vis Comput 38, 1083–1096 (2022). https://doi.org/10.1007/s00371-021-02069-7

  • DOI: https://doi.org/10.1007/s00371-021-02069-7
