Abstract
We present Face2Statistics, a comprehensive roadmap to deliver user-friendly, low-cost and effective alternatives for extracting drivers’ statistics. Face2Statistics is motivated by the growing importance of multi-modal statistics for Human-Vehicle Interaction, for which existing extraction approaches are user-unfriendly, impractical and cost-ineffective. To this end, we leverage Face2Statistics to build a series of Deep-Neural-Network-driven predictors of multi-modal statistics that take only facial expressions as input. To address two outstanding issues of the current design, we leverage (1) the HSV color space and (2) Conditional Random Fields to improve the robustness of Face2Statistics in terms of prediction accuracy and degree of customization. Our evaluations show that Face2Statistics can be an effective alternative to sensors/monitors for Heart Rate, Skin Conductivity and Vehicle Speed. We also perform a breakdown analysis to justify the effectiveness of our optimizations. Both the source code and the trained models of Face2Statistics are available online at https://github.com/unnc-ucc/Face2Statistics.
Notes
- 1.
We use this functionality to investigate how we can enhance the robustness of Face2Statistics, as described in Sect. 4.
- 2.
- 3.
In Eq. 1, \(A_{\mu }\) is the association (observation-matching) potential, which models dependencies between the class label \(m_{\mu }\) and the set of all observations \(\mu \). \(x_{\mu }\) is the real-valued SVM response at the pixel (or node) \(\mu \). \(N_{\mu }\) is the neighborhood of pixel \(\mu \) (a subset of the full spatial coordinate system S from above). \(I_{\mu \nu }\) is the interaction (local-consistency) potential, which models dependencies between the labels of neighboring elements. Z is the partition function: a normalization coefficient obtained by summing over all possible labelings.
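Eq. 1 itself is not reproduced in these notes. As a rough illustration of how the terms above fit together, the sketch below assumes the standard Conditional Random Field form

\[ P(m \mid x) = \frac{1}{Z} \exp \Big( \sum_{\mu \in S} A_{\mu }(m_{\mu }, x) + \sum_{\mu \in S} \sum_{\nu \in N_{\mu }} I_{\mu \nu }(m_{\mu }, m_{\nu }, x) \Big) \]

which may differ in detail from the chapter's equation. The function name `crf_distribution`, its arguments, and the toy potentials are hypothetical and are not taken from the Face2Statistics code base. The potentials here are plain numbers rather than SVM responses, and Z is computed by brute-force enumeration, which is only feasible for a handful of nodes.

```python
import itertools
import math


def crf_distribution(assoc, inter, neighbors, labels):
    """Return {labeling: probability} for a tiny CRF (assumed standard form).

    assoc[mu][m]      : association potential A_mu for label m at node mu
    inter(m_mu, m_nu) : interaction potential I_{mu nu} for a pair of labels
    neighbors[mu]     : nodes in the neighborhood N_mu of node mu
    labels            : the possible class labels
    """
    n = len(assoc)
    unnormalized = {}
    for labeling in itertools.product(labels, repeat=n):
        # Sum of association potentials over all nodes.
        score = sum(assoc[mu][labeling[mu]] for mu in range(n))
        # Sum of interaction potentials over all (mu, nu) with nu in N_mu.
        # With symmetric neighborhoods each edge is counted twice,
        # matching the double sum in the assumed formula.
        score += sum(
            inter(labeling[mu], labeling[nu])
            for mu in range(n)
            for nu in neighbors[mu]
        )
        unnormalized[labeling] = math.exp(score)
    # Z: the partition function, a sum over every possible labeling.
    z = sum(unnormalized.values())
    return {lab: value / z for lab, value in unnormalized.items()}


# Toy example (hypothetical numbers): three nodes in a chain, two labels.
example_assoc = [[1.0, 0.0], [0.2, 0.0], [0.0, 1.0]]
example_neighbors = {0: [1], 1: [0, 2], 2: [1]}


def example_inter(a, b):
    # Reward neighboring nodes that share a label (local consistency).
    return 0.5 if a == b else 0.0


example_dist = crf_distribution(
    example_assoc, example_inter, example_neighbors, labels=(0, 1)
)
most_likely = max(example_dist, key=example_dist.get)
```

Dividing by Z is what turns the exponentiated scores into a probability distribution over labelings. Practical CRF implementations avoid this enumeration and rely on approximate inference instead.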
References
Abbas, Q., Alsheddy, A.: A methodological review on prediction of multi-stage hypovigilance detection systems using multimodal features. IEEE Access 9, 47530–47564 (2021). https://doi.org/10.1109/ACCESS.2021.3068343
Asada, H.H., Shaltis, P., Reisner, A., Rhee, S., Hutchinson, R.C.: Mobile monitoring with wearable photoplethysmographic biosensors. IEEE Eng. Med. Biol. Mag. 22(3), 28–40 (2003)
Berk, T., Brownston, L., Kaufman, A.: A new color-naming system for graphics languages. IEEE Ann. Hist. Comput. 2(03), 37–44 (1982)
Blignaut, P.J., Beelders, T.R.: Trackstick: a data quality measuring tool for tobii eye trackers. In: Morimoto, C.H., Istance, H.O., Spencer, S.N., Mulligan, J.B., Qvarfordt, P. (eds.) Proceedings of the 2012 Symposium on Eye-Tracking Research and Applications, ETRA 2012, Santa Barbara, CA, USA, 28–30 March 2012, pp. 293–296. ACM (2012). https://doi.org/10.1145/2168556.2168619
Bradski, G., Kaehler, A.: Learning OpenCV: Computer Vision with the OpenCV Library. O’Reilly Media, Inc., Sebastopol (2008)
Butakov, V.A., Ioannou, P.: Personalized driver/vehicle lane change models for ADAS. IEEE Trans. Veh. Technol. 64(10), 4422–4431 (2014)
Dao, D., et al.: A robust motion artifact detection algorithm for accurate detection of heart rates from photoplethysmographic signals using time-frequency spectral features. IEEE J. Biomed. Health Inform. 21(5), 1242–1253 (2016)
Duan, Y., Liu, J., Jin, W., Peng, X.: Characterizing differentially-private techniques in the era of internet-of-vehicles. Technical report-Feb-03 at User-Centric Computing Group, University of Nottingham Ningbo China (2022)
Erzin, E., Yemez, Y., Tekalp, A.M., Erçil, A., Erdogan, H., Abut, H.: Multimodal person recognition for human-vehicle interaction. IEEE Multimedia 13(2), 18–31 (2006)
Graves, A., Mohamed, A.R., Hinton, G.: Speech recognition with deep recurrent neural networks. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 6645–6649. IEEE (2013)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)
Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
Huang, Z., et al.: Face2multi-modal: in-vehicle multi-modal predictors via facial expressions. In: 12th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, pp. 30–33. AutomotiveUI 2020, Association for Computing Machinery, New York, NY, USA (2020). https://doi.org/10.1145/3409251.3411716
Jin, W., Duan, Y., Liu, J., Huang, S., Xiong, Z., Peng, X.: BROOK dataset: a playground for exploiting data-driven techniques in human-vehicle interactive designs. Technical report-Feb-01 at User-Centric Computing Group, University of Nottingham Ningbo China (2022)
Jin, W., Ming, X., Song, Z., Xiong, Z., Peng, X.: Towards emulating internet-of-vehicles on a single machine. In: AutomotiveUI 2021: 13th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Leeds, United Kingdom, 9–14 September 2021-Adjunct Proceedings, pp. 112–114. ACM (2021). https://doi.org/10.1145/3473682.3480275
Khodairy, M.A., Abosamra, G.: Driving behavior classification based on oversampled signals of smartphone embedded sensors using an optimized stacked-LSTM neural networks. IEEE Access 9, 4957–4972 (2021)
Kortmann, F., et al.: Creating value from in-vehicle data: detecting road surfaces and road hazards. In: 23rd IEEE International Conference on Intelligent Transportation Systems, ITSC 2020, Rhodes, Greece, 20–23 September 2020, pp. 1–6. IEEE (2020). https://doi.org/10.1109/ITSC45102.2020.9294684
Kosov, S., Shirahama, K., Grzegorzek, M.: Labeling of partially occluded regions via the multi-layer CRF. Multimed. Tools Appl. 78(2), 2551–2569 (2019)
Krizhevsky, A., Hinton, G.: Convolutional deep belief networks on cifar-10. Unpublished manuscript 40(7), 1–9 (2010)
Krizhevsky, A., Hinton, G., et al.: Learning multiple layers of features from tiny images (2009)
Liu, J., Jin, W., He, Z., Ming, X., Duan, Y., Xiong, Z., Peng, X.: HUT: enabling high-UTility, batched queries under differential privacy protection for internet-of-vehicles. Technical report-Feb-02 at User-Centric Computing Group, University of Nottingham Ningbo China (2022)
Martin, S., Tawari, A., Trivedi, M.M.: Balancing privacy and safety: protecting driver identity in naturalistic driving video data. In: Boyle, L.N., Burnett, G.E., Fröhlich, P., Iqbal, S.T., Miller, E., Wu, Y. (eds.) Proceedings of the 6th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Seattle, WA, USA, 17–19 September 2014, pp. 17:1–17:7. ACM (2014). https://doi.org/10.1145/2667317.2667325
Martin, S., Tawari, A., Trivedi, M.M.: Toward privacy-protecting safety systems for naturalistic driving videos. IEEE Trans. Intell. Transp. Syst. 15(4), 1811–1822 (2014)
Martinez, D.L., Rudovic, O., Picard, R.: Personalized automatic estimation of self-reported pain intensity from facial expressions. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 2318–2327. IEEE (2017)
Nishiuchi, H., Park, K., Hamada, S.: The relationship between driving behavior and the health condition of elderly drivers. Int. J. Intell. Transp. Syst. Res. 19(1), 264–272 (2021)
Omerustaoglu, F., Sakar, C.O., Kar, G.: Distracted driver detection by combining in-vehicle and image data using deep learning. Appl. Soft Comput. 96, 106657 (2020)
Peng, X., Huang, Z., Sun, X.: Building BROOK: a multi-modal and facial video database for human-vehicle interaction research, pp. 1–9 (2020). https://arxiv.org/abs/2005.08637
Porter, M.M., et al.: Older driver estimates of driving exposure compared to in-vehicle data in the candrive ii study. Traffic Inj. Prev. 16(1), 24–27 (2015)
Silva, N., et al.: Eye tracking support for visual analytics systems: foundations, current applications, and research challenges. In: Krejtz, K., Sharif, B. (eds.) Proceedings of the 11th ACM Symposium on Eye Tracking Research & Applications, ETRA 2019, Denver, CO, USA, 25–28 June 2019, pp. 11:1–11:10. ACM (2019). https://doi.org/10.1145/3314111.3319919
Song, Z., Wang, S., Kong, W., Peng, X., Sun, X.: First attempt to build realistic driving scenes using video-to-video synthesis in OpenDS framework. In: Adjunct Proceedings of the 11th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, AutomotiveUI 2019, Utrecht, The Netherlands, 21–25 September 2019, pp. 387–391. ACM (2019). https://doi.org/10.1145/3349263.3351497
Song, Z., Duan, Y., Jin, W., Huang, S., Wang, S., Peng, X.: Omniverse-OpenDS: enabling agile developments for complex driving scenarios via reconfigurable abstractions. In: International Conference on Human-Computer Interaction (2022)
Sun, X., et al.: Exploring personalised autonomous vehicles to influence user trust. Cogn. Comput. 12(6), 1170–1186 (2020)
Tamura, T., Maeda, Y., Sekine, M., Yoshida, M.: Wearable photoplethysmographic sensors-past and present. Electronics 3(2), 282–302 (2014)
Toledo, T., Lotan, T.: In-vehicle data recorder for evaluation of driving behavior and safety. Transp. Res. Rec. 1953(1), 112–119 (2006)
Toledo, T., Musicant, O., Lotan, T.: In-vehicle data recorders for monitoring and feedback on drivers’ behavior. Transp. Res. Part C Emerg. Technol. 16(3), 320–331 (2008)
Wallach, H.M.: Conditional random fields: an introduction. Technical reports (CIS), p. 22 (2004)
Wang, J., Xiong, Z., Duan, Y., Liu, J., Song, Z., Peng, X.: The importance distribution of drivers’ facial expressions varies over time!, pp. 148–151. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3473682.3480283
Wang, S., Liu, J., Sun, H., Ming, X., Jin, W., Song, Z., Peng, X.: Oneiros-OpenDS: an interactive and extensible toolkit for agile and automated developments of complicated driving scenes. In: International Conference on Human-Computer Interaction (2022)
Xing, Y., Lv, C., Cao, D., Lu, C.: Energy oriented driving behavior analysis and personalized prediction of vehicle states with joint time series modeling. Appl. Energy 261, 114471 (2020)
Zhang, Y., Jin, W., Xiong, Z., Li, Z., Liu, Y., Peng, X.: Demystifying interactions between driving behaviors and styles through self-clustering algorithms. In: Krömker, H. (ed.) International Conference on Human-Computer Interaction (2021). https://doi.org/10.1007/978-3-030-78358-7_23
Zheng, S., et al.: Conditional random fields as recurrent neural networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1529–1537 (2015)
Acknowledgements
We thank the anonymous reviewers of HCI’22 and AutomotiveUI’21 for their valuable feedback. We thank all members of the User-Centric Computing Group at the University of Nottingham Ningbo China for the stimulating environment. An earlier version of this work appears as [15].
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Xiong, Z. et al. (2022). Face2Statistics: User-Friendly, Low-Cost and Effective Alternative to In-vehicle Sensors/Monitors for Drivers. In: Krömker, H. (eds) HCI in Mobility, Transport, and Automotive Systems. HCII 2022. Lecture Notes in Computer Science, vol 13335. Springer, Cham. https://doi.org/10.1007/978-3-031-04987-3_20
DOI: https://doi.org/10.1007/978-3-031-04987-3_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-04986-6
Online ISBN: 978-3-031-04987-3
eBook Packages: Computer Science (R0)