Abstract
Wearable cameras can gather large amounts of image data that provide rich visual information about the daily activities of the wearer. Motivated by the large number of health applications that could be enabled by the automatic recognition of daily activities, such as lifestyle characterization for habit improvement, context-aware personal assistance and tele-rehabilitation services, we propose a system to classify 21 daily activities from photo-streams acquired by a wearable photo-camera. Our approach combines the advantages of a late fusion ensemble strategy relying on convolutional neural networks at image level with the ability of recurrent neural networks to account for the temporal evolution of high-level features in photo-streams without relying on event boundaries. The proposed batch-based approach achieved an overall accuracy of 89.85%, outperforming state-of-the-art end-to-end methodologies. These results were obtained on a dataset consisting of 44,902 egocentric pictures captured by three persons over 26 days on average.
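The pipeline described above can be illustrated with a minimal numpy sketch. This is not the authors' actual networks: the two probability matrices stand in for the per-frame softmax outputs of the image-level CNNs in the ensemble, `late_fusion` averages them, and `temporal_smooth` is a simplified stand-in for the recurrent layer that propagates evidence along the photo-stream batch. All function names and parameters here are illustrative assumptions.

```python
import numpy as np

def late_fusion(probs_a, probs_b, w=0.5):
    """Late fusion: combine per-frame class probabilities from two
    image-level classifiers by weighted averaging, then renormalize."""
    fused = w * probs_a + (1.0 - w) * probs_b
    return fused / fused.sum(axis=1, keepdims=True)

def temporal_smooth(probs, alpha=0.7):
    """Simplified stand-in for the recurrent layer: exponentially smooth
    the fused probabilities along the photo-stream, so each prediction
    depends on the temporal evolution of the batch rather than on a
    single frame or on precomputed event boundaries."""
    out = np.empty_like(probs)
    state = probs[0]
    for t in range(len(probs)):
        state = alpha * state + (1.0 - alpha) * probs[t]
        out[t] = state
    return out

# Toy batch: 10 frames, 21 activity classes (as in the paper).
rng = np.random.default_rng(0)
n_frames, n_classes = 10, 21
probs_a = rng.dirichlet(np.ones(n_classes), size=n_frames)  # CNN 1 softmax
probs_b = rng.dirichlet(np.ones(n_classes), size=n_frames)  # CNN 2 softmax

fused = late_fusion(probs_a, probs_b)
smoothed = temporal_smooth(fused)
labels = smoothed.argmax(axis=1)  # one activity label per frame
```

In the actual system the smoothing step is a learned recurrent network over high-level CNN features rather than a fixed exponential filter, but the data flow (per-image classification, late fusion, temporal integration over a batch) is the same.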
Notes
The annotations are publicly available at https://www.github.com/gorayni/egocentric_photostreams.
Acknowledgements
A.C. was supported by a doctoral fellowship from the Mexican Council of Science and Technology (CONACYT) (Grant No. 366596). This work was partially funded by TIN2015-66951-C2, SGR 1219, CERCA, ICREA Academia'2014 and 20141510 (Marató TV3). The funders had no role in the study design, data collection, analysis, or preparation of the manuscript. M.D. is grateful to the NVIDIA donation program for its support with a GPU card.
Cite this article
Cartas, A., Marín, J., Radeva, P. et al. Batch-based activity recognition from egocentric photo-streams revisited. Pattern Anal Applic 21, 953–965 (2018). https://doi.org/10.1007/s10044-018-0708-1