
Batch-based activity recognition from egocentric photo-streams revisited

  • Original Article
  • Published in: Pattern Analysis and Applications

Abstract

Wearable cameras can gather large amounts of image data that provide rich visual information about the daily activities of the wearer. Motivated by the many health applications that automatic recognition of daily activities could enable, such as lifestyle characterization for habit improvement, context-aware personal assistance, and tele-rehabilitation services, we propose a system that classifies 21 daily activities from photo-streams acquired by a wearable photo-camera. Our approach combines the advantages of a late-fusion ensemble strategy relying on convolutional neural networks at the image level with the ability of recurrent neural networks to model the temporal evolution of high-level features in photo-streams without relying on event boundaries. The proposed batch-based approach achieved an overall accuracy of 89.85%, outperforming state-of-the-art end-to-end methodologies. These results were obtained on a dataset consisting of 44,902 egocentric pictures captured from three persons over 26 days on average.
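The pipeline the abstract describes, per-image class scores from CNN branches fused late, then a recurrent pass over a fixed-size batch of consecutive frames, can be sketched in miniature. This is an illustrative NumPy sketch, not the authors' architecture: the batch length, feature choice (fused posteriors as the recurrent input), and the toy tanh recurrent unit standing in for an LSTM are all assumptions.

```python
import numpy as np

NUM_CLASSES = 21   # daily activity categories (from the paper)
BATCH_LEN = 10     # frames per temporal batch (assumed)
HIDDEN = 16        # recurrent state size (assumed)

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def late_fusion(logits_a, logits_b):
    """Late-fusion ensemble: average the per-image class
    posteriors of two CNN branches."""
    return 0.5 * (softmax(logits_a) + softmax(logits_b))

# Toy recurrent unit over the batch (stand-in for an LSTM).
Wx = rng.normal(scale=0.1, size=(NUM_CLASSES, HIDDEN))
Wh = rng.normal(scale=0.1, size=(HIDDEN, HIDDEN))
Wo = rng.normal(scale=0.1, size=(HIDDEN, NUM_CLASSES))

def recurrent_batch(per_frame_feats):
    """Run a simple tanh RNN over a (BATCH_LEN, feat) batch of
    consecutive frames; no event boundaries are needed, the state
    simply carries across the batch."""
    h = np.zeros(HIDDEN)
    out = []
    for x in per_frame_feats:
        h = np.tanh(x @ Wx + h @ Wh)
        out.append(softmax(h @ Wo))
    return np.stack(out)

# Fake per-frame CNN logits from two branches for one batch of photos.
logits_a = rng.normal(size=(BATCH_LEN, NUM_CLASSES))
logits_b = rng.normal(size=(BATCH_LEN, NUM_CLASSES))
fused = late_fusion(logits_a, logits_b)
probs = recurrent_batch(fused)
print(probs.shape)  # (10, 21): one activity distribution per frame
```

The point of the batch-based formulation is visible in `recurrent_batch`: temporal context is injected through the hidden state over a sliding window of frames, so the stream never has to be pre-segmented into events.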


Notes

  1. The annotations are publicly available at https://www.github.com/gorayni/egocentric_photostreams.



Acknowledgements

A.C. was supported by a doctoral fellowship from the Mexican Council of Science and Technology (CONACYT) (Grant No. 366596). This work was partially funded by TIN2015-66951-C2, SGR 1219, CERCA, ICREA Academia’2014 and 20141510 (Marató TV3). The funders had no role in the study design, data collection, analysis, or preparation of the manuscript. M.D. is grateful to the NVIDIA donation program for its support with a GPU card.


Corresponding author

Correspondence to Alejandro Cartas.


Cite this article

Cartas, A., Marín, J., Radeva, P. et al. Batch-based activity recognition from egocentric photo-streams revisited. Pattern Anal Applic 21, 953–965 (2018). https://doi.org/10.1007/s10044-018-0708-1

