On Recognizing Actions in Still Images via Multiple Features

Sener, Fadime; Bas, Cagdas; Ikizler-Cinbis, Nazli

doi:10.1007/978-3-642-33885-4_27

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7585)

Included in the following conference series:

European Conference on Computer Vision

Abstract

We propose a multi-cue approach for recognizing human actions in still images, in which relevant object regions are discovered and used in a weakly supervised manner. Our approach does not require any explicitly trained object detector or part/attribute annotation. Instead, multiple instance learning is applied over sets of object hypotheses in order to represent the objects relevant to each action. We test our method on the extensive Stanford 40 Actions dataset [1] and achieve a significant performance gain over the state of the art. Our results show that using multiple object hypotheses within multiple instance learning is effective for human action recognition in still images, and that such an object representation is suitable for use in conjunction with other visual features.
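
The abstract does not say which multiple instance learning formulation is used, so the following is only a minimal sketch of the general idea, assuming a MILES-style bag embedding in the spirit of Chen et al. from the reference list. Each image is treated as a bag of object-hypothesis features, and each bag is mapped to a vector of maximum similarities to candidate prototype instances. The 2-D toy features, the names `embed_bag` and `make_bag`, the `sigma` parameter, and the single-feature threshold rule (a stand-in for a sparse linear classifier) are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def embed_bag(bag, prototypes, sigma=1.0):
    """Map a bag (n_instances, d) to a vector with one entry per prototype.

    Each entry is the largest Gaussian similarity between the prototype
    and any instance in the bag.
    """
    # Squared distances, shape (n_instances, n_prototypes).
    d2 = ((bag[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma ** 2).max(axis=0)

# Toy data (hypothetical): every bag holds 5 "object hypotheses" in 2-D.
# Positive bags contain one instance near `relevant`; the rest is background.
rng = np.random.default_rng(0)
relevant = np.array([3.0, 3.0])

def make_bag(positive, n=5):
    bag = rng.normal(0.0, 0.5, size=(n, 2))
    if positive:
        bag[0] = relevant + rng.normal(0.0, 0.1, size=2)
    return bag

labels = np.array([1 if i % 2 == 0 else -1 for i in range(20)])
bags = [make_bag(label == 1) for label in labels]

# Every training instance acts as a candidate prototype.
prototypes = np.vstack(bags)
X = np.stack([embed_bag(b, prototypes) for b in bags])

# Stand-in for a sparse linear classifier: keep the single prototype whose
# embedded feature differs most between positive and negative bags.
score = X[labels == 1].mean(axis=0) - X[labels == -1].mean(axis=0)
best = int(np.argmax(score))
threshold = 0.5 * (X[labels == 1, best].mean() + X[labels == -1, best].mean())
pred = np.where(X[:, best] > threshold, 1, -1)

print("selected prototype:", prototypes[best])
print("training accuracy:", (pred == labels).mean())
```

The point of the embedding is that only bag-level labels are needed: the prototype that ends up selected plays the role of the action-relevant object region, which is discovered without any instance-level annotation.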

References

1. Yao, B., Jiang, X., Khosla, A., Lin, A.L., Guibas, L.J., Fei-Fei, L.: Human action recognition by learning bases of action attributes and parts. In: International Conference on Computer Vision (ICCV), Barcelona, Spain (November 2011)
2. Gupta, A., Kembhavi, A., Davis, L.S.: Observing human-object interactions: Using spatial and functional compatibility for recognition. IEEE TPAMI 31, 1775–1789 (2009)
3. Yao, B., Fei-Fei, L.: Modeling mutual context of object and human pose in human-object interaction activities. In: CVPR, San Francisco, CA (June 2010)
4. Prest, A., Schmid, C., Ferrari, V.: Weakly supervised learning of interactions between humans and objects. IEEE TPAMI 34, 601–614 (2012)
5. Alexe, B., Deselaers, T., Ferrari, V.: What is an object? In: CVPR, San Francisco, USA (2010)
6. Poppe, R.: A survey on vision-based human action recognition. Image and Vision Computing 28, 976–990 (2010)
7. Weinland, D., Ronfard, R., Boyer, E.: A survey of vision-based methods for action representation, segmentation and recognition. CVIU 115, 224–241 (2011)
8. Laptev, I., Marszalek, M., Schmid, C., Rozenfeld, B.: Learning realistic human actions from movies. In: CVPR (2008)
9. Wang, Y., Jiang, H., Drew, M.S., Li, Z.N., Mori, G.: Unsupervised discovery of action classes. In: CVPR (2006)
10. Thurau, C., Hlavac, V.: Pose primitive based human action recognition in videos or still images. In: CVPR (2008)
11. Ikizler-Cinbis, N., Cinbis, R.G., Sclaroff, S.: Learning actions from the web. In: ICCV (2009)
12. Yao, B., Fei-Fei, L.: Grouplet: a structured image representation for recognizing human and object interactions. In: CVPR, San Francisco, CA (June 2010)
13. Desai, C., Ramanan, D., Fowlkes, C.: Discriminative models for static human-object interactions. In: Workshop on Structured Models in Computer Vision (2010)
14. Delaitre, V., Sivic, J., Laptev, I.: Learning person-object interactions for action recognition in still images. In: NIPS (2011)
15. Delaitre, V., Laptev, I., Sivic, J.: Recognizing human actions in still images: a study of bag-of-features and part-based representations. In: BMVC (2010)
16. Yao, B., Khosla, A., Fei-Fei, L.: Combining randomization and discrimination for fine-grained image categorization. In: CVPR, Colorado Springs, USA (June 2011)
17. Chen, Y., Bi, J., Wang, J.Z.: MILES: Multiple-instance learning via embedded instance selection. IEEE TPAMI 28, 1931–1947 (2006)
18. Patron-Perez, A., Marszalek, M., Reid, I., Zisserman, A.: High five: Recognising human interactions in TV shows. In: BMVC (2010)
19. Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: CVPR (2001)
20. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60, 91–110 (2004)
21. Bourdev, L., Malik, J.: Poselets: Body part detectors trained using 3D human pose annotations. In: ICCV (2009)

Author information

Authors and Affiliations

Computer Engineering Department, Bilkent University, Ankara, Turkey
Fadime Sener
Computer Engineering Department, Hacettepe University, Ankara, Turkey
Cagdas Bas & Nazli Ikizler-Cinbis

Editor information

Editors and Affiliations

Dipartimento di Ingegneria Elettrica, Gestionale e Meccanica (DIEGM), Università degli Studi di Udine, Via delle Scienze, 208, 33100, Udine, Italy
Andrea Fusiello
IIT Istituto Italiano di Tecnologia, Via Morego 30, 16163, Genoa, Italy
Vittorio Murino
Dipartimento di Ingegneria dell’Informazione, Università degli Studi di Modena e Reggio Emilia, Strada Vignolese, 905, 41125, Modena, Italy
Rita Cucchiara

About this paper

Cite this paper

Sener, F., Bas, C., Ikizler-Cinbis, N. (2012). On Recognizing Actions in Still Images via Multiple Features. In: Fusiello, A., Murino, V., Cucchiara, R. (eds) Computer Vision – ECCV 2012. Workshops and Demonstrations. ECCV 2012. Lecture Notes in Computer Science, vol 7585. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33885-4_27

DOI: https://doi.org/10.1007/978-3-642-33885-4_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33884-7
Online ISBN: 978-3-642-33885-4
eBook Packages: Computer Science, Computer Science (R0)
