Volume: 33 | Article ID: art00003
ATTENTION-BASED LSTM NETWORK FOR ACTION RECOGNITION IN SPORTS
DOI: 10.2352/ISSN.2470-1173.2021.6.IRIACV-302 | Published Online: January 2021
Abstract

Understanding human action from visual data is an important computer vision task for video surveillance, sports player performance analysis, and many IoT applications. Traditional approaches to action recognition relied on hand-crafted visual and temporal features to classify specific actions. In this paper, we follow the standard deep learning framework for action recognition but introduce channel and spatial attention modules sequentially into the network. In a nutshell, our network consists of four main components. First, the input frames are fed to a pre-trained CNN that extracts visual features, and these features are passed through the attention module. The transformed feature maps are given to a bi-directional LSTM network that exploits the temporal dependency among frames for the underlying action in the scene. The output of the bi-directional LSTM is fed to a fully connected layer with a softmax classifier that assigns probabilities to the actions of the subject in the scene. In addition to the cross-entropy loss, a marginal loss function is used that penalizes the network for similarity between inter-action classes and complements it for intra-action variations. The network is trained and validated on a tennis dataset covering six tennis player actions in total. The network is evaluated on standard performance metrics (precision and recall), and promising results are achieved.
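
For illustration, the sketch below shows one plausible way to wire the pipeline the abstract describes: pre-trained CNN features, a channel-then-spatial attention block, a bi-directional LSTM, and a softmax classifier over six actions. It is a minimal PyTorch sketch under assumed choices (a ResNet-50 backbone, CBAM-style attention, hidden size 256, 16-frame clips); these do not reflect the authors' exact configuration, and the marginal loss term is omitted.

# Hedged sketch of the described pipeline: CNN features -> channel + spatial
# attention -> bidirectional LSTM -> softmax classifier. Backbone, layer sizes,
# and clip length are illustrative assumptions, not the authors' settings.
import torch
import torch.nn as nn
import torchvision.models as models


class ChannelSpatialAttention(nn.Module):
    """Channel attention followed by spatial attention (CBAM-style)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        # Channel attention: squeeze spatial dims, re-weight channels
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: 7x7 conv over pooled channel maps
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):                        # x: (B, C, H, W)
        b, c, _, _ = x.shape
        avg = x.mean(dim=(2, 3))                 # (B, C)
        mx = x.amax(dim=(2, 3))                  # (B, C)
        ch = torch.sigmoid(self.channel_mlp(avg) + self.channel_mlp(mx))
        x = x * ch.view(b, c, 1, 1)              # channel-refined features
        avg_map = x.mean(dim=1, keepdim=True)    # (B, 1, H, W)
        max_map = x.amax(dim=1, keepdim=True)    # (B, 1, H, W)
        sp = torch.sigmoid(self.spatial_conv(torch.cat([avg_map, max_map], dim=1)))
        return x * sp                            # spatially-refined features


class AttentionBiLSTM(nn.Module):
    """Pre-trained CNN -> attention -> BiLSTM -> softmax over action classes."""
    def __init__(self, num_classes=6, hidden=256):
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])  # keep conv maps
        self.attention = ChannelSpatialAttention(2048)
        self.lstm = nn.LSTM(2048, hidden, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, clips):                    # clips: (B, T, 3, H, W)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1))    # (B*T, 2048, h, w)
        feats = self.attention(feats).mean(dim=(2, 3))  # global pool -> (B*T, 2048)
        out, _ = self.lstm(feats.view(b, t, -1)) # (B, T, 2*hidden)
        return self.classifier(out[:, -1])       # logits for the six actions


# Usage (hypothetical shapes): logits = AttentionBiLSTM()(torch.randn(2, 16, 3, 224, 224))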

Views 137
Downloads 26
  Cite this article 

Mohib Ullah, Muhammad Mudassar Yamin, Ahmed Mohammed, Sultan Daud Khan, Habib Ullah, Faouzi Alaya Cheikh, "ATTENTION-BASED LSTM NETWORK FOR ACTION RECOGNITION IN SPORTS," in Proc. IS&T Int'l. Symp. on Electronic Imaging: Intelligent Robotics and Industrial Applications using Computer Vision, 2021, pp. 302-1 - 302-6, https://doi.org/10.2352/ISSN.2470-1173.2021.6.IRIACV-302

  Copyright statement 
Copyright © Society for Imaging Science and Technology 2021
Electronic Imaging
ISSN 2470-1173
Society for Imaging Science and Technology
IS&T, 7003 Kilworth Lane, Springfield, VA 22151 USA