Abstract
Motion trajectories provide rich spatio-temporal information about an object's activity. Trajectory information can be obtained by running a tracking algorithm on data streams from a range of devices, including motion sensors, video cameras, and haptic devices. Developing view-invariant activity recognition algorithms based on this high-dimensional cue is an extremely challenging task. This paper presents efficient activity recognition algorithms that use novel view-invariant representations of trajectories. To this end, we derive two affine-invariant representations for motion trajectories, based on curvature scale space (CSS) and the centroid distance function (CDF). The properties of these schemes facilitate the design of efficient recognition algorithms based on hidden Markov models (HMMs). In the CSS-based representation, maxima of curvature zero crossings at increasing levels of smoothness are extracted to mark the location and extent of concavities in the curvature. The sequences of these CSS maxima are then modeled by continuous density HMMs. For the CDF-based representation, we first segment the trajectory into subtrajectories using the CDF. These subtrajectories are then represented by their principal component analysis (PCA) coefficients, and the sequences of PCA coefficients are modeled by continuous density HMMs. Each class of object motion is modeled by one continuous HMM whose state probability density functions (PDFs) are represented by Gaussian mixture models (GMMs). Experiments are reported on a database of around 1750 complex trajectories (obtained from the UCI-KDD data archives) subdivided into five classes.
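As an illustration of the centroid distance function mentioned above, the following is a minimal sketch of the standard CDF from shape analysis: the distance of each trajectory sample from the trajectory centroid, optionally divided by its maximum. The function name `centroid_distance`, the `normalize` flag, and the sample points are assumptions made for this example and do not come from the paper. The paper's CDF-based segmentation and its affine-invariant formulation may well differ; this sketch only shows invariance to translation and uniform scaling.

```python
import math


def centroid_distance(points, normalize=True):
    """Return the distance of each 2-D trajectory sample from the centroid.

    points    -- sequence of (x, y) samples along the trajectory
    normalize -- hypothetical flag: divide by the largest distance so the
                 result does not change under uniform scaling
    """
    if not points:
        raise ValueError("trajectory must contain at least one point")

    n = len(points)
    # Centroid of the sampled trajectory.
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n

    # Euclidean distance of every sample from the centroid.
    # Subtracting the centroid removes any translation of the trajectory.
    dists = [math.hypot(x - cx, y - cy) for x, y in points]

    if normalize:
        peak = max(dists)
        if peak > 0:
            dists = [d / peak for d in dists]
    return dists


# Illustrative trajectory (made-up sample points, not data from the paper).
trajectory = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (1.0, 5.0)]

# The same trajectory translated by (10, -7) and scaled by a factor of 2.
moved = [(2.0 * x + 10.0, 2.0 * y - 7.0) for x, y in trajectory]

cdf_original = centroid_distance(trajectory)
cdf_moved = centroid_distance(moved)
```

The two resulting sequences agree up to floating-point error, which is the property that makes a function of this kind useful as a starting point for a view-independent description of a trajectory.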
Cite this article
Bashir, F.I., Khokhar, A.A. & Schonfeld, D. View-invariant motion trajectory-based activity classification and recognition. Multimedia Systems 12, 45–54 (2006). https://doi.org/10.1007/s00530-006-0024-2
DOI: https://doi.org/10.1007/s00530-006-0024-2