
Eratosthenes sieve based key-frame extraction technique for event summarization in videos

Published in Multimedia Tools and Applications

Abstract

The rapid growth of video data demands video summarization methods that are both effective and efficient, so that users can quickly browse and comprehend large amounts of video content. Managing access to video content in real time, where enormous volumes of audiovisual data are recorded every second, is a daunting task. In this paper, we propose an Eratosthenes Sieve based key-frame extraction approach for video summarization (VS) that is well suited to real-time applications. The Eratosthenes Sieve is used to partition the N frames of a video into a set of prime-numbered frames and a set of non-prime-numbered frames. A k-means clustering procedure is then applied to these sets to extract the key-frames quickly. The main challenge is to find the optimal set of clusters, which is addressed using the Davies-Bouldin Index (DBI). DBI is a cluster-validation technique that gives users of this free-parameter-based VS approach the flexibility to choose the desired number of key-frames without incurring additional computational cost. Moreover, the proposed approach accommodates videos captured from both local and global perspectives. The method substantially improves the performance of the clustering procedure through the use of the Eratosthenes Sieve. Qualitative and quantitative evaluations, along with a complexity analysis, are carried out to compare the proposed model with state-of-the-art models. Experimental results on two benchmark datasets containing various types of videos show that the proposed method outperforms state-of-the-art models in terms of F-measure.
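The pipeline described in the abstract can be sketched in code. The following is a minimal illustration, not the authors' implementation: it assumes per-frame feature vectors (for example colour histograms), a candidate range of cluster counts, and scikit-learn's KMeans and davies_bouldin_score; the helper names (sieve_of_eratosthenes, extract_keyframes) are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score


def sieve_of_eratosthenes(n):
    """Return all prime numbers <= n via the Sieve of Eratosthenes."""
    is_prime = np.ones(n + 1, dtype=bool)
    is_prime[:2] = False                      # 0 and 1 are not prime
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            is_prime[p * p::p] = False        # strike out multiples of p
    return np.flatnonzero(is_prime)


def extract_keyframes(features, k_range=range(2, 16)):
    """Cluster prime- and non-prime-indexed frames separately; return key-frame indices.

    `features` is an (N, d) NumPy array of per-frame descriptors (an assumption,
    e.g. colour histograms). For each set, the number of clusters is the one
    minimising the Davies-Bouldin Index over `k_range`.
    """
    n = len(features)
    prime_idx = sieve_of_eratosthenes(n - 1)               # prime frame numbers
    nonprime_idx = np.setdiff1d(np.arange(n), prime_idx)   # everything else

    keyframes = []
    for idx in (prime_idx, nonprime_idx):
        subset = features[idx]
        best = None                                        # (dbi, labels, centers, k)
        for k in k_range:
            if k >= len(subset):
                break
            km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(subset)
            dbi = davies_bouldin_score(subset, km.labels_)
            if best is None or dbi < best[0]:
                best = (dbi, km.labels_, km.cluster_centers_, k)
        _, labels, centers, k = best
        # The frame closest to each cluster centre is kept as a key-frame.
        for c in range(k):
            members = idx[labels == c]
            dists = np.linalg.norm(features[members] - centers[c], axis=1)
            keyframes.append(int(members[np.argmin(dists)]))

    return sorted(set(keyframes))
```

A real system would precede this with frame decoding and feature extraction; those steps are not specified in the abstract and are left out of the sketch.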




Notes

  1. http://www.open-video.org.

  2. https://sites.google.com/site/vsummsite/download.

Author information

Corresponding author

Correspondence to Krishan Kumar.

About this article

Cite this article

Kumar, K., Shrimankar, D.D. & Singh, N. Eratosthenes sieve based key-frame extraction technique for event summarization in videos. Multimed Tools Appl 77, 7383–7404 (2018). https://doi.org/10.1007/s11042-017-4642-9
