A multi-label movie genre classification scheme based on the movie’s subtitles

Multimedia Tools and Applications

Abstract

Predicting movie genres is an intriguing problem with several applications, including designing recommendation systems for audiences, analyzing box office performance, and understanding a movie’s theme. It is a classic multi-label classification problem. This paper proposes an algorithm for movie genre detection built on a so far unused source, the movie’s subtitles, which are a documented account of the movie’s visual content and dialogue. The basic idea is to identify words that occur with high frequency in a particular genre and use them as features for training machine learning classifiers. The algorithm was tested on the English subtitles of 964 movies from six genres: Action, Fantasy, Horror, Romance, Sports and War. Experiments were conducted with varying numbers of features and six machine learning models. The best result was obtained with K-Nearest Neighbour (kNN), which achieved an average precision of 77.7% across all genres with 200 features. Another noteworthy result was an average precision of 75.2% using kNN with only 50 features. The algorithm performed very well for the Sports and War genres, with precision above 90% in some cases.
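The basic idea in the abstract (pick the words that are most frequent within each genre, use them as features, then classify with kNN) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the stopword list, the toy subtitle snippets, the function names (`select_features`, `vectorize`, `knn_predict`), the relative-frequency features, and the majority-vote rule for multi-label output are hypothetical. This preview does not give the paper's actual preprocessing, thresholds, or kNN settings, so this is not the authors' implementation.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; the paper's preprocessing is not described here.
STOPWORDS = {"the", "a", "and", "to", "in", "for", "after"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def select_features(docs, per_genre):
    """Collect the `per_genre` most frequent words of every genre into one vocabulary."""
    counts = {}
    for text, genres in docs:
        tokens = tokenize(text)
        for genre in genres:  # a multi-label movie contributes to each of its genres
            counts.setdefault(genre, Counter()).update(tokens)
    vocab = set()
    for counter in counts.values():
        vocab.update(word for word, _ in counter.most_common(per_genre))
    return sorted(vocab)

def vectorize(text, vocab):
    """Represent a subtitle text by the relative frequency of each vocabulary word."""
    tokens = tokenize(text)
    freq = Counter(tokens)
    total = max(len(tokens), 1)
    return [freq[w] / total for w in vocab]

def knn_predict(text, docs, vocab, k=3):
    """Return every genre carried by more than half of the k nearest training movies."""
    query = vectorize(text, vocab)
    ranked = sorted(docs, key=lambda d: math.dist(query, vectorize(d[0], vocab)))
    votes = Counter(g for _, genres in ranked[:k] for g in genres)
    return {g for g, n in votes.items() if 2 * n > k}

# Toy stand-ins for subtitle files, each labelled with one or more genres.
TRAIN = [
    ("soldiers fought the battle in the war", {"War"}),
    ("the war ended after the last battle", {"War"}),
    ("the coach said the team won the match", {"Sports"}),
    ("the coach trained the team for the match", {"Sports"}),
    ("soldiers played a match during the war", {"War", "Sports"}),
]

vocab = select_features(TRAIN, per_genre=3)
print(vocab)
print(knn_predict("the soldiers lost the battle but won the war", TRAIN, vocab))
print(knn_predict("the coach and the team won the match", TRAIN, vocab))
```

On this toy data the selected vocabulary is the six words battle, coach, match, soldiers, team and war, and the two queries are labelled War and Sports respectively. In practice one would more likely use a library such as scikit-learn for feature selection and kNN; the standard-library version here only keeps the sketch self-contained.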

Code Availability

The code can be shared if required.

Author information

Corresponding author

Correspondence to Bhavya Ahuja Grover.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Availability of data and material

All data has been taken from yifysubtitles.com.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Table 1 Results for ThreshLow = 100
Table 2 Results for ThreshLow = 500
Table 3 Results for ThreshLow = 500
Table 4 Results for ThreshLow = 500
Table 5 Results for ThreshLow = 1000

About this article

Cite this article

Rajput, N.K., Grover, B.A. A multi-label movie genre classification scheme based on the movie’s subtitles. Multimed Tools Appl 81, 32469–32490 (2022). https://doi.org/10.1007/s11042-022-12961-6

  • DOI: https://doi.org/10.1007/s11042-022-12961-6
