A Survey on Machine Learning Techniques in Movie Revenue Prediction

  • Review Article
SN Computer Science

Abstract

With the growing body of literature on movie revenue prediction using machine learning techniques in recent years, a systematic review will help strengthen the understanding of this research domain. This article therefore aims to determine the sources of data, the techniques, the features, and the evaluation metrics used in movie revenue prediction. We selected 36 relevant articles based on defined inclusion and exclusion criteria. The review found that US cinema attracted the highest number of publications, followed by Chinese, Korean, and Indian cinema, in that order. We also found that regression, classification, and clustering data mining approaches were used in the reviewed articles, with regression and classification carrying the largest share. Furthermore, we observed that cast, number of screens, and genre are the most widely used features in movie revenue prediction. We also identified multiple linear regression and support vector machines as the most commonly used prediction algorithms, while mean absolute percentage error, root-mean-square error, and average percentage hit rate are the most frequently used evaluation metrics. Our review also identified some problems and research directions in movie revenue prediction.
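
For readers unfamiliar with two of the evaluation metrics named in the abstract, the following is a minimal sketch of mean absolute percentage error (MAPE) and root-mean-square error (RMSE) using their standard textbook definitions. The function names and the revenue figures are hypothetical and are not taken from the reviewed articles.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero (the ratio would be undefined).
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rmse(actual, predicted):
    """Root-mean-square error, in the same unit as the inputs."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical revenues (e.g. in millions); illustrative values only.
actual = [100.0, 200.0, 50.0]
predicted = [110.0, 180.0, 50.0]

mape_value = mape(actual, predicted)
rmse_value = rmse(actual, predicted)
print(f"MAPE = {mape_value:.2f}%  RMSE = {rmse_value:.2f}")
```

MAPE is scale-free, which makes it convenient for comparing films with very different revenues, whereas RMSE penalises large absolute misses more heavily.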

Acknowledgements

This work was supported in part by the Ministry of Higher Education, Malaysia, under Grant FRGS/1/2017/ICT02/UKM/02/4.

Author information

Corresponding author

Correspondence to Ibrahim Said Ahmad.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix 1

See Tables 1, 2, 3, and 4.

About this article

Cite this article

Ahmad, I.S., Bakar, A.A., Yaakub, M.R. et al. A Survey on Machine Learning Techniques in Movie Revenue Prediction. SN COMPUT. SCI. 1, 235 (2020). https://doi.org/10.1007/s42979-020-00249-1

  • DOI: https://doi.org/10.1007/s42979-020-00249-1
