Abstract
With the growing body of literature on movie revenue prediction using machine learning techniques in recent years, a systematic review will help strengthen the understanding of this research domain. This article therefore aims to determine the sources of data, the techniques, the features, and the evaluation metrics used in movie revenue prediction. We selected 36 relevant articles based on defined inclusion and exclusion criteria. The review found that US cinema attracted the highest number of publications, followed by Chinese, Korean, and Indian cinema, in that order. We also found that regression, classification, and clustering data mining approaches were used in the reviewed articles, with regression and classification accounting for the largest share. Furthermore, we observed that cast, number of screens, and genre are the most widely used features in movie revenue prediction. We also identified multiple linear regression and support vector machines as the most commonly used prediction algorithms, while mean absolute percentage error, root-mean-square error, and average percentage hit rate are the most frequently used evaluation metrics. Finally, our review identified some open problems and research directions in movie revenue prediction.
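The three evaluation metrics named above can be illustrated with a minimal Python sketch. The function names, the sample figures, and the `tolerance` parameter are assumptions made for illustration and are not taken from the reviewed articles. The hit rate is interpreted here as the percentage of movies whose predicted revenue class matches the actual class, optionally allowing a miss of a few adjacent classes; the reviewed studies may define it differently.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero (the ratio is undefined otherwise).
    """
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root-mean-square error, in the same unit as the revenue figures."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def hit_rate(actual_classes, predicted_classes, tolerance=0):
    """Percentage of predictions within `tolerance` classes of the true class.

    Assumes revenue classes are encoded as ordered integers (hypothetical encoding).
    """
    hits = sum(abs(a - p) <= tolerance for a, p in zip(actual_classes, predicted_classes))
    return 100.0 * hits / len(actual_classes)


# Hypothetical revenues (e.g. in millions) for three movies.
actual_revenue = [100.0, 200.0, 50.0]
predicted_revenue = [110.0, 180.0, 50.0]

print(f"MAPE: {mape(actual_revenue, predicted_revenue):.2f}%")
print(f"RMSE: {rmse(actual_revenue, predicted_revenue):.2f}")

# Hypothetical ordered revenue classes for four movies.
actual_class = [1, 2, 3, 4]
predicted_class = [1, 3, 3, 2]
print(f"Exact hit rate: {hit_rate(actual_class, predicted_class):.1f}%")
print(f"Within-one-class hit rate: {hit_rate(actual_class, predicted_class, tolerance=1):.1f}%")
```

MAPE and RMSE apply when revenue is predicted as a number (regression), whereas a hit rate applies when revenue is binned into classes (classification), which is consistent with the two approaches that dominate the reviewed articles.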
Acknowledgements
This work was supported in part by the Ministry of Higher Education, Malaysia, under Grant FRGS/1/2017/ICT02/UKM/02/4.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
About this article
Cite this article
Ahmad, I.S., Bakar, A.A., Yaakub, M.R. et al. A Survey on Machine Learning Techniques in Movie Revenue Prediction. SN COMPUT. SCI. 1, 235 (2020). https://doi.org/10.1007/s42979-020-00249-1
DOI: https://doi.org/10.1007/s42979-020-00249-1