Abstract
The emergence of MOOCs (Massive Open Online Courses) has made available large amounts of data about students’ interactions with online educational platforms. These data make it possible to predict students’ future learning outcomes from their interactions. Predicting certificate accomplishment enables the early detection of students at risk, so that interventions can be performed before it is too late. This study applies different machine learning techniques to predict, at different timeframes, which students will obtain a certificate. The purpose is to analyze how the quality metrics change as the models have more data available. Of the four machine learning techniques applied, we finally choose a boosted trees model, which provides stable predictions over the weeks with good quality metrics. We determine the variables that are most important for the prediction and how they change over the weeks of the course.
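The week-by-week evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the two variables (cumulative video and problem counts) and all function names are hypothetical, and a small least-squares gradient boosting over decision stumps stands in for the boosted trees model. The sketch retrains the model with the data available up to each week, then reports the AUC on held-out students and a gain-based variable importance.

```python
import random

def make_students(n=300, weeks=4, seed=0):
    """Synthetic (hypothetical) data: weekly video and problem counts per student."""
    rng = random.Random(seed)
    students = []
    for _ in range(n):
        effort = rng.random()  # latent engagement, never shown to the model
        videos = [rng.gauss(10 * effort, 2) for _ in range(weeks)]
        problems = [rng.gauss(8 * effort, 2) for _ in range(weeks)]
        certified = 1 if effort + rng.gauss(0, 0.1) > 0.6 else 0
        students.append((videos, problems, certified))
    return students

def features_until(student, week):
    """Cumulative activity using only the data available up to `week`."""
    videos, problems, _ = student
    return [sum(videos[:week]), sum(problems[:week])]

def fit_stump(X, residuals):
    """Best one-split regression stump: (gain, feature, threshold, left, right)."""
    n, total = len(X), sum(residuals)
    best = (0.0, 0, float("inf"), total / n, total / n)
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue
            right_sum = total - left_sum
            # Reduction in squared error obtained by splitting between lo and hi
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
            if gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best

def fit_boosted_stumps(X, y, rounds=30, lr=0.3):
    """Least-squares gradient boosting; importance = share of total split gain."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps, importance = [], [0.0] * len(X[0])
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        gain, j, thr, left, right = fit_stump(X, residuals)
        stumps.append((j, thr, lr * left, lr * right))
        importance[j] += gain
        pred = [p + lr * (left if x[j] <= thr else right) for p, x in zip(pred, X)]
    total_gain = sum(importance) or 1.0
    return base, stumps, [g / total_gain for g in importance]

def predict(model, x):
    base, stumps, _ = model
    return base + sum(l if x[j] <= thr else r for j, thr, l, r in stumps)

def auc(scores, labels):
    """Probability that a certified student is scored above a non-certified one."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

students = make_students()
train, test = students[:200], students[200:]
results = {}
for week in range(1, 5):
    model = fit_boosted_stumps([features_until(s, week) for s in train],
                               [s[2] for s in train])
    scores = [predict(model, features_until(s, week)) for s in test]
    results[week] = (auc(scores, [s[2] for s in test]), model[2])
    print(f"week {week}: AUC={results[week][0]:.2f}, "
          f"importance (videos, problems)={[round(v, 2) for v in model[2]]}")
```

Retraining one model per week, rather than a single model on the full course, mirrors the idea of checking how prediction quality and variable importance evolve as more interaction data become available.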
Acknowledgments
Work partially funded by the Madrid Regional Government with grant No. S2013/ICE-2715, the Spanish Ministry of Economy and Competitiveness projects RESET (TIN2014-53199-C3-1-R) and Flexor (TIN2014-52129-R) and the European Erasmus+ projects MOOC Maker (561533-EPP-1-2015-1-ES-EPPKA2-CBHE-JP) and SHEILA (562080-EPP-1-2015-BE-EPPKA3-PI-FORWARD). This research work was made possible thanks to Universidad Autónoma de Madrid, which provided us with the dataset, and to Prof. Pedro García, who was the instructor of the selected MOOC.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Ruipérez-Valiente, J.A., Cobos, R., Muñoz-Merino, P.J., Andujar, Á., Delgado Kloos, C. (2017). Early Prediction and Variable Importance of Certificate Accomplishment in a MOOC. In: Delgado Kloos, C., Jermann, P., Pérez-Sanagustín, M., Seaton, D., White, S. (eds) Digital Education: Out to the World and Back to the Campus. EMOOCs 2017. Lecture Notes in Computer Science, vol 10254. Springer, Cham. https://doi.org/10.1007/978-3-319-59044-8_31
DOI: https://doi.org/10.1007/978-3-319-59044-8_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59043-1
Online ISBN: 978-3-319-59044-8
eBook Packages: Computer Science, Computer Science (R0)