Early Prediction and Variable Importance of Certificate Accomplishment in a MOOC

  • Conference paper
  • In: Digital Education: Out to the World and Back to the Campus (EMOOCs 2017)

Abstract

The emergence of MOOCs (Massive Open Online Courses) has made available large amounts of data about students’ interactions with online educational platforms. These data make it possible to predict students’ future learning outcomes from their interactions. Predicting certificate accomplishment can enable the early detection of students at risk, so that interventions can be made before it is too late. This study applies several machine learning techniques to predict, at different timeframes during the course, which students will earn a certificate. The purpose is to analyze how the quality metrics change as the models have more data available. Of the four machine learning techniques applied, we select a boosted trees model, which provides stable predictions over the weeks with good quality metrics. We determine which variables are most important for the prediction and how their importance changes over the weeks of the course.
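
The week-by-week setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the feature names (videos_watched, problems_attempted, forum_posts) are hypothetical, and a small gradient-boosted ensemble of decision stumps written with the Python standard library stands in for the boosted trees model. For each week, a model is trained on the activity accumulated up to that week, then scored by AUC on held-out students, and each feature's share of the total split gain is reported as its importance.

```python
import math
import random

FEATURES = ["videos_watched", "problems_attempted", "forum_posts"]  # hypothetical names

def make_synthetic_course(n_students=300, n_weeks=4, seed=0):
    """Fake weekly activity and certificate labels; not the paper's dataset."""
    rng = random.Random(seed)
    weekly, labels = [], []
    for _ in range(n_students):
        engagement = rng.random()  # latent trait driving activity and outcome
        weekly.append([[rng.gauss(5 * engagement, 1.0),
                        rng.gauss(4 * engagement, 1.0),
                        rng.gauss(0.5, 1.0)]  # forum posts are pure noise here
                       for _ in range(n_weeks)])
        labels.append(1 if engagement + rng.gauss(0, 0.15) > 0.6 else 0)
    return weekly, labels

def cumulative_features(weekly, week):
    """Sum each student's activity over weeks 1..week."""
    return [[sum(rows[w][j] for w in range(week)) for j in range(len(FEATURES))]
            for rows in weekly]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Best single-split regression stump on the residuals (squared-error gain)."""
    n, total = len(X), sum(residuals)
    best = None  # (gain, feature, threshold, left_value, right_value)
    for j in range(len(X[0])):
        values = sorted(row[j] for row in X)
        for q in range(1, 10):  # nine candidate thresholds at the deciles
            t = values[q * n // 10]
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            if not left or len(left) == n:
                continue
            sl, nl = sum(left), len(left)
            sr, nr = total - sl, n - nl
            gain = sl * sl / nl + sr * sr / nr - total * total / n
            if best is None or gain > best[0]:
                best = (gain, j, t, sl / nl, sr / nr)
    return best

def fit_boosted_stumps(X, y, n_rounds=30, learning_rate=0.3):
    """Gradient boosting on the log-loss: each stump fits y - sigmoid(score)."""
    base = sum(y) / len(y)
    f0 = math.log(base / (1 - base))
    scores = [f0] * len(X)
    stumps, gains = [], [0.0] * len(X[0])
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        gain, j, t, lv, rv = stump
        gains[j] += max(gain, 0.0)
        stumps.append((j, t, lv, rv))
        scores = [s + learning_rate * (lv if row[j] <= t else rv)
                  for row, s in zip(X, scores)]
    total = sum(gains) or 1.0
    return {"f0": f0, "lr": learning_rate, "stumps": stumps,
            "importance": [g / total for g in gains]}

def predict_proba(model, row):
    score = model["f0"] + sum(model["lr"] * (lv if row[j] <= t else rv)
                              for j, t, lv, rv in model["stumps"])
    return sigmoid(score)

def auc(y_true, probs):
    """Probability that a random positive is ranked above a random negative."""
    pos = [p for p, y in zip(probs, y_true) if y == 1]
    neg = [p for p, y in zip(probs, y_true) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

weekly, labels = make_synthetic_course()
split = 200  # first 200 students train the model, the rest evaluate it
results = {}
for week in range(1, 5):
    X = cumulative_features(weekly, week)
    model = fit_boosted_stumps(X[:split], labels[:split])
    probs = [predict_proba(model, row) for row in X[split:]]
    results[week] = (auc(labels[split:], probs), dict(zip(FEATURES, model["importance"])))
    print(week, round(results[week][0], 3), results[week][1])
```

Retraining once per week on cumulative features mirrors the question posed in the abstract, namely how prediction quality and variable importance evolve as more interaction data become available. The leaf values here are plain residual means rather than Newton steps, a simplification chosen to keep the sketch short.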

Notes

  1. https://www.edx.org/course/la-espana-de-el-quijote-uamx-quijote501x-0

  2. http://edx.readthedocs.io/projects/devdata/en/latest/

Acknowledgments

Work partially funded by the Madrid Regional Government with grant No. S2013/ICE-2715, the Spanish Ministry of Economy and Competitiveness projects RESET (TIN2014-53199-C3-1-R) and Flexor (TIN2014-52129-R) and the European Erasmus+ projects MOOC Maker (561533-EPP-1-2015-1-ES-EPPKA2-CBHE-JP) and SHEILA (562080-EPP-1-2015-BE-EPPKA3-PI-FORWARD). This research work was made possible thanks to Universidad Autónoma de Madrid, which provided us with the dataset, and to Prof. Pedro García, who was the instructor of the selected MOOC.

Author information

Corresponding author

Correspondence to José A. Ruipérez-Valiente.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Ruipérez-Valiente, J.A., Cobos, R., Muñoz-Merino, P.J., Andujar, Á., Delgado Kloos, C. (2017). Early Prediction and Variable Importance of Certificate Accomplishment in a MOOC. In: Delgado Kloos, C., Jermann, P., Pérez-Sanagustín, M., Seaton, D., White, S. (eds) Digital Education: Out to the World and Back to the Campus. EMOOCs 2017. Lecture Notes in Computer Science, vol 10254. Springer, Cham. https://doi.org/10.1007/978-3-319-59044-8_31

  • DOI: https://doi.org/10.1007/978-3-319-59044-8_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59043-1

  • Online ISBN: 978-3-319-59044-8

  • eBook Packages: Computer Science, Computer Science (R0)