ABSTRACT
Online platforms for learning programming are now very popular. These platforms must automatically assess the code that learners submit and provide good-quality feedback to support their learning. Classical techniques for producing useful feedback include using unit testing frameworks to run systematic functional tests on the submitted code, or using code quality assessment tools. This paper explores how to automatically identify error classes by clustering a set of submitted programs, using code plagiarism detection tools to measure the similarity between them. The paper presents the proposed approach and analysis framework, along with a first experiment using the Code Hunt dataset.
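The idea of clustering submissions by a plagiarism-style similarity measure can be illustrated with a minimal sketch. It is not the paper's implementation: `difflib.SequenceMatcher` over normalized tokens stands in for a real plagiarism detection tool, the single-linkage grouping with a fixed threshold is an assumed clustering choice, and the names `normalize`, `similarity`, `cluster`, `KEYWORDS` and the threshold value `0.8` are all hypothetical.

```python
import re
from difflib import SequenceMatcher

# Assumption: a tiny keyword set for illustration; a real tool would use
# the full grammar of the target language.
KEYWORDS = {"if", "else", "for", "while", "return", "int"}


def normalize(source):
    """Map source text to a coarse token list so that renaming
    identifiers or changing literals does not affect similarity."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", source)
    normalized = []
    for tok in tokens:
        if tok in KEYWORDS:
            normalized.append(tok)
        elif tok[0].isalpha() or tok[0] == "_":
            normalized.append("ID")   # any identifier
        elif tok[0].isdigit():
            normalized.append("NUM")  # any integer literal
        else:
            normalized.append(tok)    # punctuation and operators
    return normalized


def similarity(a, b):
    """Stand-in for a plagiarism tool's score, in the range [0, 1]."""
    matcher = SequenceMatcher(None, normalize(a), normalize(b), autojunk=False)
    return matcher.ratio()


def cluster(submissions, threshold=0.8):
    """Group submissions whose pairwise similarity reaches the threshold
    (single linkage, implemented with a small union-find)."""
    parent = list(range(len(submissions)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(submissions)):
        for j in range(i + 1, len(submissions)):
            if similarity(submissions[i], submissions[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(submissions)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


submissions = [
    "int f(int x) { return x + 1; }",
    "int g(int y) { return y + 2; }",
    "int h(int n) { int s = 0; for (int i = 0; i < n; i = i + 1) { s = s + i; } return s; }",
]
clusters = cluster(submissions)
```

In this toy input the first two submissions differ only in names and a constant, so they fall into one group, while the loop-based submission forms its own group. Under this sketch, each group of failing submissions would be a candidate error class to inspect by hand.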
Index Terms
- Automatic programming error class identification with code plagiarism-based clustering