DOI: 10.1145/2993270.2993271
research-article

Automatic programming error class identification with code plagiarism-based clustering

Published: 14 November 2016

ABSTRACT

Online platforms for learning programming are now very popular. These platforms must automatically assess the code that learners submit and must provide good-quality feedback to support their learning. Classical techniques for producing useful feedback include using unit testing frameworks to run systematic functional tests on the submitted code, or using code quality assessment tools. This paper explores how to automatically identify error classes by clustering a set of submissions, using code plagiarism detection tools to measure the similarity between them. The paper presents the proposed approach and analysis framework, along with a first experiment using the Code Hunt dataset.
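
The idea of grouping submissions by pairwise similarity can be illustrated with a minimal sketch. It is not the paper's tooling or its clustering algorithm: the standard-library `difflib.SequenceMatcher` stands in for a plagiarism detection tool, and the names `similarity`, `cluster`, the 0.8 threshold, and the sample submissions are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Stand-in for a plagiarism-detection score in [0, 1] (assumption)."""
    return SequenceMatcher(None, a, b).ratio()


def cluster(submissions: list[str], threshold: float = 0.8) -> list[list[int]]:
    """Group submission indices linked by a similarity >= threshold.

    This is single-linkage grouping implemented with union-find; the
    threshold value is hypothetical.
    """
    parent = list(range(len(submissions)))

    def find(i: int) -> int:
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(submissions)), 2):
        if similarity(submissions[i], submissions[j]) >= threshold:
            parent[find(j)] = find(i)

    groups: dict[int, list[int]] = {}
    for i in range(len(submissions)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical submissions: the first two differ only in a variable name.
submissions = [
    "def f(x): return x + 1",
    "def f(y): return y + 1",
    "def f(x): return x * 2 if x > 0 else 0",
]
clusters = cluster(submissions)
```

With these sample inputs, the first two submissions fall into one group and the third stays on its own. In the setting the abstract describes, each resulting group would then be inspected as a candidate error class.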

        • Published in

          CHESE 2016: Proceedings of the 2nd International Code Hunt Workshop on Educational Software Engineering
          November 2016
          8 pages
          ISBN: 9781450344029
          DOI: 10.1145/2993270

          Copyright © 2016 ACM

          Publisher

          Association for Computing Machinery

          New York, NY, United States
