research-article

Automatic programming error class identification with code plagiarism-based clustering

Authors:
Sébastien Combéfis, École Centrale des Arts et Métiers, Belgium
Arnaud Schils, Université Catholique de Louvain, Belgium

CHESE 2016: Proceedings of the 2nd International Code Hunt Workshop on Educational Software Engineering, November 2016, Pages 1–6
https://doi.org/10.1145/2993270.2993271

Published: 14 November 2016

ABSTRACT

Online platforms for learning programming are very popular. These platforms must automatically assess the code submitted by learners and provide good-quality feedback to support their learning. Classical techniques for producing useful feedback include using unit testing frameworks to run systematic functional tests on the submitted code, or using code quality assessment tools. This paper explores how to automatically identify error classes by clustering a set of submissions, using code plagiarism detection tools to measure the similarity between them. The paper presents the proposed approach and analysis framework, along with a first experiment using the Code Hunt dataset.
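
The general idea of clustering submissions by pairwise similarity can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the plagiarism detection tool or the clustering algorithm, so the sketch assumes a simple Jaccard similarity over token 3-grams as a stand-in for the tool, and a single-linkage grouping with a fixed threshold as a stand-in for the clustering step. The function names `kgrams`, `similarity`, and `cluster`, and the parameters `k` and `threshold`, are hypothetical.

```python
import re
from itertools import combinations


def kgrams(code, k=3):
    """Return the set of k-grams over a crude token stream of `code`."""
    # Assumption: a naive regex tokenizer, not a real language lexer.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", code)
    return {tuple(tokens[i:i + k]) for i in range(max(len(tokens) - k + 1, 1))}


def similarity(a, b, k=3):
    """Jaccard similarity of token k-grams, in [0, 1].

    Stand-in for the score a plagiarism detection tool would report.
    """
    ga, gb = kgrams(a, k), kgrams(b, k)
    union = ga | gb
    return len(ga & gb) / len(union) if union else 1.0


def cluster(submissions, threshold=0.5):
    """Group submissions linked by any pair with similarity >= threshold.

    This is single-linkage grouping implemented with union-find;
    it returns sorted lists of submission indices.
    """
    parent = list(range(len(submissions)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(submissions)), 2):
        if similarity(submissions[i], submissions[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(submissions)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Calling `cluster(["return x + 1 ;", "return x + 1 ;", "while ( true ) { }"])` groups the two identical submissions together and leaves the third on its own. In a setting like the one the abstract describes, each resulting group of failing submissions would be a candidate error class to inspect by hand.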

Published in
CHESE 2016: Proceedings of the 2nd International Code Hunt Workshop on Educational Software Engineering
November 2016
8 pages
ISBN: 9781450344029
DOI: 10.1145/2993270
General Chairs: Chang Liu, Rishabh Singh
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Automatic Code Assessment
Code Similarity
Education