OverCode: Visualizing Variation in Student Solutions to Programming Problems at Scale

Authors:
Elena L. Glassman, MIT CSAIL, Cambridge, MA, USA
Jeremy Scott, MIT CSAIL, Cambridge, MA, USA
Rishabh Singh, MIT CSAIL, Kirkland, WA, USA
Philip J. Guo, MIT CSAIL and University of Rochester, Rochester, New York, USA
Robert C. Miller, MIT CSAIL, Cambridge, MA, USA

ACM Transactions on Computer-Human Interaction, Volume 22, Issue 2, Article No. 7, pp. 1–35. https://doi.org/10.1145/2699751

Published: 10 March 2015

Abstract

In MOOCs, a single programming exercise may produce thousands of solutions from learners. Understanding solution variation is important for providing appropriate feedback to students at scale. The wide variation among these solutions can be a source of pedagogically valuable examples and can be used to refine the autograder for the exercise by exposing corner cases. We present OverCode, a system for visualizing and exploring thousands of programming solutions. OverCode uses both static and dynamic analysis to cluster similar solutions, and lets teachers further filter and cluster solutions based on different criteria. We evaluated OverCode against a nonclustering baseline in a within-subjects study with 24 teaching assistants and found that the OverCode interface allows teachers to more quickly develop a high-level view of students' understanding and misconceptions, and to provide feedback that is relevant to more students' solutions.
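The idea of combining static and dynamic analysis to cluster solutions can be illustrated with a minimal sketch. This is not OverCode's actual pipeline, which the abstract does not specify; it is a simplified stand-in that assumes each solution defines one function with a known name. The static step compares syntax trees after renaming local variables, and the dynamic step compares outputs on a few test inputs. The helper names `static_key`, `dynamic_key`, and `cluster`, and the sample solutions, are hypothetical.

```python
import ast
from collections import defaultdict


def static_key(source):
    """Static analysis: AST dump with local names replaced by placeholders.

    Comments, whitespace, and variable naming no longer distinguish solutions.
    """
    tree = ast.parse(source)
    names = {}
    # Collect parameters and assigned variables in a deterministic traversal order.
    for node in ast.walk(tree):
        if isinstance(node, ast.arg):
            names.setdefault(node.arg, f"v{len(names)}")
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            names.setdefault(node.id, f"v{len(names)}")
    # Rename only those collected names; builtins such as sum() stay untouched.
    for node in ast.walk(tree):
        if isinstance(node, ast.arg):
            node.arg = names[node.arg]
        elif isinstance(node, ast.Name) and node.id in names:
            node.id = names[node.id]
    return ast.dump(tree)


def dynamic_key(source, func_name, test_inputs):
    """Dynamic analysis: the function's observable behavior on each test input."""
    namespace = {}
    # Assumption: the code is trusted. A real system would sandbox student code.
    exec(source, namespace)
    func = namespace[func_name]
    results = []
    for args in test_inputs:
        try:
            results.append(repr(func(*args)))
        except Exception as exc:
            results.append(type(exc).__name__)
    return tuple(results)


def cluster(solutions, func_name, test_inputs):
    """Group solution ids whose static and dynamic keys both match."""
    clusters = defaultdict(list)
    for solution_id, source in solutions.items():
        key = (static_key(source), dynamic_key(source, func_name, test_inputs))
        clusters[key].append(solution_id)
    return list(clusters.values())


# Hypothetical student solutions to "sum a list".
solutions = {
    "a": "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n",
    "b": "def total(nums):\n    result = 0\n    for n in nums:  # add each\n        result += n\n    return result\n",
    "c": "def total(xs):\n    return sum(xs)\n",
}
clusters = cluster(solutions, "total", [([1, 2, 3],), ([],)])
print(clusters)  # [['a', 'b'], ['c']]
```

Solutions `a` and `b` differ only in variable names and a comment, so they fall into one cluster; `c` behaves identically on the tests but has a different structure, so it stays separate. A teacher-facing tool could then offer further filtering on either key.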


Published in
ACM Transactions on Computer-Human Interaction, Volume 22, Issue 2
Special Issue on Online Learning at Scale
April 2015, 133 pages
ISSN: 1073-0516
EISSN: 1557-7325
DOI: 10.1145/2744768
Editor: Shumin Zhai, Google, Inc.
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 10 March 2015
- Revised: 1 October 2014
- Accepted: 1 October 2014
- Received: 1 June 2014
Author Tags
Programming education
learning at scale
Qualifiers
- research-article
- Research
- Refereed