ABSTRACT
We present a new method for automatically providing feedback for introductory programming problems. The method requires a reference implementation of the assignment and an error model consisting of potential corrections to errors that students might make. Using this information, the system automatically derives minimal corrections to students' incorrect solutions, providing them with a measure of exactly how incorrect a given solution was, as well as feedback about what they did wrong.
We introduce a simple language for describing error models in terms of correction rules, and formally define a rule-directed translation strategy that reduces the problem of finding minimal corrections in an incorrect program to the problem of synthesizing a correct program from a sketch. We have evaluated our system on thousands of real student attempts obtained from the Introduction to Programming course at MIT (6.00) and MITx (6.00x). Our results show that relatively simple error models can correct on average 64% of all incorrect submissions in our benchmark set.
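As a rough illustration of what "minimal corrections under an error model" means, the toy Python sketch below enumerates subsets of correction rules in order of increasing size and returns the first subset that makes a student program agree with the reference on a few test inputs. This is brute-force enumeration over textual rewrites, not the sketch-based synthesis described above; the names `reference`, `STUDENT_SRC`, `RULES`, `TESTS`, and `minimal_correction`, as well as the rules themselves, are assumptions made purely for this example.

```python
from itertools import combinations


def reference(xs):
    """Reference implementation (assumed for this example): sum of a list."""
    total = 0
    for i in range(len(xs)):
        total += xs[i]
    return total


# A hypothetical incorrect student submission with two independent mistakes.
STUDENT_SRC = """
def student(xs):
    total = 1
    for i in range(1, len(xs)):
        total += xs[i]
    return total
"""

# Hypothetical error model: (feedback message, text to find, replacement text).
RULES = [
    ("initialize total to 0", "total = 1", "total = 0"),
    ("start the loop at index 0", "range(1, len(xs))", "range(len(xs))"),
    ("index with i - 1", "xs[i]", "xs[i - 1]"),  # distractor rule
]

TESTS = [[], [5, 2], [1, 2, 3]]


def run(src, tests):
    """Execute the candidate source and collect its outputs on the tests."""
    env = {}
    exec(src, env)
    return [env["student"](list(t)) for t in tests]


def minimal_correction(src, rules, tests):
    """Return (cost, feedback) for the smallest rule subset that fixes src."""
    expected = [reference(list(t)) for t in tests]
    for k in range(len(rules) + 1):  # try smaller corrections first
        for subset in combinations(rules, k):
            candidate = src
            for _, old, new in subset:
                candidate = candidate.replace(old, new)
            try:
                if run(candidate, tests) == expected:
                    return k, [message for message, _, _ in subset]
            except Exception:
                continue  # a rewrite that crashes is not a valid correction
    return None  # the error model cannot repair this submission


result = minimal_correction(STUDENT_SRC, RULES, TESTS)
print(result)
# → (2, ['initialize total to 0', 'start the loop at index 0'])
```

The returned count plays the role of the "how incorrect" measure, and the rule messages play the role of the feedback. Agreement on a finite test list is a much weaker check than the equivalence with the reference implementation that the actual system targets, and exhaustive enumeration would not scale to realistic error models.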
Automated feedback generation for introductory programming assignments. PLDI '13.