ABSTRACT
In comparison to multiple choice or other recognition-oriented forms of assessment, short answer questions have been shown to offer greater value for both students and teachers; for students they can improve retention of knowledge, while for teachers they provide more insight into student understanding. Unfortunately, the same open-ended nature which makes them so valuable also makes them more difficult to grade at scale. To address this, we propose a cluster-based interface that allows teachers to read, grade, and provide feedback on large groups of answers at once. We evaluated this interface against an unclustered baseline in a within-subjects study with 25 teachers, and found that the clustered interface allows teachers to grade substantially faster, to give more feedback to students, and to develop a high-level view of students' understanding and misconceptions.
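The abstract does not describe how answers are grouped; as a rough illustration of the core idea only, the sketch below clusters a handful of hypothetical student answers with a generic TF-IDF and k-means pipeline so that a grader could mark or comment on each cluster as a unit. The data, pipeline, and parameters are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only: the paper does not specify its clustering method here,
# so this assumes a simple TF-IDF + k-means pipeline over hypothetical answers.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical short answers to a single question.
answers = [
    "The mitochondria produce energy for the cell",
    "Mitochondria make ATP, the cell's energy",
    "It controls what enters and leaves the cell",
    "The membrane regulates what goes in and out",
]

# Represent each answer as a TF-IDF vector.
vectors = TfidfVectorizer(stop_words="english").fit_transform(answers)

# Group similar answers so a teacher can grade each group at once.
n_clusters = 2
labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(vectors)

for cluster_id in range(n_clusters):
    print(f"Cluster {cluster_id}:")
    for answer, label in zip(answers, labels):
        if label == cluster_id:
            print("  -", answer)
```

In a grading interface built on this kind of grouping, a single grade or feedback comment applied to a cluster would propagate to every answer in it, which is what allows the reported speedup over grading answers one at a time.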