ABSTRACT
In comparison to multiple choice or other recognition-oriented forms of assessment, short answer questions have been shown to offer greater value for both students and teachers; for students they can improve retention of knowledge, while for teachers they provide more insight into student understanding. Unfortunately, the same open-ended nature which makes them so valuable also makes them more difficult to grade at scale. To address this, we propose a cluster-based interface that allows teachers to read, grade, and provide feedback on large groups of answers at once. We evaluated this interface against an unclustered baseline in a within-subjects study with 25 teachers, and found that the clustered interface allows teachers to grade substantially faster, to give more feedback to students, and to develop a high-level view of students' understanding and misconceptions.
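The abstract does not describe how answers are grouped; as a rough illustration of the core idea only, the sketch below clusters a handful of hypothetical student answers with a generic TF-IDF and k-means pipeline so that a grader could mark or comment on each cluster as a unit. The data, pipeline, and parameters are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only: the paper does not specify its clustering method here,
# so this assumes a simple TF-IDF + k-means pipeline over hypothetical answers.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical short answers to a single question.
answers = [
    "The mitochondria produce energy for the cell",
    "Mitochondria make ATP, the cell's energy",
    "It controls what enters and leaves the cell",
    "The membrane regulates what goes in and out",
]

# Represent each answer as a TF-IDF vector.
vectors = TfidfVectorizer(stop_words="english").fit_transform(answers)

# Group similar answers so a teacher can grade each group at once.
n_clusters = 2
labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(vectors)

for cluster_id in range(n_clusters):
    print(f"Cluster {cluster_id}:")
    for answer, label in zip(answers, labels):
        if label == cluster_id:
            print("  -", answer)
```

In a grading interface built on this kind of grouping, a single grade or feedback comment applied to a cluster would propagate to every answer in it, which is what allows the reported speedup over grading answers one at a time.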