Automated Scoring of Graphical Open-Ended Responses Using Artificial Neural Networks

by Matthias von Davier, et al.

Automated scoring of free drawings or images as responses has yet to be utilized in large-scale assessments of student achievement. In this study, we propose artificial neural networks to classify these types of graphical responses from a computer-based international mathematics and science assessment. We compare the classification accuracy of convolutional and feedforward approaches. Our results show that convolutional neural networks (CNNs) outperform feedforward neural networks in both loss and accuracy. The CNN models classified up to 97.71% of the image responses into the appropriate scoring category, which is comparable to, if not more accurate than, typical human raters. These findings were further strengthened by the observation that the most accurate CNN models correctly classified some image responses that had been incorrectly scored by the human raters. As an additional innovation, we outline a method to select human-rated responses for the training sample based on an application of the expected response function derived from item response theory. This paper argues that CNN-based automated scoring of image responses is a highly accurate procedure that could potentially replace second human raters, and the workload and cost they entail, in large-scale assessments, while improving the validity and comparability of scoring complex constructed-response items.
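The abstract does not spell out how the expected response function is used to pick training responses, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes a dichotomous item under a two-parameter logistic (2PL) model and keeps a human-rated response only when the human score lies close to the model-expected score. The function names, the item parameters `a` and `b`, the ability values, and the `max_residual` threshold are all hypothetical.

```python
import math


def expected_score(theta, a, b):
    """Expected score on a dichotomous item under a 2PL IRT model.

    theta: examinee ability; a: discrimination; b: difficulty.
    All three are illustrative assumptions, not values from the paper.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def select_for_training(responses, a, b, max_residual=0.5):
    """Return ids of responses whose human score agrees with the expectation.

    responses: list of (response_id, theta, human_score), human_score in {0, 1}.
    max_residual: hypothetical cutoff on |human_score - expected_score|.
    """
    selected = []
    for response_id, theta, human_score in responses:
        residual = abs(human_score - expected_score(theta, a, b))
        if residual < max_residual:
            selected.append(response_id)
    return selected


# Made-up example data: (id, ability estimate, human-assigned score).
responses = [("r1", 1.5, 1), ("r2", -1.0, 0), ("r3", 2.0, 0), ("r4", -2.0, 1)]
selected = select_for_training(responses, a=1.2, b=0.0)
```

With these made-up inputs, the high-ability examinee scored 0 and the low-ability examinee scored 1 are the ones left out, which is the kind of filtering that could keep likely mis-scored responses out of a training sample.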




Modeling and Analyzing Scorer Preferences in Short-Answer Math Questions

Automated scoring of student responses to open-ended questions, includin...

Neural network approach to classifying alarming student responses to online assessment

Automated scoring engines are increasingly being used to score the free-...

Using Active Learning Methods to Strategically Select Essays for Automated Scoring

Research on automated essay scoring has become increasingly important beca...

Comparing Human and Automated Evaluation of Open-Ended Student Responses to Questions of Evolution

Written responses can provide a wealth of data in understanding student ...

Using language models in the implicit automated assessment of mathematical short answer items

We propose a new way to assess certain short constructed responses to ma...

Modeling Item Response Theory with Stochastic Variational Inference

Item Response Theory (IRT) is a ubiquitous model for understanding human...

An Interpretable Deep Learning System for Automatically Scoring Request for Proposals

The Managed Care system within Medicaid (US Healthcare) uses Request For...
