Abstract: Automated methods are increasingly integrated into studies of formative feedback on students' science explanation writing. Most of this work, however, addresses students' responses to short answer questions. We investigate automated feedback on students' science explanation essays, where students must articulate multiple ideas. Feedback is based on a rubric that identifies the main ideas students are prompted to include in explanatory essays about the physics of energy and mass, given their experiments with a simulated roller coaster. We have found that students generally improve in revised versions of their essays. Here, however, we focus on two factors that affect the accuracy of the automated feedback. First, we find that the main ideas in the rubric differ in how much freedom they afford in explaining the idea: explanation of a natural law is relatively constrained, whereas students have more freedom in how they explain complex relations they observe in their roller coasters, such as transfer of different forms of energy. Second, by tracing the automated decision process, we can diagnose when a student's statement lacks sufficient clarity for the automated tool to associate it more strongly with one of the main ideas than with any of the others. This in turn provides an opportunity for teachers and peers to help students reflect on how to state their ideas more clearly.
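The second factor above can be illustrated with a minimal sketch. It is an assumption for illustration only, not the tool's actual decision process: the abstract does not say how the tool scores a statement against each main idea, so the per-idea similarity scores, the function name `diagnose`, and the `margin` threshold are all hypothetical.

```python
def diagnose(scores, margin=0.1):
    """Flag a statement whose best-matching main idea does not clearly win.

    scores: dict mapping a main-idea label to a similarity score for one
            student statement (hypothetical; at least two ideas required).
    margin: assumed minimum gap between the top two scores for the match
            to count as clear.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best_idea, best_score), (_, runner_up_score) = ranked[0], ranked[1]
    if best_score - runner_up_score < margin:
        # No single idea stands out: a candidate for teacher or peer feedback.
        return ("unclear", best_idea)
    return ("matched", best_idea)
```

A statement scored `{"law": 0.8, "transfer": 0.3}` would count as matched to `law`, while `{"law": 0.55, "transfer": 0.5}` would be flagged as unclear under the assumed margin.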
Abstract: With an increasing focus in STEM education on critical thinking skills, science writing plays an ever more important role in curricula that stress inquiry skills. A recently published dataset of two sets of college-level lab reports from an inquiry-based physics curriculum relies on analytic assessment rubrics with multiple dimensions that specify subject matter knowledge and general components of good explanations. Each analytic dimension is assessed on a 6-point scale, to provide detailed feedback to students that can help them improve their science writing skills. Manual assessment can be slow, and difficult to calibrate for consistency across all students in large classes. While much work exists on automated assessment of open-ended questions in STEM subjects, there has been far less work on long-form writing such as lab reports. We present VerAs, an end-to-end neural architecture that has separate verifier and assessment modules, inspired by approaches to Open Domain Question Answering (OpenQA). VerAs first verifies whether a report contains any content relevant to a given rubric dimension, and if so, assesses the relevant sentences. On the lab reports, VerAs outperforms multiple baselines based on OpenQA systems or Automated Essay Scoring (AES). VerAs also performs well on an analytic rubric for middle school physics essays.
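The verify-then-assess control flow described above can be sketched as follows. This is a toy illustration under stated assumptions, not the VerAs implementation: keyword overlap stands in for the neural verifier and assessor, the names `Dimension`, `relevant_sentences`, `assess`, and `score_report` are hypothetical, and the 6-point scale is assumed to run from 0 to 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    """One rubric dimension (hypothetical structure)."""
    name: str
    keywords: frozenset  # stand-in for a learned relevance model


def relevant_sentences(report, dim):
    """Verifier stand-in: keep sentences that share a keyword with the dimension."""
    sentences = [s.strip() for s in report.split(".") if s.strip()]
    return [s for s in sentences if dim.keywords & set(s.lower().split())]


def assess(sentences, dim, max_score=5):
    """Assessor stand-in: count distinct keywords covered, capped at max_score."""
    covered = set()
    for sentence in sentences:
        covered |= dim.keywords & set(sentence.lower().split())
    return min(max_score, len(covered))


def score_report(report, dim):
    """Verify first; assess only the sentences the verifier kept."""
    sentences = relevant_sentences(report, dim)
    if not sentences:
        return 0  # nothing relevant to this dimension (assumed lowest score)
    return assess(sentences, dim)
```

Separating the two steps means the assessor only sees sentences judged relevant to the dimension, so a report with no relevant content is scored without running the assessor at all.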