Abstract: Grading short-answer questions automatically, with interpretable reasoning behind the grading decision, is a challenging goal for current transformer approaches. Justification cue detection, in combination with logical reasoners, has shown a promising direction for neuro-symbolic architectures in automatic short answer grading (ASAG). However, one of the main challenges is the requirement of annotated justification cues in students' responses, which exist for only a few ASAG datasets. To overcome this challenge, we contribute (1) a weakly supervised annotation procedure for justification cues in ASAG datasets, and (2) a neuro-symbolic model for explainable ASAG based on justification cues. Our approach improves RMSE by 0.24 to 0.3 over the state of the art on the Short Answer Feedback dataset in a bilingual, multi-domain, and multi-question training setup. This result shows that our approach offers a promising direction for generating high-quality grades and accompanying explanations in future ASAG and educational NLP research.
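To make the idea of weakly supervised justification-cue annotation concrete, the sketch below shows one plausible labelling heuristic: marking student-answer tokens that overlap with the reference answer as cue candidates. The function name, whitespace tokenisation, and overlap rule are hypothetical illustrations for this abstract, not the paper's actual procedure, which may use alignment or model-based scoring instead.

```python
def weak_cue_labels(student_answer: str, reference_answer: str):
    """Toy weak-supervision heuristic (hypothetical, not the paper's method):
    label each student-answer token 1 (justification cue candidate) if it
    also appears in the reference answer, else 0."""
    ref_tokens = {tok.lower() for tok in reference_answer.split()}
    return [(tok, int(tok.lower() in ref_tokens))
            for tok in student_answer.split()]

# Example: tokens shared with the reference answer receive label 1.
print(weak_cue_labels("The mitochondria produce energy",
                      "Energy is produced by the mitochondria"))
```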
Abstract: Neural machine translation (NMT) is a widely popular text generation task, yet there is a considerable research gap in the development of privacy-preserving NMT models, despite significant data privacy concerns for NMT systems. Differentially private stochastic gradient descent (DP-SGD) is a popular method for training machine learning models with concrete privacy guarantees; however, the implementation specifics of training a model with DP-SGD are not always clarified in existing work: implementations differ across software libraries and code bases are not always public, leading to reproducibility issues. To tackle this, we introduce DP-NMT, an open-source framework for carrying out research on privacy-preserving NMT with DP-SGD, bringing together numerous models, datasets, and evaluation metrics in one systematic software package. Our goal is to provide a platform for researchers to advance the development of privacy-preserving NMT systems, keeping the specific details of the DP-SGD algorithm transparent and intuitive to implement. We run a set of experiments on datasets from both general and privacy-related domains to demonstrate our framework in use. We make our framework publicly available and welcome feedback from the community.
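For reference, the following minimal NumPy sketch shows the core DP-SGD update the abstract refers to: per-example gradient clipping followed by calibrated Gaussian noise. It illustrates the algorithm itself under assumed defaults (the function and parameter names are ours), not DP-NMT's actual implementation.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.1, clip_norm=1.0,
                noise_multiplier=1.0, rng=None):
    """One illustrative DP-SGD step (toy sketch, not DP-NMT's code):
    clip each per-example gradient to L2 norm `clip_norm`, sum them,
    add Gaussian noise with std `noise_multiplier * clip_norm`, then
    average and apply a plain SGD update."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose norm exceeds the clipping bound.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    summed = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=summed.shape)
    noisy_mean = (summed + noise) / len(per_example_grads)
    return params - lr * noisy_mean

# Example: update a 3-dimensional parameter vector from a batch of 4 gradients.
params = np.zeros(3)
grads = [np.random.default_rng(i).normal(size=3) for i in range(4)]
print(dp_sgd_step(params, grads))
```

The clipping bound and noise multiplier jointly determine the privacy guarantee; tracking the resulting (epsilon, delta) budget requires a separate privacy accountant, which is omitted here for brevity.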