Abstract: Recent advances in large language models (LLMs) have shown promise for scalable educational applications, but their use in dialog-based tutoring systems remains challenging due to the need for effective pedagogical strategies and the high costs associated with expert-curated datasets. Our study explores the use of smaller, more affordable LLMs for one-on-one tutoring in the context of solving reading comprehension problems. We developed a synthetic tutoring dialog dataset, which was evaluated by human teachers, and used it to fine-tune a smaller LLM. Furthermore, we conducted an interactive experiment comparing the performance of the fine-tuned model with a larger model in real-world tutoring scenarios. Our results show that the fine-tuned model performs on par with the larger model at a lower cost, demonstrating a viable, cost-effective approach for implementing LLM-based tutoring systems in educational settings.
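The fine-tuning step described above can be illustrated with a minimal sketch: supervised fine-tuning of a smaller causal LLM on teacher-evaluated synthetic tutoring dialogs. The model name, data file, and hyperparameters below are illustrative assumptions, not the study's exact configuration.

```python
# Minimal sketch (assumptions: a 'text' field per dialog, GPT-2-class base model,
# JSON-lines data file); not the paper's exact training setup.
import json

from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "gpt2"  # placeholder for the smaller, more affordable LLM

def load_dialogs(path: str) -> Dataset:
    """Load synthetic tutoring dialogs stored as JSON lines with a 'text' field."""
    with open(path, encoding="utf-8") as f:
        return Dataset.from_list([json.loads(line) for line in f])

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

dataset = load_dialogs("synthetic_tutoring_dialogs.jsonl")  # hypothetical file name
tokenized = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
    remove_columns=dataset.column_names,
)

# Standard causal-LM fine-tuning: the collator shifts labels from the inputs.
trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="tutor-sft",
        per_device_train_batch_size=2,
        num_train_epochs=3,
        learning_rate=2e-5,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```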
Abstract: Automatic short answer scoring (ASAS) helps reduce the grading burden on educators but often lacks detailed, explainable feedback. Existing methods in ASAS with feedback (ASAS-F) rely on fine-tuning language models with limited datasets, which is resource-intensive and struggles to generalize across contexts. Recent approaches using large language models (LLMs) have focused on scoring without extensive fine-tuning. However, they often rely heavily on prompt engineering and either fail to generate elaborated feedback or do not adequately evaluate it. In this paper, we propose a modular retrieval-augmented generation (RAG) based ASAS-F system that scores answers and generates feedback in strict zero-shot and few-shot learning scenarios. We design the system to adapt to various educational tasks without extensive prompt engineering, using an automatic prompt generation framework. Results show a 9\% improvement in scoring accuracy on unseen questions compared to fine-tuning, offering a scalable and cost-effective solution.
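The few-shot retrieval-augmented scoring step can likewise be sketched: retrieve the most similar previously scored answers and assemble a scoring-plus-feedback prompt for an LLM. The embedding model, example answer bank, and prompt wording below are illustrative assumptions rather than the paper's exact pipeline.

```python
# Minimal sketch of retrieval-augmented few-shot prompting for ASAS-F
# (assumptions: sentence-transformers retrieval, a small scored answer bank).
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed retrieval encoder

# Hypothetical bank of already-scored reference answers used as few-shot examples.
reference_bank = [
    {"answer": "Photosynthesis converts light energy into chemical energy.", "score": 2},
    {"answer": "Plants make food using sunlight, water, and CO2.", "score": 2},
    {"answer": "Plants eat soil to grow.", "score": 0},
]

def build_prompt(question: str, student_answer: str, k: int = 2) -> str:
    """Retrieve the k most similar scored answers and build a score-and-feedback prompt."""
    bank_texts = [r["answer"] for r in reference_bank]
    sims = util.cos_sim(
        embedder.encode(student_answer, convert_to_tensor=True),
        embedder.encode(bank_texts, convert_to_tensor=True),
    )[0]
    top = sims.argsort(descending=True)[:k].tolist()

    examples = "\n".join(
        f"Answer: {reference_bank[i]['answer']}\nScore: {reference_bank[i]['score']}"
        for i in top
    )
    return (
        f"Question: {question}\n\n"
        f"Scored examples:\n{examples}\n\n"
        f"Student answer: {student_answer}\n"
        "Assign a score and explain your reasoning as feedback to the student."
    )

# The assembled prompt can then be sent to any instruction-tuned LLM for
# zero- or few-shot scoring with elaborated feedback.
print(build_prompt("Explain photosynthesis.", "Plants turn sunlight into energy."))
```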