Automated Assessment of Students' Code Comprehension using LLMs
Format: Article
Language: English
Online access: Order full text
Abstract: Assessing students' answers, and in particular natural language answers, is a crucial challenge in the field of education. Advances in machine learning, including transformer-based models such as Large Language Models (LLMs), have led to significant progress in various natural language tasks. Nevertheless, amidst the growing trend of evaluating LLMs across diverse tasks, evaluating LLMs in the realm of automated answer assessment has not received much attention. To address this gap, we explore the potential of using LLMs for automated assessment of students' short and open-ended answers. In particular, we use LLMs to compare students' explanations with expert explanations in the context of line-by-line explanations of computer programs. For comparison purposes, we assess both Large Language Models (LLMs) and encoder-based Semantic Textual Similarity (STS) models in the context of assessing the correctness of students' explanations of computer code. Our findings indicate that LLMs, when prompted in few-shot and chain-of-thought settings, perform comparably to fine-tuned encoder-based models in evaluating students' short answers in the programming domain.
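As a rough illustration of the two approaches the abstract contrasts, the sketch below pairs an encoder-based STS similarity score with a few-shot, chain-of-thought style assessment prompt. The model checkpoint, example explanations, and prompt wording are illustrative assumptions, not the authors' actual setup.

```python
# Minimal sketch (not the paper's code) of assessing a student's line-by-line
# code explanation against an expert reference explanation.
from sentence_transformers import SentenceTransformer, util

expert = "The loop iterates over the list and accumulates the sum of its elements."
student = "This line goes through every item in the list and adds it to a running total."

# 1) Encoder-based STS baseline: embed both explanations and score the student
#    answer by cosine similarity to the expert reference (checkpoint is assumed).
sts_model = SentenceTransformer("all-MiniLM-L6-v2")
emb = sts_model.encode([expert, student], convert_to_tensor=True)
similarity = util.cos_sim(emb[0], emb[1]).item()
print(f"STS similarity: {similarity:.2f}")  # e.g. judge correct above a tuned threshold

# 2) LLM-as-assessor: a few-shot, chain-of-thought style prompt asking the model
#    to reason about the student explanation before giving a verdict.
prompt = f"""You grade line-by-line explanations of computer programs.

Example:
Expert: "The condition checks whether the index is past the end of the array."
Student: "It tests if we ran off the array."
Reasoning: The student conveys the same bounds check in informal wording.
Verdict: correct

Now grade:
Expert: "{expert}"
Student: "{student}"
Reasoning:"""
# `prompt` would be sent to an LLM of choice; the generated reasoning and final
# verdict are then parsed to obtain the assessment.
```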
DOI: 10.48550/arxiv.2401.05399