G-SciEdBERT: A Contextualized LLM for Science Assessment Tasks in German
Saved in:
Main Authors: , , , ,
Format: Article
Language: English
Online Access: Order full text
Abstract:
The advancement of natural language processing has paved the way for automated scoring systems in various languages, such as German (e.g., German BERT [G-BERT]). Automatically scoring written responses to science questions in German is a complex task and challenging for standard G-BERT, as it lacks contextual knowledge in the science domain and may be misaligned with student writing styles. This paper presents a contextualized German Science Education BERT (G-SciEdBERT), an innovative large language model tailored for scoring German-written responses to science tasks and beyond. Using G-BERT as the base model, we pre-trained G-SciEdBERT on a corpus of 30K German-written science responses (3M tokens) from the Programme for International Student Assessment (PISA) 2018. We fine-tuned G-SciEdBERT on an additional 20K student-written responses (2M tokens) and examined its scoring accuracy. We then compared its scoring performance with G-BERT. Our findings revealed a substantial improvement in scoring accuracy with G-SciEdBERT, demonstrating a 10.2% increase in quadratic weighted kappa compared to G-BERT (mean difference = 0.1026, SD = 0.069). These insights underline the significance of specialized language models like G-SciEdBERT, which is trained to enhance the accuracy of contextualized automated scoring, offering a substantial contribution to the field of AI in education.
DOI: 10.48550/arxiv.2402.06584
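
The abstract outlines a two-stage recipe: continued pre-training of G-BERT on domain text, followed by fine-tuning for automated scoring, with quadratic weighted kappa (QWK) as the agreement metric. The sketch below illustrates that workflow with Hugging Face Transformers and scikit-learn; it is not the authors' code, and the corpus file name, number of score levels, and hyperparameters are illustrative assumptions rather than details from the paper.

```python
# Minimal sketch (assumed workflow, not the authors' code) of the recipe the
# abstract describes: continued masked-language-model pre-training of G-BERT on
# German science responses, then fine-tuning for score classification, with
# quadratic weighted kappa (QWK) as the agreement metric.
from datasets import load_dataset
from sklearn.metrics import cohen_kappa_score
from transformers import (
    AutoModelForMaskedLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE = "bert-base-german-cased"  # publicly available G-BERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(BASE)

# Stage 1: continued pre-training on a domain corpus (file name is hypothetical).
corpus = load_dataset("text", data_files={"train": "pisa2018_science_responses.txt"})
tokenized = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)
mlm_model = AutoModelForMaskedLM.from_pretrained(BASE)
Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="g-sciedbert-mlm", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
).train()
mlm_model.save_pretrained("g-sciedbert-mlm")
tokenizer.save_pretrained("g-sciedbert-mlm")

# Stage 2: fine-tune the adapted encoder as a response scorer
# (the number of score levels here is an assumption).
scorer = AutoModelForSequenceClassification.from_pretrained("g-sciedbert-mlm", num_labels=3)
# ... fine-tune `scorer` on the labeled student responses with a second Trainer ...

# Evaluation: quadratic weighted kappa between human and model-assigned scores.
def qwk(human_scores, predicted_scores):
    return cohen_kappa_score(human_scores, predicted_scores, weights="quadratic")
```

QWK penalizes disagreements by the squared distance between score levels, which makes it a natural fit for ordinal rubric scores and explains its use as the comparison metric in the abstract.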