Eliciting Better Multilingual Structured Reasoning from LLMs through Code
Format: Article
Language: English
Abstract: The development of large language models (LLMs) has shown progress on reasoning, though studies have largely considered either English or simple reasoning tasks. To address this, we introduce a multilingual structured reasoning and explanation dataset, termed xSTREET, that covers four tasks across six languages. xSTREET exposes a gap in base LLM performance between English and non-English reasoning tasks.

We then propose two methods to remedy this gap, building on the insight that LLMs trained on code are better reasoners. First, at training time, we augment a code dataset with multilingual comments using machine translation while keeping program code as-is. Second, at inference time, we bridge the gap between training and inference by employing a prompt structure that incorporates step-by-step code primitives to derive new facts and find a solution. Our methods show improved multilingual performance on xSTREET, most notably on the scientific commonsense reasoning subtask. Furthermore, the models show no regression on non-reasoning tasks, thus demonstrating that our techniques maintain general-purpose abilities.
DOI: 10.48550/arxiv.2403.02567
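The abstract describes the two proposed techniques only at a high level. The following Python sketch illustrates the training-time idea of machine-translating only the natural-language comments in a code corpus while leaving the program code untouched; the regex-based comment extraction and the `translate_text` callable are illustrative assumptions, not the paper's released pipeline.

```python
# Sketch of the training-time augmentation described in the abstract:
# translate only the '#' comments of a code example into a target
# language and copy all executable code verbatim. `translate_text` is a
# placeholder for any machine-translation backend (an assumption).

import re
from typing import Callable

COMMENT_RE = re.compile(r"(#.*)$", re.MULTILINE)

def augment_example(source: str,
                    translate_text: Callable[[str, str], str],
                    target_lang: str) -> str:
    """Return `source` with every '#' comment machine-translated into
    `target_lang`. Note: the regex is deliberately naive and would also
    match '#' inside string literals."""
    def _translate(match: re.Match) -> str:
        comment_body = match.group(1).lstrip("#").strip()
        return "# " + translate_text(comment_body, target_lang)
    return COMMENT_RE.sub(_translate, source)

if __name__ == "__main__":
    # A stand-in "translator" that only tags the text, to show the call pattern.
    augmented = augment_example(
        "x = 2 + 2  # add two numbers\n",
        translate_text=lambda text, lang: f"[{lang}] {text}",
        target_lang="de",
    )
    print(augmented)  # x = 2 + 2  # [de] add two numbers
```

Restricting translation to comments keeps every augmented example syntactically valid code, which is the property the training-time method relies on. The inference-time method can similarly be pictured as a prompt in which each reasoning step is written as a small code primitive that derives a new fact. The primitive names (`fact_1`, `answer`), the worked example, and `build_prompt` below are assumptions for illustration only, not the paper's exact prompt.

```python
# Sketch of a prompt whose few-shot example expresses reasoning steps as
# code primitives that derive new facts before computing the answer.

FEW_SHOT_EXAMPLE = """\
# Question: Tom had 3 apples and bought 2 more. How many apples does he have now?
fact_1 = 3                 # apples Tom started with
fact_2 = 2                 # apples Tom bought
answer = fact_1 + fact_2   # derive the new fact: 5
"""

def build_prompt(question: str) -> str:
    """Prepend the code-style worked example and ask the model to follow
    the same step-by-step structure for the new question."""
    return (
        FEW_SHOT_EXAMPLE
        + f"\n# Question: {question}\n"
        + "# Derive intermediate facts as code, then compute `answer`.\n"
    )

if __name__ == "__main__":
    print(build_prompt("A train travels 60 km per hour. How far does it go in 3 hours?"))
```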