STaR: Bootstrapping Reasoning With Reasoning
Main authors: Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman
Format: Article
Language: English
Online access: Order full text
Abstract: Generating step-by-step "chain-of-thought" rationales improves language model
performance on complex reasoning tasks like mathematics or commonsense
question-answering. However, inducing language model rationale generation
currently requires either constructing massive rationale datasets or
sacrificing accuracy by using only few-shot inference. We propose a technique
to iteratively leverage a small number of rationale examples and a large
dataset without rationales, to bootstrap the ability to perform successively
more complex reasoning. This technique, the "Self-Taught Reasoner" (STaR),
relies on a simple loop: generate rationales to answer many questions, prompted
with a few rationale examples; if the generated answers are wrong, try again to
generate a rationale given the correct answer; fine-tune on all the rationales
that ultimately yielded correct answers; repeat. We show that STaR
significantly improves performance on multiple datasets compared to a model
fine-tuned to directly predict final answers, and performs comparably to
fine-tuning a 30$\times$ larger state-of-the-art language model on
CommonsenseQA. Thus, STaR lets a model improve itself by learning from its own
generated reasoning.
DOI: 10.48550/arxiv.2203.14465
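
The abstract describes the STaR loop only in prose. As a rough illustration, here is a minimal Python sketch of that outer loop. The function name `star_loop`, the helpers `generate_rationale`, `extract_answer`, and `finetune`, and the iteration count are hypothetical stand-ins for the model calls and training step the paper describes, not the authors' code; `generate_rationale` is assumed to include the few rationale examples in its prompt.

```python
from typing import Callable, List, Tuple

def star_loop(
    model,
    dataset: List[Tuple[str, str]],  # (question, gold_answer) pairs, no rationales
    generate_rationale: Callable,    # (model, question, hint=None) -> rationale text
    extract_answer: Callable,        # rationale text -> predicted final answer
    finetune: Callable,              # (model, [(question, rationale)]) -> fine-tuned model
    n_iterations: int = 5,
):
    """One possible reading of the STaR outer loop from the abstract."""
    for _ in range(n_iterations):
        keep = []
        for question, gold in dataset:
            # Generate a rationale and answer, prompted with a few rationale examples.
            rationale = generate_rationale(model, question)
            if extract_answer(rationale) == gold:
                keep.append((question, rationale))
                continue
            # Wrong answer: try again with the correct answer given as a hint
            # ("rationalization" in the paper), keeping the result only if the
            # rationale now reaches the gold answer.
            hinted = generate_rationale(model, question, hint=gold)
            if extract_answer(hinted) == gold:
                keep.append((question, hinted))
        # Fine-tune on all rationales that ultimately yielded correct answers,
        # then repeat with the improved model.
        model = finetune(model, keep)
    return model
```

The abstract leaves open where each round of fine-tuning starts from; this sketch simply carries the model forward, whereas the paper restarts fine-tuning from the original pre-trained model at every iteration to limit overfitting.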