It's Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning
Saved in:
Main authors: | , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Abstract: | Chain-of-thought (COT) prompting can help large language models (LLMs) reason
toward correct answers, but its efficacy in reasoning toward incorrect answers
is unexplored. This process of elimination (PoE), when used with COT, can
enhance self-consistency, interpretability, and tasks such as medical diagnoses
of exclusion. Thus, we propose PoE with COT, where LLMs must reason toward
incorrect options on multiple-choice questions. We evaluate the ability of
GPT-3.5, LLaMA-2, and Falcon to perform PoE with COT on a total of four
commonsense and scientific reasoning datasets. We find that the strategy of PoE
always underperforms the strategy of choosing the correct answer. The agreement
of these strategies is also lower than the self-consistency of each strategy.
To study these issues further, we conduct error analyses and give suggestions
for future work. |
---|---|
DOI: | 10.48550/arxiv.2311.07532 |
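The abstract above contrasts two prompting strategies on multiple-choice questions: standard chain-of-thought reasoning toward the correct option, and process-of-elimination (PoE) reasoning toward the incorrect options. The paper's exact prompt templates are not reproduced in this record, so the sketch below is illustrative only; the question, options, and wording are assumptions made for the example.

```python
# Minimal sketch of the two prompting strategies described in the abstract.
# The prompt wording and the example question are assumptions, not the paper's templates.

QUESTION = "Which gas do plants primarily absorb for photosynthesis?"
OPTIONS = {"A": "Oxygen", "B": "Carbon dioxide", "C": "Nitrogen", "D": "Helium"}


def format_options(options: dict[str, str]) -> str:
    """Render the answer options as lettered lines."""
    return "\n".join(f"{label}. {text}" for label, text in options.items())


def cot_prompt(question: str, options: dict[str, str]) -> str:
    """Standard chain-of-thought: reason step by step toward the correct option."""
    return (
        f"Question: {question}\n{format_options(options)}\n"
        "Think step by step, then answer with the letter of the correct option."
    )


def poe_cot_prompt(question: str, options: dict[str, str]) -> str:
    """Process of elimination with chain-of-thought: reason toward the incorrect options."""
    return (
        f"Question: {question}\n{format_options(options)}\n"
        "Think step by step about why each option is wrong, then answer with "
        "the letters of the options you would eliminate as incorrect."
    )


if __name__ == "__main__":
    print(cot_prompt(QUESTION, OPTIONS))
    print()
    print(poe_cot_prompt(QUESTION, OPTIONS))
```

Under this framing, the paper's comparison amounts to scoring whether the eliminated set equals all options except the gold answer (PoE) versus whether the chosen letter equals the gold answer (standard CoT), and measuring how often the two strategies agree.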