Augmenting Math Word Problems via Iterative Question Composing
Format: Article
Language: English
Abstract: Despite the advancements in large language models (LLMs) for mathematical reasoning, solving competition-level math problems remains a significant challenge, especially for open-source LLMs without external tools. We introduce the MMIQC dataset, comprising a mixture of processed web data and synthetic question-response pairs, aimed at enhancing the mathematical reasoning capabilities of base language models. Models fine-tuned on MMIQC consistently surpass their counterparts in performance on the MATH benchmark across various model sizes. Notably, Qwen-72B-MMIQC achieves 45.0% accuracy, exceeding the previous open-source state of the art by 8.2% and outperforming the initial version of GPT-4 released in 2023. Extensive evaluation results on the Hungarian high school finals suggest that this improvement generalizes to unseen data. Our ablation study on MMIQC reveals that a large part of the improvement can be attributed to our novel augmentation method, Iterative Question Composing (IQC), which iteratively composes new questions from seed problems using an LLM and applies rejection sampling through another LLM.
DOI: 10.48550/arxiv.2401.09003
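
The abstract's final sentence outlines the IQC procedure at a high level. The sketch below illustrates one plausible reading of that loop: a composer LLM writes a new question from each seed problem, a second LLM samples several candidate solutions, and the question-response pair is kept only when the sampled answers agree. The `generate` helper, the prompt wording, and the majority-vote acceptance rule are illustrative assumptions, not the paper's exact implementation.

```python
def generate(model: str, prompt: str, n: int = 1) -> list[str]:
    """Placeholder for an LLM API call returning n sampled completions.
    Hypothetical helper; wire this to your own model or API."""
    raise NotImplementedError

def iqc(seed_problems: list[str], iterations: int = 3, k: int = 4) -> list[tuple[str, str]]:
    """Sketch of Iterative Question Composing: compose questions from seeds,
    filter them by rejection sampling, and feed survivors into the next round."""
    dataset: list[tuple[str, str]] = []  # accepted (question, response) pairs
    pool = list(seed_problems)
    for _ in range(iterations):
        next_pool = []
        for problem in pool:
            # Step 1: a composer LLM writes a new question based on the seed.
            question = generate(
                "composer-llm",
                f"Compose a new, self-contained math question derived from:\n{problem}",
            )[0]
            # Step 2: rejection sampling with a second LLM -- draw k solutions
            # and accept the question only if a majority agree on the answer.
            solutions = generate(
                "verifier-llm",
                f"Solve step by step, ending with 'Answer: <value>'.\n{question}",
                n=k,
            )
            answers = [s.rsplit("Answer:", 1)[-1].strip() for s in solutions]
            best = max(set(answers), key=answers.count)
            if answers.count(best) > k // 2:
                dataset.append((question, solutions[answers.index(best)]))
                next_pool.append(question)  # accepted questions seed the next round
        pool = next_pool
    return dataset
```

Feeding accepted questions back in as seeds (`next_pool` above) is what makes the composition iterative: each round operates on the previous round's output rather than only on the original seed problems.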