CoachLM: Automatic Instruction Revisions Improve the Data Quality in LLM Instruction Tuning
Format: | Article |
---|---|
Language: | English |
Abstract: | Instruction tuning is crucial for enabling Large Language Models (LLMs) to
respond to human instructions. The quality of instruction pairs used for
tuning greatly affects the performance of LLMs. However, the manual creation of
high-quality instruction datasets is costly, leading to the adoption of
automatic generation of instruction pairs by LLMs as a popular alternative. To
ensure the high quality of LLM-generated instruction datasets, several
approaches have been proposed. Nevertheless, existing methods either compromise
dataset integrity by filtering a large proportion of samples, or are unsuitable
for industrial applications. In this paper, instead of discarding low-quality
samples, we propose CoachLM, a novel approach to enhance the quality of
instruction datasets through automatic revisions of samples in the dataset.
CoachLM is trained on samples revised by human experts and significantly
increases the proportion of high-quality samples in the dataset from 17.7% to
78.9%. The effectiveness of CoachLM is further assessed on various real-world
instruction test sets. The results show that CoachLM improves the
instruction-following capabilities of the instruction-tuned LLM by an average
of 29.9%, which even surpasses larger LLMs with nearly twice the number of
parameters. Furthermore, CoachLM is successfully deployed in a data management
system for LLMs at Huawei, resulting in an efficiency improvement of up to 20%
in the cleaning of 40k real-world instruction pairs. We release various assets
of CoachLM, including the training data, code and test set
(https://github.com/lunyiliu/CoachLM). |
---|---|
DOI: | 10.48550/arxiv.2311.13246 |