Zephyr: Direct Distillation of LM Alignment

We aim to produce a smaller language model that is aligned to user intent. Previous research has shown that applying distilled supervised fine-tuning (dSFT) on larger models significantly improves task accuracy; however, these models are unaligned, i.e., they do not respond well to natural prompts. T...

Bibliographic Details
Published in: arXiv.org 2023-10
Main Authors: Tunstall, Lewis; Beeching, Edward; Lambert, Nathan; Rajani, Nazneen; Rasul, Kashif; Belkada, Younes; Huang, Shengyi; von Werra, Leandro; Fourrier, Clémentine; Habib, Nathan; Sarrazin, Nathan; Sanseviero, Omar; Rush, Alexander M.; Wolf, Thomas
Format: Article
Language: English
Online Access: Full text