Training Language Models to Self-Correct via Reinforcement Learning

Self-correction is a highly desirable capability of large language models (LLMs), yet it has consistently been found to be largely ineffective in modern LLMs. Current methods for training self-correction typically depend on either multiple models, a more advanced model, or additional forms of superv...

Bibliographic details
Main authors: Kumar, Aviral; Zhuang, Vincent; Agarwal, Rishabh; Su, Yi; Co-Reyes, John D; Singh, Avi; Baumli, Kate; Iqbal, Shariq; Bishop, Colton; Roelofs, Rebecca; Zhang, Lei M; McKinney, Kay; Shrivastava, Disha; Paduraru, Cosmin; Tucker, George; Precup, Doina; Behbahani, Feryal; Faust, Aleksandra
Format: Article
Language: English