Scaling Laws for Autoregressive Generative Modeling

We identify empirical scaling laws for the cross-entropy loss in four domains: generative image modeling, video modeling, multimodal image$\leftrightarrow$text models, and mathematical problem solving. In all cases autoregressive Transformers smoothly improve in performance as model size and compute...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Henighan, Tom, Kaplan, Jared, Katz, Mor, Chen, Mark, Hesse, Christopher, Jackson, Jacob, Jun, Heewoo, Brown, Tom B, Dhariwal, Prafulla, Gray, Scott, Hallacy, Chris, Mann, Benjamin, Radford, Alec, Ramesh, Aditya, Ryder, Nick, Ziegler, Daniel M, Schulman, John, Amodei, Dario, McCandlish, Sam
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!