ALLOCATING COMPUTING RESOURCES BETWEEN MODEL SIZE AND TRAINING DATA DURING TRAINING OF A MACHINE LEARNING MODEL

Bibliographic details
Main authors: Sifre, Laurent; Borgeaud Dit Avocat, Sebastian; Hoffmann, Jordan; Mensch, Arthur
Format: Patent
Language: English
Description
Summary: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a machine learning model to perform a machine learning task. In one aspect, a method performed by one or more computers is described. The method includes: obtaining data defining a compute budget that characterizes an amount of computing resources allocated for training a machine learning model to perform a machine learning task; processing the data defining the compute budget using an allocation mapping, in accordance with a set of allocation mapping parameters, to generate an allocation tuple defining: (i) a target model size for the machine learning model, and (ii) a target amount of training data for training the machine learning model; instantiating the machine learning model, where the machine learning model has the target model size; and obtaining the target amount of training data for training the machine learning model.
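
The mapping step described in the summary can be illustrated with a minimal Python sketch. The summary does not state the functional form of the allocation mapping or the values of its parameters, so everything concrete here is an assumption made for illustration: the power-law form, the names `AllocationParams`, `AllocationTuple`, and `allocate`, and the units (model size as a parameter count, training data as a token count).

```python
from dataclasses import dataclass
from typing import NamedTuple


@dataclass(frozen=True)
class AllocationParams:
    """Hypothetical allocation mapping parameters (not given in the source)."""
    size_coeff: float     # scales the target model size
    size_exponent: float  # exponent applied to the compute budget for model size
    data_coeff: float     # scales the target amount of training data
    data_exponent: float  # exponent applied to the compute budget for data


class AllocationTuple(NamedTuple):
    """Output of the mapping: (i) target model size, (ii) target training data."""
    target_model_size: int     # assumed unit: number of model parameters
    target_training_data: int  # assumed unit: number of training tokens


def allocate(compute_budget: float, params: AllocationParams) -> AllocationTuple:
    """Map a compute budget to an allocation tuple.

    Assumes a power-law mapping, coeff * budget ** exponent, for each output.
    This form is an illustrative choice, not the mapping claimed in the patent.
    """
    if compute_budget <= 0:
        raise ValueError("compute_budget must be positive")
    size = round(params.size_coeff * compute_budget ** params.size_exponent)
    data = round(params.data_coeff * compute_budget ** params.data_exponent)
    return AllocationTuple(target_model_size=size, target_training_data=data)
```

A caller would then instantiate a model with `target_model_size` parameters and gather `target_training_data` units of data, as the final two steps of the summary describe. Under this assumed form, the two exponents determine how any increase in compute budget is divided between a larger model and more data.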