Accelerated training method, device and equipment for large model and storage medium

Bibliographic details
Main authors: XU XUEFAN, ZHOU ZHENGMAO, MU YUZHI, ZHANG JIAN, YE ZAISEN, CHEN ZHIGANG, HAN WEI, WANG ZIHAO, WANG ZI
Format: Patent
Language: chi; eng
Description
Summary: The invention discloses an accelerated training method, device, equipment and storage medium for a large model, and relates to the technical field of large models. The method comprises: sharding and bucketing the sample text data in a sample text data set according to a word segmentation model and the number of processes in the large model's distributed training, to obtain new text shard files; loading, from the new text shard files, the process text word metadata corresponding to each training process; for each training iteration step in the distributed training, and for each training process in that step, training the initial large model with the process text word metadata of that training process in that step and the label data of that metadata, to obtain the process training loss of that training process in that step; and according to the process training loss of each training process in
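
As an illustration only, the sharding and bucketing step named in the summary might be sketched as follows. The whitespace tokenizer standing in for the word segmentation model, the length-based buckets, the `bucket_width` parameter and the round-robin assignment of samples to processes are all assumptions made for this sketch; the record does not say how the patent performs these steps.

```python
from collections import defaultdict


def tokenize(text):
    # Hypothetical stand-in for the word segmentation model: whitespace split.
    return text.split()


def shard_and_bucket(samples, num_processes, bucket_width=4):
    """Return shards[rank][bucket_id] -> list of token lists.

    Assumption: samples are bucketed by token count, and each bucket is
    dealt out round-robin so every process sees similar sequence lengths.
    """
    shards = [defaultdict(list) for _ in range(num_processes)]
    next_slot = defaultdict(int)  # per-bucket round-robin counter
    for text in samples:
        tokens = tokenize(text)
        bucket_id = len(tokens) // bucket_width
        rank = next_slot[bucket_id] % num_processes
        next_slot[bucket_id] += 1
        shards[rank][bucket_id].append(tokens)
    return shards


samples = ["a b c", "d e", "f g h i j", "k l m n o p", "q", "r s t u"]
shards = shard_and_bucket(samples, num_processes=2)

for rank, buckets in enumerate(shards):
    for bucket_id in sorted(buckets):
        print(rank, bucket_id, buckets[bucket_id])
```

In this sketch each training process would then load only `shards[rank]`, which loosely corresponds to the summary's per-process loading from the new text shard files.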