Model compression method and related device

Bibliographic details
Main authors: PAN WEIWEI, YANG-GONG YIFAN, CHUANG XIAOMING, ZHENG HANXUN
Format: Patent
Language: Chinese; English
Description
Summary: The embodiment of the invention provides a model compression method and a related device, and belongs to the technical field of artificial intelligence. The method comprises the following steps: performing model training using training data to obtain a first deep learning model, and obtaining a first importance score corresponding to a self-attention layer of the first deep learning model; pruning the self-attention layer of the first deep learning model according to the first importance score to obtain a second deep learning model; retraining the second deep learning model using the training data to obtain a third deep learning model, and obtaining a second importance score corresponding to a self-attention layer in the third deep learning model; performing precision quantization on the third deep learning model according to the second importance score to obtain a fourth deep learning model; and performing distillation processing on the fourth deep learning model by using the first deep learning model
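
The steps in the summary can be illustrated with a minimal Python sketch. The record does not say how importance scores are computed, how they drive pruning or bit-width selection, or which distillation loss is used, so everything below is an assumption made for illustration: the function names, the `keep_ratio` and `threshold` parameters, the 8-bit/4-bit choice, the symmetric uniform quantizer, and the temperature-scaled KL divergence are all hypothetical stand-ins, not the patented method.

```python
import math
from typing import Dict, List


def prune_attention_layers(scores: Dict[str, float], keep_ratio: float) -> List[str]:
    """Keep the highest-scoring self-attention layers (assumed pruning rule)."""
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, math.ceil(len(scores) * keep_ratio))
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return ranked[:n_keep]


def assign_bit_widths(scores: Dict[str, float], threshold: float = 0.5,
                      high_bits: int = 8, low_bits: int = 4) -> Dict[str, int]:
    """Give more bits to layers whose score reaches the threshold (assumed rule)."""
    return {name: high_bits if score >= threshold else low_bits
            for name, score in scores.items()}


def quantize(weights: List[float], bits: int) -> List[float]:
    """Symmetric uniform quantization followed by dequantization."""
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


def softmax(logits: List[float], temperature: float) -> List[float]:
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits: List[float], student_logits: List[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) on softened outputs, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


# Hypothetical scores and weights, chosen only to exercise the functions.
first_scores = {"attn0": 0.9, "attn1": 0.1, "attn2": 0.6, "attn3": 0.3}
kept = prune_attention_layers(first_scores, keep_ratio=0.5)   # second model

second_scores = {"attn0": 0.8, "attn2": 0.4}                  # after retraining
bits = assign_bit_widths(second_scores)                       # fourth model
quantized = {name: quantize([0.5, -1.0, 0.25], b) for name, b in bits.items()}

# The first (uncompressed) model acts as the teacher for the quantized model.
loss = distillation_loss([1.0, 2.0, 3.0], [0.9, 2.1, 2.8])
```

In this sketch the first model serves as the teacher and the quantized model as the student, mirroring the order of steps in the summary; a real implementation would compute the scores and logits from actual networks rather than from fixed dictionaries.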