Model compression method and related device
Main authors: | , , , |
---|---|
Format: | Patent |
Language: | Chinese ; English |
Summary: | An embodiment of the invention provides a model compression method and a related device, and belongs to the technical field of artificial intelligence. The method comprises the following steps: training a model on training data to obtain a first deep learning model, and obtaining a first importance score corresponding to a self-attention layer of the first deep learning model; pruning the self-attention layer of the first deep learning model according to the first importance score to obtain a second deep learning model; retraining the second deep learning model on the training data to obtain a third deep learning model, and obtaining a second importance score corresponding to a self-attention layer in the third deep learning model; performing precision quantization on the third deep learning model according to the second importance score to obtain a fourth deep learning model; and performing distillation processing on the fourth deep learning model by using the first deep learning model |
---|
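
The steps named in the summary (importance-guided pruning, importance-guided quantization, distillation from the first model) can be illustrated with a toy sketch. The record does not say how the importance scores are computed, which pruning criterion is used, how scores map to bit widths, or which distillation loss is applied, so everything below is an assumption made for illustration: the function names `prune_heads`, `assign_bit_widths`, `quantize` and `distillation_loss` are hypothetical, scores are treated as one number per attention head, pruning keeps the top-scoring heads, bit widths are chosen by a median split, and the distillation loss is a temperature-scaled KL divergence.

```python
import math


def prune_heads(scores, keep_ratio):
    """Return the indices of the heads to keep (assumed rule: top scores win)."""
    n_keep = max(1, round(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


def assign_bit_widths(scores, high_bits=8, low_bits=4):
    """Assumed rule: heads at or above the median score get the higher precision."""
    median = sorted(scores)[len(scores) // 2]
    return [high_bits if s >= median else low_bits for s in scores]


def quantize(weights, bits):
    """Symmetric uniform quantization of a list of floats to `bits` bits."""
    levels = 2 ** (bits - 1) - 1
    # Fall back to a scale of 1.0 when every weight is zero.
    scale = max(abs(w) for w in weights) / levels or 1.0
    return [round(w / scale) * scale for w in weights]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Toy walk-through with made-up numbers (not data from the patent).
first_scores = [0.9, 0.1, 0.5, 0.3]          # scores after the first training
kept = prune_heads(first_scores, keep_ratio=0.5)
second_scores = [0.8, 0.4]                   # scores after retraining the pruned model
bits = assign_bit_widths(second_scores)
head_weights = [[1.0, -0.5, 0.26], [0.3, 0.7, -0.9]]
quantized = [quantize(w, b) for w, b in zip(head_weights, bits)]
loss = distillation_loss([2.0, 0.5, -1.0], [1.5, 0.7, -0.8])
```

In this sketch the first (uncompressed) model plays the teacher and the quantized model plays the student; a real system would compute scores and logits from actual networks rather than from hand-written lists.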