TASK PROCESSING METHOD AND APPARATUS BASED ON MODEL QUANTIZATION, AND DEVICE AND STORAGE MEDIUM

Bibliographic details
Main authors: LI, Zheyang; ZHANG, Kai; LAN, Chaoxiang
Format: Patent
Language: chi; eng; fre
Description
Summary: The present disclosure provides a task processing method and apparatus based on model quantization, as well as a device and a storage medium. The task processing method comprises: updating a weight quantization coefficient and an activation quantization coefficient of an optimization unit in a transformer model according to a first difference between a first quantization output and a first floating-point output of the optimization unit; updating a weight quantization increment of the optimization unit according to a second difference between a second quantization output and a second floating-point output of the optimization unit; determining a weight quantization rounding direction for the optimization unit according to a target weight quantization increment, and quantizing a weight parameter of the optimization unit according to a target weight quantization coefficient and the weight quantization rounding direction; and performing forward inference calculation on inp
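The weight-side steps named in the summary can be illustrated with a minimal NumPy sketch. It is not the patented procedure: the summary does not say how the coefficients or increments are updated, so the sketch assumes a single linear layer as the "optimization unit", treats the quantization coefficient as a scale factor, uses a plain grid search as a stand-in for the coefficient update, and treats the increment as a per-weight value in [0, 1] whose comparison with 0.5 gives the rounding direction. All function and parameter names (`quantize_weights`, `output_difference`, `search_scale`, `increment`, `n_bits`) are hypothetical, and activation quantization is left out.

```python
import numpy as np


def quantize_weights(w, scale, increment, n_bits=8):
    """Quantize w using a per-weight rounding direction.

    Assumption: `increment` holds one value in [0, 1] per weight;
    values >= 0.5 mean "round up", values < 0.5 mean "round down".
    """
    qmin, qmax = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    round_up = (increment >= 0.5).astype(w.dtype)  # rounding direction
    q = np.clip(np.floor(w / scale) + round_up, qmin, qmax)
    return q * scale  # dequantized, so it can be compared with the float path


def output_difference(x, w_float, w_quant):
    """Mean squared difference between quantized and floating-point outputs."""
    return float(np.mean((x @ w_quant - x @ w_float) ** 2))


def search_scale(x, w, candidates, n_bits=8):
    """Pick the candidate scale with the smallest output difference.

    Grid search is only a stand-in for whatever update rule is actually used.
    The increment here is the fractional part of w / scale, which reproduces
    ordinary round-to-nearest.
    """
    def loss(s):
        frac = w / s - np.floor(w / s)
        return output_difference(x, w, quantize_weights(w, s, frac, n_bits))

    return min(candidates, key=loss)


# Usage with random calibration data (illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 4))   # calibration inputs
w = rng.normal(size=(4, 3))    # floating-point weights
base = np.abs(w).max() / 127
candidates = [base * f for f in (0.5, 0.75, 1.0)]
scale = search_scale(x, w, candidates)
```

With the scale fixed, a learned `increment` array could replace the fractional part passed to `quantize_weights`, so that each weight rounds in whichever direction yields the smaller output difference.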