Method and apparatus with neural network parameter quantization
Main authors: | , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: | A processor-implemented method includes determining a first quantization value by performing log quantization on a parameter from one of input activation values and weight values in a layer of a neural network; comparing a threshold value with an error between the parameter and a first dequantization value obtained by dequantization of the first quantization value; determining a second quantization value by performing log quantization on the error in response to the error being greater than the threshold value as a result of the comparing; and quantizing the parameter to a value in which the first quantization value and the second quantization value are grouped. |
---|
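
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made only for illustration: log quantization is taken to mean rounding the base-2 logarithm of the magnitude to the nearest integer exponent with a separate sign, the error is compared with the threshold by absolute value, a zero input is encoded as `None`, and the "grouped" value is modeled as a plain pair of terms. The names `LogTerm`, `log_quantize`, `dequantize`, `quantize_parameter`, and `dequantize_parameter` are hypothetical, and no bit-width limit on the exponent is modeled.

```python
import math
from typing import NamedTuple, Optional, Tuple


class LogTerm(NamedTuple):
    """One log-quantized term: sign * 2**exponent (hypothetical representation)."""
    sign: int      # +1 or -1
    exponent: int  # integer power-of-two exponent


def log_quantize(x: float) -> Optional[LogTerm]:
    """Round |x| to the nearest power of two in the log domain.

    Assumption: zero has no logarithm, so it is encoded as None.
    """
    if x == 0.0:
        return None
    return LogTerm(1 if x > 0 else -1, round(math.log2(abs(x))))


def dequantize(term: Optional[LogTerm]) -> float:
    """Map a log-quantized term back to a real value."""
    return 0.0 if term is None else term.sign * 2.0 ** term.exponent


def quantize_parameter(
    x: float, threshold: float
) -> Tuple[Optional[LogTerm], Optional[LogTerm]]:
    """Two-stage log quantization of one weight or activation value."""
    first = log_quantize(x)                 # first quantization value
    error = x - dequantize(first)           # parameter minus first dequantization value
    # Assumption: the comparison uses the magnitude of the error.
    second = log_quantize(error) if abs(error) > threshold else None
    return first, second                    # the "grouped" value, modeled as a pair


def dequantize_parameter(
    group: Tuple[Optional[LogTerm], Optional[LogTerm]]
) -> float:
    """Reconstruct the parameter as the sum of both dequantized terms."""
    first, second = group
    return dequantize(first) + dequantize(second)
```

Under these assumptions, a parameter of 6.0 with a threshold of 0.5 is first quantized to 2^3 = 8; the error of -2 exceeds the threshold in magnitude, so a second term of -2^1 is added and the pair reconstructs 6.0 exactly. With a larger threshold the second term is skipped and only the single power-of-two approximation is kept.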