NEURAL NETWORK OPERATION APPARATUS AND QUANTIZATION METHOD
A neural network operation apparatus and method implementing quantization is disclosed. The neural network operation method may include receiving a weight of a neural network, a candidate set of quantization points, and a bitwidth for representing the weight, extracting a subset of quantization poin...
Gespeichert in:
Hauptverfasser: | , , |
---|---|
Format: | Patent |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | A neural network operation apparatus and method implementing quantization is disclosed. The neural network operation method may include receiving a weight of a neural network, a candidate set of quantization points, and a bitwidth for representing the weight, extracting a subset of quantization points from the candidate set of quantization points based on the bitwidth, calculating a quantization loss based on the weight of the neural network and the subset of quantization points, and generating a target subset of quantization points based on the quantization loss. |
---|