EFFICIENT QUANTIZATION FOR NEURAL NETWORK DEPLOYMENT AND EXECUTION

Implementations disclosed describe methods and systems to perform the methods of deploying and executing machine learning models on target-specific computational platforms. Optimization techniques include but are not limited to alignment of kernel operations with hardware instructions of a target pr...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Ramanna, Vikram Kumar, Pandey, Ashutosh, Li, Kaiping
Format:	Patent
Sprache:	eng
Schlagworte:	CALCULATING COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS COMPUTING COUNTING HANDLING RECORD CARRIERS PHYSICS PRESENTATION OF DATA RECOGNITION OF DATA RECORD CARRIERS
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Implementations disclosed describe methods and systems to perform the methods of deploying and executing machine learning models on target-specific computational platforms. Optimization techniques include but are not limited to alignment of kernel operations with hardware instructions of a target processing device, reduction of kernel dimensions near boundaries of data, efficient reuse of a small number of memory components during neural network operations, run-time quantization of data and neural network parameters, and other methods.