SCALING HALF-PRECISION FLOATING POINT TENSORS FOR TRAINING DEEP NEURAL NETWORKS

One embodiment provides for a machine-learning accelerator device a multiprocessor to execute parallel threads of an instruction stream, the multiprocessor including a compute unit, the compute unit including a set of functional units, each functional unit to execute at least one of the parallel thr...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	DAS, DIPANKAR, MELLEMPUDI, NAVEEN
Format:	Patent
Sprache:	eng
Schlagworte:	CALCULATING COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS COMPUTING COUNTING ELECTRIC DIGITAL DATA PROCESSING IMAGE DATA PROCESSING OR GENERATION, IN GENERAL PHYSICS
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	One embodiment provides for a machine-learning accelerator device a multiprocessor to execute parallel threads of an instruction stream, the multiprocessor including a compute unit, the compute unit including a set of functional units, each functional unit to execute at least one of the parallel threads of the instruction stream. The compute unit includes compute logic configured to execute a single instruction to scale an input tensor associated with a layer of a neural network according to a scale factor, the input tensor stored in a floating-point data type, the compute logic to scale the input tensor to enable a data distribution of data of the input tensor to be represented by a 16-bit floating point data type.