DEEP LEARNING THREAD COMMUNICATION

Main authors: FAHS, Brian; LIGHTSTONE, Michael; HAGOG, Mostafa
Format: Patent
Language: eng; fre; ger
Description
Summary: An application binary interface (ABI) can be exposed in a processor to enable blocks of threads, which may correspond to separately compiled operators, to communicate without storing data to global memory external to the processor. The ABI can define how the results of one computation, corresponding to a first thread block, will be organized in registers and shared memory of the processor at the end of one operator (i.e., kernel). The start of the next operator (i.e., kernel), corresponding to a second thread block, can consume the results from the registers and shared memory. Data can be stored to processor-local storage for individual threads as they exit the block. Once the ABI is published, libraries can be separately compiled, optimized, and tested as long as they adhere to it.
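
The hand-off described in the summary can be illustrated with a small host-side toy model. This is only a sketch in plain Python, not GPU code and not the patent's implementation. Every name in it (`Slot`, `ABI_V1`, `LocalStorage`, `producer`, `consumer`, `run_pipeline`) and the layout itself are hypothetical, chosen for illustration. It shows two operators that share nothing except a published layout table: the first leaves its results in simulated registers and shared memory, and the second reads them from there without going through global memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """Hypothetical ABI entry: where a named result lives on chip."""
    space: str   # "reg" = per-thread registers, "smem" = block-shared memory
    offset: int  # index of the first element within that space
    length: int  # number of elements


# The "published" contract (an assumption made for this sketch). Both
# operators are written against this table, never against each other.
ABI_V1 = {
    "partial": Slot("reg", 0, 1),  # one value per thread, in register 0
    "tile": Slot("smem", 0, 4),    # four values shared by the whole block
}


class LocalStorage:
    """Toy stand-in for on-chip state that survives between two operators."""

    def __init__(self, num_threads, regs_per_thread, smem_size):
        self.regs = [[0.0] * regs_per_thread for _ in range(num_threads)]
        self.smem = [0.0] * smem_size


def producer(storage, inputs, abi=ABI_V1):
    """First operator: compute results and leave them where the ABI says."""
    reg, tile = abi["partial"], abi["tile"]
    # Each simulated thread stores its own result in its own register.
    for tid, x in enumerate(inputs):
        storage.regs[tid][reg.offset] = 2.0 * x
    # The block also publishes a small tile in shared memory.
    for i in range(tile.length):
        storage.smem[tile.offset + i] = inputs[i] + 1.0


def consumer(storage, abi=ABI_V1):
    """Second operator: read the results from the ABI-defined locations."""
    reg, tile = abi["partial"], abi["tile"]
    per_thread = sum(thread_regs[reg.offset] for thread_regs in storage.regs)
    shared = sum(storage.smem[tile.offset:tile.offset + tile.length])
    return per_thread + shared


def run_pipeline(inputs):
    """Run both operators back to back over the same local storage."""
    # The toy layout needs at least four inputs to fill the shared tile.
    storage = LocalStorage(num_threads=len(inputs), regs_per_thread=2, smem_size=8)
    producer(storage, inputs)
    return consumer(storage)


if __name__ == "__main__":
    print(run_pipeline([1.0, 2.0, 3.0, 4.0]))
```

In this model, `producer` and `consumer` could live in separate libraries and be tested independently, since each depends only on `ABI_V1`. That is the property the summary attributes to a published ABI.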