SYSTEMS AND METHODS FOR PERFORMING 16-BIT FLOATING-POINT VECTOR DOT PRODUCT INSTRUCTIONS

Bibliographic details
Main authors:	ADELMAN, MENA; HEINECKE, ALEXANDER F; SPERBER, ZEEV; RUBANOVICH, SIMON; CHARNEY, MARK J; VALENTINE, ROBERT; GRADSTEIN, AMIT; SADE, RAANAN
Format:	Patent
Language:	eng ; pol
Subjects:	CALCULATING; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS

Description
Summary:	Disclosed embodiments relate to systems and methods for performing a floating-point dot product instruction. In one example, a processor includes fetch circuitry to fetch a single instruction having fields to specify an opcode, a writemask, and locations of first source, second source, and destination vectors; decode circuitry to decode the fetched instruction; and execution circuitry to execute the instruction as per the opcode. The writemask is to control whether to mask the destination vector, with masked elements of the destination vector being either zeroed or merged. For elements that are not masked, the opcode directs the execution circuitry to generate products of N pairs of 16-bit floating-point elements of the first and second source vectors, and to accumulate each product with the previous contents of a corresponding single-precision element of the destination vector to produce a corresponding result element. In generating the products, the execution circuitry is to convert each 16-bit floating-point element in each pair to a single-precision element by packing the 16 bits of the 16-bit floating-point element into the upper 16 bits of the single-precision element and zeroing the lower 16 bits. The format of the 16-bit floating-point elements is bfloat16.
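The behaviour described in the summary can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuitry: the function names, the default N = 2, the layout in which pair j of destination element i sits at source index N*i + j, and the rounding to single precision after every accumulation step are all hypothetical choices that the abstract does not specify.

```python
import struct

def bf16_to_f32(bits16):
    """Widen a bfloat16 bit pattern to a float32 value: the 16 bits become
    the upper half of the 32-bit word and the lower 16 bits are zero."""
    word = (bits16 & 0xFFFF) << 16
    return struct.unpack("<f", struct.pack("<I", word))[0]

def round_to_f32(x):
    """Round a Python float (double) to the nearest float32 value.
    Raises OverflowError if x is outside the float32 range."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def dot_bf16_accumulate(dst, src1, src2, mask, n=2, zeroing=False):
    """Model of a masked bfloat16 dot-product-accumulate.

    dst     : list of float32 values (previous destination contents)
    src1/2  : lists of bfloat16 bit patterns, n per destination element
              (ASSUMED layout: pair j of element i is at index n*i + j)
    mask    : integer writemask, bit i set means element i is computed
    zeroing : masked elements become 0.0 if True, otherwise keep dst[i]
    """
    result = []
    for i, acc in enumerate(dst):
        if not (mask >> i) & 1:
            # Masked element: either zeroed or merged (left unchanged).
            result.append(0.0 if zeroing else acc)
            continue
        for j in range(n):
            a = bf16_to_f32(src1[n * i + j])
            b = bf16_to_f32(src2[n * i + j])
            # ASSUMPTION: round to single precision after each step.
            acc = round_to_f32(acc + a * b)
        result.append(acc)
    return result

# bfloat16 patterns: 0x3F80 = 1.0, 0x4000 = 2.0, 0x4040 = 3.0, 0x3F00 = 0.5
src1 = [0x3F80, 0x4000, 0x4040, 0x3F80]
src2 = [0x4000, 0x4000, 0x3F00, 0x3F80]
print(dot_bf16_accumulate([1.0, 10.0], src1, src2, mask=0b01))
# -> [7.0, 10.0]  (element 0: 1 + 1*2 + 2*2; element 1 masked and merged)
```

Because the widening step only appends 16 zero bits, it needs no rounding or exponent adjustment, which is presumably why the conversion is described as a simple packing operation.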