Automatic Tuning Technique Exploring Within the Hardware-Specific Constrained Parameters

This paper covers an efficient strategy for exploring the sampling parameters on auto-tuning processes. Byte/flop is considered as a performance indicator, and finding the best parameter is interpreted as an optimisation problem with some hardware-specific constrained conditions. In this work, we al...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Imamura, Toshiyuki, Naono, Ken
Format: Tagungsbericht
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:This paper covers an efficient strategy for exploring the sampling parameters on auto-tuning processes. Byte/flop is considered as a performance indicator, and finding the best parameter is interpreted as an optimisation problem with some hardware-specific constrained conditions. In this work, we also evaluate the performance of various unrolled loops both in a rank-update operation and a matrix-vector multiplication which appear in a significant operation of an eigensolver. The tuned routines running on a single processor of a Hitachi SR8000 and a Fujitsu VPP5000 record 1080 MFLOPS and 8342 MFLOPS respectively.
ISSN:0302-9743
1611-3349
DOI:10.1007/11666806_47