Array optimizations for parallel implementations of high productivity languages

Bibliographic details
Main authors: Joyner, M., Budimlic, Z., Sarkar, V., Zhang, Rui
Format: Conference proceedings
Language: English
Description
Summary: This paper presents an interprocedural rank analysis algorithm to automatically infer ranks of arrays in X10, a language that supports rank-independent specification of loop and array computations using regions and points. We use the rank analysis information to enable storage transformations on arrays. We evaluate a transformation that converts high-level multidimensional X10 arrays into lower-level multidimensional Java arrays, when legal to do so. Preliminary performance results for a set of parallel computational benchmarks on a 64-way AIX Power5+ SMP machine show that our optimizations deliver performance that rivals that of lower-level, hand-tuned code with explicit loops and array accesses, and that is up to two orders of magnitude faster than unoptimized, high-level X10 programs. The results also show that our optimizations improve the scalability of X10 programs: the relative performance improvement over the unoptimized versions increases as parallelism scales from 1 CPU to 64 CPUs.
ISSN: 1530-2075
DOI: 10.1109/IPDPS.2008.4536185