Development of parallel methods for a 1024-processor hypercube

Bibliographic details
Published in:	SIAM J. Sci. Stat. Comput.; (United States) 1988-07, Vol.9 (4), p.609-638
Main authors:	GUSTAFSON, J. L, MONTRY, G. R, BENNER, R. E
Format:	Article
Language:	English
Subjects:	ALGORITHMS; Applied sciences; Approximation; ARRAY PROCESSORS; COMMUNICATIONS; COMPARATIVE EVALUATIONS; Computer science; control theory; systems; Computer systems performance. Reliability; DATA TRANSMISSION; EFFICIENCY; Exact sciences and technology; Fluid dynamics; GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; Laboratories; Load; MATHEMATICAL LOGIC; MATHEMATICS; Mechanics; NUMERICAL SOLUTION; PARALLEL PROCESSING; PROGRAMMING; 990210 > Supercomputers-- (1987-1989); Software; TOPOLOGY

Description
Summary:	We have developed highly efficient parallel solutions for three practical, full-scale scientific problems: wave mechanics, fluid dynamics, and structural analysis. Several algorithmic techniques are used to keep communication and serial overhead small as both problem size and number of processors are varied. A new parameter, operation efficiency, is introduced that quantifies the tradeoff between communication and redundant computation. A 1024-processor MIMD ensemble is measured to be 502 to 637 times as fast as a single processor when problem size for the ensemble is fixed, and 1009 to 1020 times as fast as a single processor when problem size per processor is fixed. The latter measure, denoted scaled speedup, is developed and contrasted with the traditional measure of parallel speedup. The scaled-problem paradigm better reveals the capabilities of large ensembles, and permits detection of subtle hardware-induced load imbalances (such as error correction and data-dependent MFLOPS rates) that may become increasingly important as parallel processors increase in node count. Sustained performance for the applications is 70 to 130 MFLOPS, validating the massively parallel ensemble approach as a practical alternative to more conventional processing methods. The techniques presented appear extensible to even higher levels of parallelism than the 1024-processor level explored here.
ISSN:	0196-5204 1064-8275 2168-3417 1095-7197
DOI:	10.1137/0909041
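
The summary contrasts traditional (fixed-size) speedup with scaled speedup. As a rough illustration only, the sketch below uses the textbook fixed-size (Amdahl) and scaled (Gustafson) speedup models to back out the serial fraction implied by the speedups quoted in the summary. The function names and the single-parameter serial-fraction model are assumptions made for this example; the record does not state that the article uses exactly these formulas, and real measurements also include communication and load-imbalance effects that this model ignores.

```python
# Illustrative sketch: textbook speedup models, not taken from the article.
# All function names here are hypothetical.

def fixed_size_speedup(serial_fraction: float, n: int) -> float:
    """Amdahl-style speedup: total problem size is held fixed."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)


def scaled_speedup(serial_fraction: float, n: int) -> float:
    """Gustafson-style speedup: problem size per processor is held fixed.

    serial_fraction is measured on the parallel system's run time.
    """
    return n - serial_fraction * (n - 1)


def implied_serial_fraction_fixed(speedup: float, n: int) -> float:
    """Invert the fixed-size model for the serial fraction."""
    return (n / speedup - 1.0) / (n - 1)


def implied_serial_fraction_scaled(speedup: float, n: int) -> float:
    """Invert the scaled model for the serial fraction."""
    return (n - speedup) / (n - 1)


if __name__ == "__main__":
    n = 1024  # processor count from the summary
    for s in (502, 637):  # fixed-size speedups quoted in the summary
        f = implied_serial_fraction_fixed(s, n)
        print(f"fixed-size speedup {s}: efficiency {s / n:.3f}, "
              f"implied serial fraction {f:.6f}")
    for s in (1009, 1020):  # scaled speedups quoted in the summary
        f = implied_serial_fraction_scaled(s, n)
        print(f"scaled speedup {s}: efficiency {s / n:.3f}, "
              f"implied serial fraction {f:.6f}")
```

Under this simplified model, the two serial fractions are not directly comparable: the fixed-size fraction is relative to the single-processor run time, while the scaled fraction is relative to the run time on the full ensemble.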