Parallel variable-band Choleski solvers for computational structural analysis applications on vector multiprocessor supercomputers

Bibliographic details
Published in: Computing Systems in Engineering 1991, Vol. 2 (2), pp. 183-196
Main authors: Poole, E.L., Overman, A.L.
Format: Article
Language: English
Description
Summary: A Choleski method used to solve linear systems of equations that arise in large-scale structural analyses is described. The method uses a novel variable-band storage scheme and is structured to exploit fast local memory caches while minimizing data access delays between main memory and vector registers. Several parallel implementations of this method are described for the CRAY-2 and CRAY Y-MP computers, demonstrating the use of microtasking and autotasking directives. A portable parallel language, FORCE, is also used for two different parallel implementations, demonstrating the use of CRAY macrotasking. Results are presented comparing the matrix factorization times for three representative structural analysis problems from runs made in both dedicated and multi-user modes on both the CRAY-2 and CRAY Y-MP computers. CPU and wall clock timings are given for the various parallel methods and are compared to single-processor timings of the same algorithm. Computation rates over 1 GIGAFLOP (1 billion floating point operations per second) on a four-processor CRAY-2 and over 2 GIGAFLOPS on an eight-processor CRAY Y-MP are demonstrated, as measured by wall clock time in a dedicated environment. Reduced wall clock times for the parallel methods relative to the single-processor implementation of the same Choleski algorithm are also demonstrated for runs made in multi-user mode.
ISSN: 0956-0521, 1873-6211
DOI: 10.1016/0956-0521(91)90019-2
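
Illustration: The summary mentions a variable-band storage scheme for Choleski factorization. This record does not give the details of the authors' scheme, so the following is only a minimal serial sketch of the general idea, assuming a row-wise envelope layout in which each row of the lower triangle is stored from its first nonzero column to the diagonal. The function names `to_variable_band` and `cholesky_variable_band` are hypothetical, and nothing here reproduces the vectorized or parallel CRAY implementations described in the article.

```python
import math


def to_variable_band(a):
    """Pack the lower triangle of a dense symmetric matrix row by row.

    Row i is kept only from its first nonzero column first[i] to the
    diagonal, so entry (i, j) lives at rows[i][j - first[i]].
    (Assumed layout for illustration, not the article's scheme.)
    """
    first, rows = [], []
    for i, row in enumerate(a):
        f = next((j for j in range(i + 1) if row[j] != 0.0), i)
        first.append(f)
        rows.append([float(x) for x in row[f:i + 1]])
    return first, rows


def cholesky_variable_band(first, rows):
    """Return the factor L (A = L * L^T) in the same packed layout.

    Fill-in during factorization stays inside each row's envelope,
    so the factor needs no more storage than the packed matrix.
    """
    n = len(rows)
    fac = [list(r) for r in rows]
    for i in range(n):
        fi = first[i]
        for j in range(fi, i + 1):
            fj = first[j]
            # Only columns stored in both row i and row j contribute.
            k0 = max(fi, fj)
            s = fac[i][j - fi]
            for k in range(k0, j):
                s -= fac[i][k - fi] * fac[j][k - fj]
            if j == i:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                fac[i][j - fi] = math.sqrt(s)
            else:
                fac[i][j - fi] = s / fac[j][j - fj]
    return fac


# Small symmetric positive definite example with rows of differing width.
a = [
    [4.0, 2.0, 0.0, 2.0],
    [2.0, 5.0, 2.0, 1.0],
    [0.0, 2.0, 5.0, 2.0],
    [2.0, 1.0, 2.0, 6.0],
]
first, rows = to_variable_band(a)
fac = cholesky_variable_band(first, rows)
```

In this example row 2 starts at column 1 while row 3 starts at column 0, so the stored row lengths differ, which is the point of a variable band compared with a fixed bandwidth. The inner loop over `k` is a dot product of two contiguous row segments, the kind of operation that vector hardware handles well, though how the article organizes its loops is not stated in this record.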