Parallel nested dissection

Nested dissection is a very popular direct method for solving sparse linear systems that arise from finite difference and finite element methods. Worley and Schreiber [16] give a fine grain algorithm for a square array of processors. Their algorithm uses O( N 2) processors, each with O( N) memory, t...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Parallel computing 1990-12, Vol.16 (2), p.139-156
1. Verfasser: Conroy, John M
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Nested dissection is a very popular direct method for solving sparse linear systems that arise from finite difference and finite element methods. Worley and Schreiber [16] give a fine grain algorithm for a square array of processors. Their algorithm uses O( N 2) processors, each with O( N) memory, to factor an N 2 by N 2 sparse matrix whose graphs is an N × N mesh. The efficiency of their method is between 1 46 and 1 12 . George et al. [6] [8] give a medium grain algorithm for hypercube architecture, while George et al. [7] give an algorithm for shared memory machines. These papers present a column oriented approach which can exploit O( N) parallelism and yield efficiencies up to 50%. Lucas [11] also gives a column oriented scheme which achieves up to 75% efficiency and O( N) parallelism. In this paper, we present a medium to fine grain algorithm for a P × P array of processors with local memory. This algorithm can exploit up to O( N 2) parallelism. The efficiency of the fine grain version is comparable to [16] while as a medium grain algorithm achieves about 49% efficiency. The strength of the method is due to three factors: its ability to pipeline much of the computation, overlapping computation and communication, and the use of level 3 BLAS like primitives. In addition to its high efficiency its memory requirement is optimal, only O(N 2 log N P 2 ) words memory is needed per processor.
ISSN:0167-8191
1872-7336
DOI:10.1016/0167-8191(90)90054-D