Improved GROMACS Scaling on Ethernet Switched Clusters

Bibliographic Details
Main authors: Kutzner, Carsten; van der Spoel, David; Fechner, Martin; Lindahl, Erik; Schmitt, Udo W.; de Groot, Bert L.; Grubmüller, Helmut
Format: Book chapter
Language: English
Subjects:
Online access: Full text
Description
Abstract: We investigated the prerequisites for decent scaling of the GROMACS 3.3 molecular dynamics (MD) code [1] on Ethernet Beowulf clusters. The code uses the MPI standard for communication between the processors and scales well on shared-memory supercomputers like the IBM p690 (Regatta) and on Linux clusters with a high-bandwidth/low-latency network. On Ethernet switched clusters, however, the scaling typically breaks down as soon as more than two computational nodes are involved. For an 80k-atom MD test system, exemplary speedups Sp_N on N CPUs are Sp_8 = 6.2 and Sp_16 = 10 on a Myrinet dual-CPU 3 GHz Xeon cluster, Sp_16 = 11 on an Infiniband dual-CPU 2.2 GHz Opteron cluster, and Sp_32 = 21 on one Regatta node. However, the maximum speedup we could initially reach on our Gbit Ethernet 2 GHz Opteron cluster was Sp_4 = 3 using two dual-CPU nodes. Employing more CPUs only led to slower execution (Table 1).
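To put the speedup figures quoted in the abstract on a common footing, one can compute the parallel efficiency E = Sp_N / N for each system. A minimal sketch, using only the values stated above (the dictionary keys are descriptive labels, not identifiers from the chapter):

```python
# Parallel efficiency E = Sp_N / N for the speedups quoted in the abstract.
# Each entry maps a system label to (speedup Sp_N, number of CPUs N).
speedups = {
    "Myrinet Xeon, N=8": (6.2, 8),
    "Myrinet Xeon, N=16": (10, 16),
    "Infiniband Opteron, N=16": (11, 16),
    "IBM Regatta, N=32": (21, 32),
    "Gbit Ethernet Opteron, N=4": (3, 4),
}

for system, (sp, n) in speedups.items():
    print(f"{system}: efficiency = {sp / n:.0%}")
```

The last line makes the abstract's point concrete: the Ethernet cluster already sits at 75% efficiency on only 4 CPUs, while the high-bandwidth/low-latency networks sustain comparable or better efficiency at 16 to 32 CPUs.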
ISSN: 0302-9743 (print); 1611-3349 (online)
DOI: 10.1007/11846802_57