A Massively Parallel Implementation of the CCSD(T) Method Using the Resolution-of-the-Identity Approximation and a Hybrid Distributed/Shared Memory Parallelization Model

Bibliographic Details
Published in:	Journal of chemical theory and computation 2021-08, Vol.17 (8), p.4799-4822
Main authors:	Datta, Dipayan; Gordon, Mark S
Format:	Article
Language:	eng
Subjects:	Algorithms; Amplitudes; Approximation; Basis functions; chemical calculations; Chemistry; Chemistry, Physical; circuits; cluster chemistry; Computing costs; Dimers; Distributed memory; Eris (dwarf planet); INORGANIC, ORGANIC, PHYSICAL, AND ANALYTICAL CHEMISTRY; Integrals; Mathematical analysis; Mathematical models; Parallel processing; Physical Sciences; Physics; Physics, Atomic, Molecular & Chemical; Quantum Electronic Structure; Scaling; Science & Technology; Uracil
Online access:	Full text

Description
Summary:	A parallel algorithm is described for the coupled-cluster singles and doubles method augmented with a perturbative correction for triple excitations [CCSD(T)] using the resolution-of-the-identity (RI) approximation for two-electron repulsion integrals (ERIs). The algorithm bypasses the storage of four-center ERIs by adopting an integral-direct strategy. The CCSD amplitude equations are given in a compact quasi-linear form by factorizing them in terms of amplitude-dressed three-center intermediates. A hybrid MPI/OpenMP parallelization scheme is employed, which uses the OpenMP-based shared memory model for intranode parallelization and the MPI-based distributed memory model for internode parallelization. Parallel efficiency has been optimized for all terms in the CCSD amplitude equations. Two different algorithms have been implemented for the rate-limiting terms in the CCSD amplitude equations that entail $O(N_O^2 N_V^4)$- and $O(N_O^3 N_V^3)$-scaling computational costs, where $N_O$ and $N_V$ denote the numbers of correlated occupied and virtual orbitals, respectively. One of the algorithms assembles the four-center ERIs, which require $N_V^4$- and $N_O^2 N_V^2$-scaling memory, in a distributed manner across a number of MPI ranks, while the other algorithm completely bypasses the assembly of quartic memory-scaling ERIs and thus greatly reduces the memory demand. It is demonstrated that the former memory-expensive algorithm is faster on a few hundred cores, while the latter memory-economic algorithm shows better strong scaling in the limit of a few thousand cores. The program is shown to exhibit near-linear scaling, in particular for the compute-intensive triples correction step, on up to 8000 cores. The performance of the program is demonstrated via calculations involving molecules with 24–51 atoms and up to 1624 atomic basis functions.
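The trade-off between assembling four-center ERIs and avoiding their storage can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the array layout `B_vv[Q, a, c]` for the three-center RI factors, and the slab-wise batching are assumptions made here for illustration, and the record does not say how the published code organizes this step. The sketch uses the standard RI factorization $(ac|bd) \approx \sum_Q B^Q_{ac} B^Q_{bd}$ and the particle-particle ladder contraction $\sum_{cd} (ac|bd)\, t_{ij}^{cd}$, which is the kind of term that scales as $O(N_O^2 N_V^4)$.

```python
import numpy as np


def ladder_assembled(B_vv, t2):
    """Memory-expensive route: build every (ac|bd) first, then contract.

    B_vv : (n_aux, n_virt, n_virt) three-center RI factors (hypothetical layout)
    t2   : (n_occ, n_occ, n_virt, n_virt) doubles amplitudes t_ij^cd
    """
    # The full virtual-virtual ERI block needs n_virt**4 stored elements.
    eri_vvvv = np.einsum("Qac,Qbd->acbd", B_vv, B_vv, optimize=True)
    return np.einsum("acbd,ijcd->ijab", eri_vvvv, t2, optimize=True)


def ladder_batched(B_vv, t2, batch=2):
    """Memory-economic route: form only a slab of integrals per batch of index a.

    Peak integral storage is batch * n_virt**3 instead of n_virt**4.
    """
    n_virt = t2.shape[2]
    out = np.zeros_like(t2)
    for a0 in range(0, n_virt, batch):
        a1 = min(a0 + batch, n_virt)
        # Slab of (ac|bd) for a in [a0, a1), rebuilt on the fly from RI factors.
        slab = np.einsum("Qac,Qbd->acbd", B_vv[:, a0:a1, :], B_vv, optimize=True)
        out[:, :, a0:a1, :] = np.einsum("acbd,ijcd->ijab", slab, t2, optimize=True)
    return out


def quartic_memory_gib(n_virt):
    """Size in GiB of a dense n_virt**4 array of 8-byte floats."""
    return n_virt**4 * 8 / 2**30


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B = rng.standard_normal((6, 5, 5))      # synthetic RI factors
    t2 = rng.standard_normal((3, 3, 5, 5))  # synthetic amplitudes
    print(np.allclose(ladder_assembled(B, t2), ladder_batched(B, t2)))
    print(quartic_memory_gib(1024))
```

Both routes give the same contraction; they differ only in how much of the integral tensor exists in memory at once. In a distributed setting, the batches of `a` would be the natural unit to spread over MPI ranks, with the `einsum` calls threaded within a node, although that mapping is again an assumption of this sketch.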
As the first application, the complete basis set (CBS) limit for the interaction energy of the π-stacked uracil dimer from the S66 data set has been investigated. This work reports the first calculation of the interaction energy at the CCSD(T)/aug-cc-pVQZ level without local orbital approximation. The CBS limit for the CCSD correlation contribution to the interaction energy was found to be −8.01 kcal/mol, which agrees very well with the value −7.99 kcal/mol reported by Schmitz, Hättig, and Tew [ Phys. Chem. Chem. Phys. 2014, 16, 22167−22178 ]. The CBS limit for the total interaction energy was estimated to be −9.64 kcal/mol.
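The summary does not state which extrapolation formula was used to reach the CBS limit. As a hedged illustration only, the sketch below implements the widely used two-point inverse-cubic extrapolation of correlation energies, which assumes $E_X = E_\mathrm{CBS} + A X^{-3}$ for basis-set cardinal number $X$. The function name and the numerical inputs are hypothetical; the inputs are synthetic values and are not taken from the paper.

```python
def cbs_two_point(e_small, e_large, x_small, x_large):
    """Two-point X**-3 extrapolation of a correlation-energy contribution.

    Assumes E_X = E_CBS + A / X**3, where X is the cardinal number of the
    basis set (for example 3 for a triple-zeta and 4 for a quadruple-zeta set).
    """
    if x_small >= x_large:
        raise ValueError("x_large must exceed x_small")
    xs3, xl3 = x_small**3, x_large**3
    return (xl3 * e_large - xs3 * e_small) / (xl3 - xs3)


if __name__ == "__main__":
    # Synthetic model data: E_CBS = -1.0 and A = 0.5 (arbitrary units).
    e_tz = -1.0 + 0.5 / 3**3
    e_qz = -1.0 + 0.5 / 4**3
    print(cbs_two_point(e_tz, e_qz, 3, 4))
```

For data that follow the assumed $X^{-3}$ form exactly, the function recovers the limiting value; for real calculations the result is only an estimate, and the Hartree-Fock contribution is usually treated separately.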
ISSN:	1549-9618, 1549-9626
DOI:	10.1021/acs.jctc.1c00389