Representation of compounds for machine-learning prediction of physical properties

Published in: Physical review. B 2017-04, Vol.95 (14), p.144110, Article 144110
Main authors: Seko, Atsuto; Hayashi, Hiroyuki; Nakayama, Keita; Takahashi, Akira; Tanaka, Isao
Format: Article
Language: English
Description
Abstract: The representations of a compound, called “descriptors” or “features”, play an essential role in constructing a machine-learning model of its physical properties. In this study, we adopt a procedure for generating a set of descriptors from simple elemental and structural representations. First, it is applied to a large data set composed of the cohesive energy for about 18 000 compounds computed by density functional theory calculation. As a result, we obtain a kernel ridge prediction model with a prediction error of 0.041 eV/atom, which is close to the “chemical accuracy” of 1 kcal/mol (0.043 eV/atom). A prediction model with an error of 0.071 eV/atom of the cohesive energy is obtained for the normalized prototype structures, which can be used for the practical purpose of searching for as-yet-unknown structures. The procedure is also applied to two smaller data sets, i.e., a data set of the lattice thermal conductivity for 110 compounds computed by density functional theory calculation and a data set of the experimental melting temperature for 248 compounds. We examine the effect of the descriptor sets on the efficiency of Bayesian optimization in addition to the accuracy of the kernel ridge regression models. They exhibit good predictive performances.
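The abstract equates the "chemical accuracy" of 1 kcal/mol with 0.043 eV/atom. As a quick check of that arithmetic, the following minimal sketch converts kcal/mol to eV per particle and compares the result with the two prediction errors quoted in the abstract. It assumes the thermochemical calorie (4184 J/kcal) and the exact SI values of the Avogadro constant and the elementary charge; the function and variable names are illustrative and do not come from the paper.

```python
# Unit check for the "chemical accuracy" figure quoted in the abstract.
# Assumptions: thermochemical calorie; exact SI constants (2019 definitions).
AVOGADRO = 6.02214076e23             # particles per mole
JOULE_PER_EV = 1.602176634e-19       # joules in one electronvolt
JOULE_PER_KCAL = 4184.0              # joules in one thermochemical kilocalorie


def kcal_per_mol_to_ev(value_kcal_per_mol):
    """Convert an energy in kcal/mol to eV per particle."""
    joule_per_particle = value_kcal_per_mol * JOULE_PER_KCAL / AVOGADRO
    return joule_per_particle / JOULE_PER_EV


chemical_accuracy_ev = kcal_per_mol_to_ev(1.0)

# Prediction errors as reported in the abstract (eV/atom).
reported_errors = {
    "cohesive energy, full descriptors": 0.041,
    "cohesive energy, normalized prototypes": 0.071,
}

print(f"1 kcal/mol = {chemical_accuracy_ev:.4f} eV")
for label, error in reported_errors.items():
    ratio = error / chemical_accuracy_ev
    print(f"{label}: {error} eV/atom ({ratio:.2f} x chemical accuracy)")
```

Under these assumptions the conversion rounds to 0.043 eV, which agrees with the value stated in the abstract; the 0.041 eV/atom error falls just below that threshold and the 0.071 eV/atom error lies above it.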
ISSN: 2469-9950, 2469-9969
DOI: 10.1103/PhysRevB.95.144110