An inorganic ABX3 perovskite materials dataset for target property prediction and classification using machine learning
Main authors: | , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Summary: | The reliability of machine learning (ML) techniques in novel materials
discovery often depends on the quality of the dataset, in addition to the
relevant features used to describe the material. In this regard, the current
study presents and validates a newly processed materials dataset that can be
used for benchmark ML analysis, as it relates to the prediction and
classification of deterministic target properties. The dataset was originally
extracted from the Open Quantum Materials Database (OQMD) and contains
16,323 samples of ABX3 inorganic perovskite structures. The dataset is tabular
in form and is preprocessed to include sixty-one generalized input features
that broadly describe the physicochemical, stability/geometrical, and Density
Functional Theory (DFT) target properties associated with the elemental ionic
sites in a three-dimensional ABX3 polyhedron. For validation, four different ML
models are employed to predict three distinct target properties, namely
formation energy, energy band gap, and crystal system. In the experiments, the
best results are 0.013 eV/atom MAE, 0.216 eV MAE, and 85% F1 for
formation energy prediction, band gap prediction, and crystal system
multi-class classification, respectively. Moreover, the
results are compared with previous literature, and the comparison affirms the
usefulness of the current dataset for future benchmark materials analysis
via ML techniques. The preprocessed dataset and source code are openly
available to download from github.com/chenebuah/ML_abx3_dataset. |
---|---|
DOI: | 10.48550/arxiv.2312.11335 |