Neural network training method for materials science based on multi-source databases

Bibliographic Details
Published in: Scientific Reports, 2022-09, Vol. 12 (1), p. 15326, Article 15326
Authors: Guo, Jialong; Chen, Ziyi; Liu, Zhiwei; Li, Xianwei; Xie, Zhiyuan; Wang, Zongguo; Wang, Yangang
Format: Article
Language: English
Online access: Full text
Description
Abstract: The fourth paradigm of science has achieved great success in materials discovery, and it highlights the sharing and interoperability of data. However, most material data are scattered among various research institutions, and transmitting such large volumes of data consumes significant bandwidth and time. Meanwhile, some data owners prefer to protect their data and retain their initiative in any collaboration. This dilemma gradually leads to the "data island" problem, especially in materials science. To address the problem and make full use of the available material data, we propose a new strategy for neural network training based on multi-source databases. Throughout the training process, only model parameters are exchanged; there is no external access to or connection with the local databases. We demonstrate the validity of the strategy by training a model that maps material structure to formation energy, based on two and four local databases, respectively. The results show that the accuracy of the model trained with this method is almost the same as that of a model trained on a single database combining all the local ones. Moreover, different communication frequencies between the clients and the server are studied to improve training efficiency, and an optimal frequency is recommended.
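The parameter-only exchange described in the abstract follows the general pattern of federated averaging: each client runs a few optimization steps on its private database, sends only its model parameters to a server, and the server averages them and broadcasts the result. The sketch below illustrates that pattern on a toy linear model; all names, data, and hyperparameters are illustrative assumptions, not the paper's actual code or model.

```python
# Minimal federated-averaging sketch, assuming a toy linear model in place of
# the paper's structure-to-formation-energy network. Only the weight vector
# `w` ever leaves a client; the local data X, y never do.
import numpy as np

rng = np.random.default_rng(0)

def make_client_data(n=200):
    # Hypothetical stand-in for one local materials database:
    # targets follow y = 3*x1 - 2*x2 plus small noise.
    X = rng.normal(size=(n, 2))
    y = X @ np.array([3.0, -2.0]) + 0.01 * rng.normal(size=n)
    return X, y

def local_steps(w, X, y, steps, lr=0.05):
    # Plain gradient descent on mean squared error, run entirely locally.
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def federated_train(clients, rounds=50, local=5):
    # `local` sets the communication frequency the abstract mentions:
    # more local steps per round means fewer parameter exchanges.
    w = np.zeros(2)
    for _ in range(rounds):
        updates = [local_steps(w, X, y, local) for X, y in clients]
        w = np.mean(updates, axis=0)  # server averages parameters only
    return w

clients = [make_client_data() for _ in range(4)]  # four local databases
w = federated_train(clients)
print(w)  # should approach the true coefficients [3, -2]
```

Because every client's data are drawn from the same underlying relation, the averaged model converges to nearly the same solution a single pooled database would give, mirroring the accuracy result reported in the abstract.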
ISSN: 2045-2322
DOI: 10.1038/s41598-022-19426-8