Can machine learning identify the next high-temperature superconductor? Examining extrapolation performance for materials discovery

Bibliographic details
Published in: Molecular systems design & engineering 2018-10, Vol.3 (5), p.819-825
Main authors: Meredig, Bryce; Antono, Erin; Church, Carena; Hutchinson, Maxwell; Ling, Julia; Paradiso, Sean; Blaiszik, Ben; Foster, Ian; Gibbons, Brenna; Hattrick-Simpers, Jason; Mehta, Apurva; Ward, Logan
Format: Article
Language: English
Online access: Full text
Description
Summary: Traditional machine learning (ML) metrics overestimate model performance for materials discovery. We introduce (1) leave-one-cluster-out cross-validation (LOCO CV) and (2) a simple nearest-neighbor benchmark to show that model performance in discovery applications strongly depends on the problem, data sampling, and extrapolation. Our results suggest that ML-guided iterative experimentation may outperform standard high-throughput screening for discovering breakthrough materials like high-Tc superconductors with ML.
ISSN: 2058-9689
DOI: 10.1039/c8me00012c
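
The two ideas named in the summary, leave-one-cluster-out cross-validation and a nearest-neighbor benchmark, can be illustrated with a minimal sketch. This is not the authors' code: the tiny k-means routine, the function names (kmeans, loco_splits, nearest_neighbor_predict), the choice of mean absolute error, and the synthetic toy data are all assumptions made for illustration. The point it shows is that each fold holds out an entire cluster, so the model is always scored on points unlike anything in its training set.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Tiny k-means (an assumption of this sketch); returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its closest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members (unchanged if empty).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

def loco_splits(labels):
    """Yield (cluster, train_idx, test_idx), holding out one whole cluster per fold."""
    for c in sorted(set(labels)):
        test = [i for i, lab in enumerate(labels) if lab == c]
        train = [i for i, lab in enumerate(labels) if lab != c]
        if train:  # skip the degenerate case of a single cluster
            yield c, train, test

def nearest_neighbor_predict(train_X, train_y, x):
    """Baseline: copy the target value of the closest training point."""
    j = min(range(len(train_X)), key=lambda i: math.dist(x, train_X[i]))
    return train_y[j]

# Synthetic toy data (hypothetical): three separated blobs, target = sum of features.
rng = random.Random(1)
X = [(cx + rng.gauss(0, 0.1), cy + rng.gauss(0, 0.1))
     for cx, cy in [(0, 0), (5, 5), (10, 0)] for _ in range(10)]
y = [a + b for a, b in X]

labels = kmeans(X, k=3)
errors = {}
for cluster, train, test in loco_splits(labels):
    train_X = [X[i] for i in train]
    train_y = [y[i] for i in train]
    abs_errs = [abs(nearest_neighbor_predict(train_X, train_y, X[i]) - y[i])
                for i in test]
    # Mean absolute error of the baseline on the held-out cluster.
    errors[cluster] = sum(abs_errs) / len(abs_errs)

for cluster, mae in errors.items():
    print(f"held-out cluster {cluster}: nearest-neighbor MAE = {mae:.3f}")
```

Any candidate model can be scored on the same folds and compared with the nearest-neighbor numbers; a model that cannot beat this baseline on held-out clusters is not extrapolating in a useful way.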