A multi-strategy hybrid machine learning model for predicting glass-formation ability of metallic glasses based on imbalanced datasets

Bibliographic details
Published in: Journal of Non-Crystalline Solids, 2023-12, Vol. 621, p. 122645, Article 122645
Main authors: Liu, Xiaowei; Long, Zhilin; Zhang, Wei; Yang, Lingming; Li, Zhuang
Format: Article
Language: English
Description
Summary:
• Key features that affect glass-forming ability (GFA) are selected with a genetic algorithm.
• The data are augmented with the SMOGN technique to address data imbalance.
• Stacking models are developed to predict the GFA of metallic glasses.
• The SHAP method gives a rational explanation of how different features influence GFA.

Glass-forming ability (GFA) is essential to the development and broad application of metallic glasses. However, existing research rarely considers the imbalance of GFA data, which leads to low prediction accuracy and poor generalization of machine learning (ML) models. In this study, three strategies are combined to address the low prediction accuracy on imbalanced data: a genetic algorithm for feature selection, the SMOGN technique for data augmentation, and a stacking model for training. The stacking model incorporating the three strategies achieves a cross-validated R² = 0.85 and a testing R² = 0.87, much higher than the values of currently reported models. In addition, the Shapley additive explanations (SHAP) method is used to explain how different features affect GFA, which deepens the understanding of GFA. This study demonstrates the effectiveness of the proposed strategy for GFA prediction in an imbalanced data environment.
ISSN: 0022-3093, 1873-4812
DOI: 10.1016/j.jnoncrysol.2023.122645
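
The summary names SMOGN as the data-augmentation step for imbalanced GFA data. To illustrate the general idea only, the sketch below oversamples the rare region of a regression target. It is a heavily simplified, SMOGN-style routine, not the published algorithm and not the authors' implementation. The function name `augment_rare_targets`, the fixed `threshold` that defines "rare", the parameters `max_dist` and `noise`, and the toy data are all assumptions made for this example.

```python
import math
import random


def augment_rare_targets(X, y, threshold, n_new, max_dist=1.0, noise=0.05, seed=0):
    """Simplified SMOGN-style oversampling for regression (illustrative only).

    Samples whose target is >= threshold are treated as the rare region
    (an assumption; SMOGN itself uses a relevance function instead).
    """
    rng = random.Random(seed)
    rare = [i for i, target in enumerate(y) if target >= threshold]
    if len(rare) < 2:
        raise ValueError("need at least two rare samples")

    # Copy the original data, then append synthetic samples to the copies.
    X_aug, y_aug = [list(row) for row in X], list(y)
    for _ in range(n_new):
        i = rng.choice(rare)
        # Nearest other rare sample in feature space (Euclidean distance).
        j = min((k for k in rare if k != i), key=lambda k: math.dist(X[i], X[k]))
        if math.dist(X[i], X[j]) <= max_dist:
            # Close neighbour: interpolate features and target linearly.
            lam = rng.random()
            x_new = [a + lam * (b - a) for a, b in zip(X[i], X[j])]
            y_new = y[i] + lam * (y[j] - y[i])
        else:
            # Distant neighbour: perturb the seed sample with Gaussian noise.
            x_new = [a + rng.gauss(0.0, noise) for a in X[i]]
            y_new = y[i] + rng.gauss(0.0, noise)
        X_aug.append(x_new)
        y_aug.append(y_new)
    return X_aug, y_aug


# Toy data: two made-up features per sample and a made-up target.
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.9, 0.8], [1.0, 0.9], [3.0, 3.0]]
y = [1.0, 1.5, 2.0, 12.0, 15.0, 20.0]
X_aug, y_aug = augment_rare_targets(X, y, threshold=10.0, n_new=4)
print(len(X), "->", len(X_aug), "samples")
```

The switch between interpolation and Gaussian noise mirrors the intuition behind SMOGN: interpolate only between rare samples that lie close together, and fall back to small perturbations when the nearest rare neighbour is far away. In a pipeline like the one summarized above, augmentation of this kind would normally be applied to the training split only, so that test scores are computed on unmodified data.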