Ensemble learning for impurity prediction in high-purity indium purified via vertical zone refining


Detailed description

Bibliographic details
Published in: Intelligent Systems with Applications, 2024-06, Vol. 22, p. 200390, Article 200390
Main authors: Shang, Zhongwen; Wu, Meizhen; Peng, Jubo; Zheng, Hongxing
Format: Article
Language: English
Description
Summary:
• Ensemble learning was used to predict impurity content in high-purity indium after vertical zone refining.
• Optimal hyperparameters for the XGBoost and LightGBM models were determined through Bayesian optimization.
• The XGBoost, LightGBM, and Ridge regression models showed comparable MAE.
• The weighted fusion model achieved the best predictive performance.

The complexity of raw materials and multi-step purification processes makes it technically challenging to establish universally applicable process parameters for producing high-purity metals. Machine learning has become an important tool in materials science: it supports accurate prediction of target variables and accelerates process optimization, which reduces both experimental cost and time. This study uses machine learning models to predict the residual impurity content in high-purity indium after vertical zone refining. A collection of 82 experimental datasets was used to determine the optimal hyperparameters for the XGBoost and LightGBM models through Bayesian optimization. Under leave-one-out cross-validation (LOOCV), the XGBoost and LightGBM models achieved mean absolute errors (MAEs) of 0.022 and 0.023, respectively. Because this performance was comparable to that of the previously established Ridge regression model (MAE = 0.024), three fusion techniques (mean, weighted, and stacking fusion) were explored to improve accuracy further. The weighted fusion model gave the best predictive performance, with an MAE of 0.020, a root mean squared error (RMSE) of 0.026, and a coefficient of determination (R² score) of 0.830. In addition, SHapley Additive exPlanations (SHAP) analysis revealed a significant correlation between lower initial arsenic (As) content and reduced total post-refining impurity levels in both the XGBoost and LightGBM models. The study underscores the precision of ensemble learning in predicting residual impurity content in vertically zone-refined indium products.
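The weighted fusion step and the three reported metrics can be illustrated with a minimal sketch. The abstract does not say how the fusion weights were chosen, so the inverse-MAE weighting below is an assumption made for illustration, not the authors' method. The prediction values are made-up toy numbers, and the function names (`inverse_mae_weights`, `weighted_fusion`) are hypothetical.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def inverse_mae_weights(y_true, preds_by_model):
    """ASSUMED weighting scheme: weight each model by 1/MAE, normalised to sum to 1.
    Fails with ZeroDivisionError if a model has an MAE of exactly zero."""
    inv = {name: 1.0 / mae(y_true, p) for name, p in preds_by_model.items()}
    total = sum(inv.values())
    return {name: v / total for name, v in inv.items()}

def weighted_fusion(preds_by_model, weights):
    """Fuse per-sample predictions as a weighted average across models."""
    n = len(next(iter(preds_by_model.values())))
    return [sum(weights[m] * preds_by_model[m][i] for m in preds_by_model)
            for i in range(n)]

# Toy stand-ins for held-out (e.g. LOOCV) predictions; NOT data from the paper.
y_true = [0.10, 0.20, 0.30, 0.40]
preds = {
    "xgboost":  [0.12, 0.18, 0.33, 0.38],
    "lightgbm": [0.09, 0.23, 0.28, 0.43],
    "ridge":    [0.13, 0.17, 0.31, 0.36],
}

weights = inverse_mae_weights(y_true, preds)
fused = weighted_fusion(preds, weights)
print("weights:", weights)
print("MAE:", mae(y_true, fused), "RMSE:", rmse(y_true, fused), "R2:", r2(y_true, fused))
```

In this sketch the weights are computed from the same predictions that are then scored, which gives an optimistic estimate. A more careful setup would derive the weights inside each cross-validation fold from training data only.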
ISSN: 2667-3053
DOI: 10.1016/j.iswa.2024.200390