An adaptive Physics-based feature engineering approach for Machine Learning-assisted alloy discovery


Bibliographic details
Published in: Computational materials science 2023-06, Vol.226, p.112248, Article 112248
Main authors: Soofi, Yasaman J., Gu, Yijia, Liu, Jinling
Format: Article
Language: English
Description
Summary:
• Proposing a novel adaptive and generalized method for encoding categorical features by integrating the physics-based nature of the variable.
• Verifying the efficacy of the proposed method in encoding the temper designations from two independent wrought aluminum alloy databases.
• Comparing the performance of the proposed adaptive encoding method to popular encoding methods such as one-hot encoding and ordinal encoding.
• Mining the data to uncover the underlying statistical relations between the temper designations and the properties.
• Improving the predictions of a variety of mechanical and technological properties.

This study investigated the importance of integrating a physics-based perspective into feature engineering for machine learning applications in materials science. Specifically, we studied the encoding of the temper designation variable, which contains critical alloy manufacturing information and is commonly included as an important feature for predicting alloy properties in machine learning models. Popular encoding methods such as one-hot encoding and ordinal encoding neglect the physical mechanism behind temper designations by treating them as either totally independent or sequentially ordered. Following the underlying physical mechanism of the temper designation variable, we propose an adaptive encoding method that first decomposes temper designations into categorical and numerical subunits, which can be more properly encoded by one-hot encoding and ordinal encoding, respectively. The proposed adaptive encoding method is investigated on two independent aluminum alloy datasets covering various mechanical and technological properties. Our investigations showed that the proposed adaptive encoding method outperforms traditional encoding methods in the prediction of both mechanical and technological properties. As a general encoding method, it can be applied to a variety of decomposable variables to help advance machine-learning-assisted alloy design.
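The decompose-then-encode idea described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the splitting rule (one leading letter followed by digits, as in "T6" or "H32"), the function names `decompose` and `encode_tempers`, the `max_digits` parameter, and the use of 0 for an absent digit are all assumptions made here for illustration. The record does not state how the paper actually splits or pads the subunits.

```python
import re

# ASSUMPTION (not from the paper): a temper designation is one leading
# letter (the categorical subunit) followed by zero or more digits (the
# numerical subunits), e.g. "O", "T6", "H32", "T651".
TEMPER_PATTERN = re.compile(r"^([A-Z])(\d*)$")


def decompose(temper):
    """Split a temper designation into (letter, list of integer digits)."""
    match = TEMPER_PATTERN.match(temper.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized temper designation: {temper!r}")
    letter, digits = match.groups()
    return letter, [int(d) for d in digits]


def encode_tempers(tempers, max_digits=3):
    """One-hot encode the letter; keep each digit as an ordinal integer.

    Returns (column_names, rows). Missing digit positions are filled
    with 0, which is a hypothetical padding choice.
    """
    parts = [decompose(t) for t in tempers]
    letters = sorted({letter for letter, _ in parts})
    rows = []
    for letter, digits in parts:
        if len(digits) > max_digits:
            raise ValueError(f"more than {max_digits} digits in a designation")
        one_hot = [1 if letter == known else 0 for known in letters]
        padded = digits + [0] * (max_digits - len(digits))
        rows.append(one_hot + padded)
    columns = [f"temper_{name}" for name in letters]
    columns += [f"digit_{i + 1}" for i in range(max_digits)]
    return columns, rows
```

Calling `encode_tempers(["O", "T6", "H32", "T651"])` yields three one-hot letter columns and three digit columns. Under this sketch, "T6" and "T651" share the same letter column and first digit, so a model can treat them as related, which plain one-hot encoding of the full string would not allow.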
ISSN: 0927-0256, 1879-0801
DOI:10.1016/j.commatsci.2023.112248