Optimization on selecting XGBoost hyperparameters using meta‐learning

Bibliographic Details
Published in: Expert Systems, 2024-09, Vol. 41 (9), p. n/a
Authors: Lima Marinho, Tiago; Nascimento, Diego Carvalho; Pimentel, Bruno Almeida
Format: Article
Language: English
Online access: Full text
Abstract: With the evolution of computing, the number of machine learning algorithms has grown, and they have become more complex and robust. An increasing challenge is to find faster and more practical ways to select the hyperparameters that configure each algorithm individually. This article uses meta-learning as a practicable solution for recommending hyperparameters: datasets similar to a new database are identified through their meta-feature structures, and the XGBoost hyperparameters already tuned on those datasets are adopted for the new data. This reduces computational costs and aims to make real-time decision-making feasible, sparing companies the extra cost of retuning whenever new information arrives. Experimental results on 198 datasets attest to the success of this heuristic, which applies meta-learning to compare the structure of datasets. First, the datasets were characterized by combining three groups of meta-features (general, statistical, and info-theory), providing a basis for measuring similarity between sets and thus applying meta-learning to recommend hyperparameters. Next, the appropriate number of neighbouring sets for XGBoost tuning was tested. The results were promising: XGBoost accuracy improved when the hyperparameter values of the k = 4-6 most similar datasets were averaged, and, compared with the standard grid-search hyperparameters set by default, the meta-learning methodology performed better on 78.28% of the datasets. This study therefore shows that adopting meta-learning is a competitive alternative for generalizing the XGBoost model, with better expected statistical performance (e.g., accuracy) than adjusting a single, particular model.
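
A minimal sketch of the recipe the abstract describes, not the authors' code: it assumes the pymfe library for the three named meta-feature groups, scikit-learn for the nearest-neighbour search, and a pre-built repository of datasets with stored tuned XGBoost hyperparameters (repo_meta and repo_params are hypothetical names introduced here for illustration).

import numpy as np
from pymfe.mfe import MFE                        # meta-feature extraction
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler
from xgboost import XGBClassifier

# Averaged values for integer-valued hyperparameters must be rounded back.
INT_PARAMS = {"n_estimators", "max_depth"}

def extract_meta_features(X, y):
    """Meta-feature vector combining the paper's three groups."""
    mfe = MFE(groups=["general", "statistical", "info-theory"])
    mfe.fit(X, y)
    _, values = mfe.extract()
    # Some meta-features can be NaN on small datasets; zero-fill for distances.
    return np.nan_to_num(np.asarray(values, dtype=float))

def recommend_hyperparameters(X_new, y_new, repo_meta, repo_params, k=5):
    """Average the stored XGBoost hyperparameters of the k datasets whose
    meta-features are closest to the new dataset's (k = 4-6 per the abstract).

    repo_meta   : (n_datasets, n_meta_features) array of stored meta-features
    repo_params : list of dicts of tuned XGBoost hyperparameters per dataset
    """
    scaler = StandardScaler().fit(repo_meta)     # put meta-features on one scale
    nn = NearestNeighbors(n_neighbors=k).fit(scaler.transform(repo_meta))
    query = extract_meta_features(X_new, y_new).reshape(1, -1)
    _, idx = nn.kneighbors(scaler.transform(query))
    averaged = {key: float(np.mean([repo_params[i][key] for i in idx[0]]))
                for key in repo_params[0]}
    return {key: int(round(val)) if key in INT_PARAMS else val
            for key, val in averaged.items()}

# Usage (with the hypothetical repository above):
#   params = recommend_hyperparameters(X, y, repo_meta, repo_params, k=5)
#   model = XGBClassifier(**params).fit(X, y)

Standardizing the meta-features before the distance computation is a design choice added here, since the raw meta-feature values span very different scales; the paper's exact similarity measure may differ.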
ISSN: 0266-4720, 1468-0394
DOI: 10.1111/exsy.13611