Predicting the Risk of Overweight and Obesity in Madrid—A Binary Classification Approach with Evolutionary Feature Selection

Bibliographic Details
Published in: Applied Sciences, 2022-08, Vol. 12 (16), p. 8251
Main authors: Parra, Daniel; Gutiérrez-Gallego, Alberto; Garnica, Oscar; Velasco, Jose Manuel; Zekri-Nechar, Khaoula; Zamorano-León, José J.; Heras, Natalia de las; Hidalgo, J. Ignacio
Format: Article
Language: English
Description
Abstract: In this paper, we experimented with a set of machine-learning classifiers for predicting the risk of a person being overweight or obese, taking into account their dietary habits and socioeconomic information. We investigate ten different machine-learning algorithms combined with four feature-selection strategies (two evolutionary feature-selection methods, one feature-selection method from the literature, and no feature selection). We tackle the problem under a binary classification approach with evolutionary feature selection. In particular, we use a genetic algorithm to select the set of variables (features) that optimizes the accuracy of the classifiers. As an additional contribution, we designed a variant of the Stud GA, a particular structure of the selection operator in which a reduced set of elitist solutions dominates the process. The genetic algorithm uses a direct binary encoding, allowing a more efficient evaluation of the individuals. We use a dataset with information from more than 1170 people in the Spanish Region of Madrid. Both evolutionary and classical feature-selection methods were successfully applied to the Gradient Boosting and Decision Tree algorithms, reaching accuracy values of up to 79% and increasing the average accuracy by two points, respectively.
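
To make the approach described in the abstract more concrete, the following is a minimal, self-contained Python sketch of genetic-algorithm feature selection with a direct binary encoding and a stud-style selection operator, in which the best individual takes part in every crossover. The classifier choice (scikit-learn's GradientBoostingClassifier), population size, mutation rate, and the exact pairing scheme are illustrative assumptions, not the authors' published configuration.

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def fitness(mask, X, y):
    # Fitness = cross-validated accuracy of the classifier on the selected feature subset.
    if mask.sum() == 0:                    # an empty feature subset is invalid
        return 0.0
    clf = GradientBoostingClassifier()
    return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=5).mean()

def evolve(X, y, pop_size=30, generations=20, p_mut=0.05):
    n_features = X.shape[1]
    # Direct binary encoding: one bit per feature (1 = keep the feature, 0 = drop it).
    pop = rng.integers(0, 2, size=(pop_size, n_features))
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y) for ind in pop])
        order = np.argsort(scores)[::-1]   # sort population best-first
        pop = pop[order]
        stud = pop[0]                      # elite "stud" individual
        children = [stud.copy()]           # elitism: carry the best over unchanged
        while len(children) < pop_size:
            mate = pop[rng.integers(1, pop_size)]
            cut = rng.integers(1, n_features)
            # Stud-style crossover: the best individual is always one of the parents.
            child = np.concatenate([stud[:cut], mate[cut:]])
            flips = rng.random(n_features) < p_mut   # bit-flip mutation
            child[flips] ^= 1
            children.append(child)
        pop = np.array(children)
    scores = np.array([fitness(ind, X, y) for ind in pop])
    return pop[scores.argmax()]            # binary mask of the selected features

With a feature matrix X of shape (n_samples, n_features) and binary labels y, evolve(X, y) returns a 0/1 mask over the columns; retraining the classifier on X[:, mask.astype(bool)] then yields the accuracy that the genetic algorithm optimized.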
ISSN: 2076-3417
DOI: 10.3390/app12168251