A stacking ensemble machine learning method for early identification of students at risk of dropout


Bibliographic details
Published in: Education and information technologies, 2023-09, Vol. 28 (9), p. 12169-12189
Main authors: Talamás-Carvajal, Juan Andrés; Ceballos, Héctor G.
Format: Article
Language: English
Description
Summary: Early student dropout is one of the major problems universities currently face. Several machine learning techniques have been used to detect students at risk of dropping out. Using sociodemographic data and qualifications from the previous educational level, these predictive models are accurate enough to support retention programs, and adding grades from the first semesters increases their accuracy further. Nevertheless, classification errors have a cost: at-risk students who go undetected are left out of retention programs, whereas students with no actual risk consume additional resources. To provide more accurate models, we propose a stacking ensemble technique that yields an improved combined dropout model while using relatively few variables. The results fall within the expected ranges for an early dropout model, but with considerably fewer features and less historical information, and we show that deploying the models would be cost-efficient for the institution if applied to an intervention program.
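As a rough illustration of the stacking idea named in the summary, the sketch below trains two base models on different feature groups and then fits a meta-model on their out-of-fold predictions. It is not the authors' implementation: the summary does not say which base learners, features, or data they used, so the logistic models, the four synthetic columns, the helper names (`LogisticModel`, `fit_stack`, `predict_stack`, `make_students`), and every coefficient here are assumptions made purely for demonstration.

```python
import math
import random


def sigmoid(z):
    # Clamp to avoid overflow in math.exp for extreme inputs.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


class LogisticModel:
    """Tiny logistic regression trained by batch gradient descent."""

    def __init__(self, columns, epochs=200, lr=0.5):
        self.columns = columns  # feature indices this model is allowed to see
        self.epochs = epochs
        self.lr = lr
        self.weights = [0.0] * len(columns)
        self.bias = 0.0

    def predict_proba(self, row):
        z = self.bias + sum(w * row[c] for w, c in zip(self.weights, self.columns))
        return sigmoid(z)

    def fit(self, rows, labels):
        # Restart from zero so the same object can be refitted for each fold.
        self.weights = [0.0] * len(self.columns)
        self.bias = 0.0
        n = len(rows)
        for _ in range(self.epochs):
            errors = [self.predict_proba(r) - y for r, y in zip(rows, labels)]
            self.bias -= self.lr * sum(errors) / n
            for i, c in enumerate(self.columns):
                grad = sum(e * r[c] for e, r in zip(errors, rows)) / n
                self.weights[i] -= self.lr * grad
        return self


def fit_stack(rows, labels, base_models, meta_model, k=5):
    """Train the meta-model on out-of-fold base predictions, then refit the bases."""
    n = len(rows)
    out_of_fold = [[0.0] * len(base_models) for _ in range(n)]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held_out = [i for i in range(n) if i % k == fold]
        for j, model in enumerate(base_models):
            model.fit([rows[i] for i in train], [labels[i] for i in train])
            for i in held_out:
                out_of_fold[i][j] = model.predict_proba(rows[i])
    meta_model.fit(out_of_fold, labels)
    for model in base_models:
        model.fit(rows, labels)  # final base models use all training data


def predict_stack(row, base_models, meta_model):
    base_scores = [m.predict_proba(row) for m in base_models]
    return meta_model.predict_proba(base_scores)


def make_students(n, seed=0):
    """Synthetic data (hypothetical): columns 0-1 mimic background data, 2-3 mimic grades."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        row = [rng.random() for _ in range(4)]
        # Invented risk rule, used only to generate labels for the demo.
        risk = sigmoid(2.0 * (row[0] - 0.5) - 4.0 * (row[2] - 0.5) - 1.0)
        rows.append(row)
        labels.append(1 if rng.random() < risk else 0)  # 1 = dropout
    return rows, labels


rows, labels = make_students(200)
base_models = [LogisticModel([0, 1]), LogisticModel([2, 3])]
meta_model = LogisticModel([0, 1])  # its two inputs are the base models' scores
fit_stack(rows, labels, base_models, meta_model)
risks = [predict_stack(r, base_models, meta_model) for r in rows]
print(f"mean predicted dropout risk: {sum(risks) / len(risks):.3f}")
```

Fitting the meta-model on out-of-fold scores, rather than on scores the base models produced for their own training rows, keeps it from learning to trust overfitted base predictions. In practice a library implementation such as scikit-learn's `StackingClassifier` would likely replace the hand-written loop.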
ISSN: 1360-2357, 1573-7608
DOI: 10.1007/s10639-023-11682-z