Why Do Students Drop Out? University Dropout Prediction and Associated Factor Analysis Using Machine Learning Techniques

Graduation and dropout rates have always been a serious consideration for educational institutions and students. High dropout rates negatively impact both the lives of individual students and institutions. To address this problem, this study examined university dropout prediction using academic, dem...

Full description

Saved in:
Bibliographic details
Main authors: Kim, Sean, Yoo, Eliot, Kim, Samuel
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Kim, Sean
Yoo, Eliot
Kim, Samuel
description Graduation and dropout rates have always been a serious consideration for educational institutions and students. High dropout rates negatively impact both the lives of individual students and institutions. To address this problem, this study examined university dropout prediction using academic, demographic, socioeconomic, and macroeconomic data types. Additionally, we performed associated factor analysis to analyze which type of data would be most influential on the performance of machine learning models in predicting graduation and dropout status. These features were used to train four binary classifiers to determine if students would graduate or drop out. The overall performance of the classifiers in predicting dropout status had an average ROC-AUC score of 0.935. The data type most influential to the model performance was found to be academic data, with the average ROC-AUC score dropping from 0.935 to 0.811 when excluding all academic-related features from the data set. Preliminary results indicate that a correlation does exist between data types and dropout status.
doi_str_mv 10.48550/arxiv.2310.10987
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2310_10987</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2310_10987</sourcerecordid><originalsourceid>FETCH-LOGICAL-a677-e2732d3c146a3efa43cf40372024dd6b6b2747d191d40fc30660231f5f149f443</originalsourceid><addsrcrecordid>eNotkL1OwzAYRb0woMIDMPG9QIpjO3EzoailgBRUJFIxRq5_iKViF9upmrenDUxXOsOVzkHoLsdztigK_CDCyR7nhJ5BjqsFv0anz36ElYePNCjtUoRV8AfYDOkRts4edYg2jRP0Q4L3oJWVyXoHwimoY_TSiqQVrIVMPkDtxH6MNsI2WvcFb0L21mlotAjuAlote2d_Bh1v0JUR-6hv_3eG2vVTu3zJms3z67JuMlFynmnCKVFU5qwUVBvBqDQMU04wYUqVu3JHOOMqr3LFsJEUlyU-65nC5KwyjNEZuv-7ndS7Q7DfIozdJUE3JaC_v0dXdQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Why Do Students Drop Out? University Dropout Prediction and Associated Factor Analysis Using Machine Learning Techniques</title><source>arXiv.org</source><creator>Kim, Sean ; Yoo, Eliot ; Kim, Samuel</creator><creatorcontrib>Kim, Sean ; Yoo, Eliot ; Kim, Samuel</creatorcontrib><description>Graduation and dropout rates have always been a serious consideration for educational institutions and students. High dropout rates negatively impact both the lives of individual students and institutions. To address this problem, this study examined university dropout prediction using academic, demographic, socioeconomic, and macroeconomic data types. Additionally, we performed associated factor analysis to analyze which type of data would be most influential on the performance of machine learning models in predicting graduation and dropout status. These features were used to train four binary classifiers to determine if students would graduate or drop out. The overall performance of the classifiers in predicting dropout status had an average ROC-AUC score of 0.935. 
The data type most influential to the model performance was found to be academic data, with the average ROC-AUC score dropping from 0.935 to 0.811 when excluding all academic-related features from the data set. Preliminary results indicate that a correlation does exist between data types and dropout status.</description><identifier>DOI: 10.48550/arxiv.2310.10987</identifier><language>eng</language><subject>Computer Science - Computers and Society ; Computer Science - Learning</subject><creationdate>2023-10</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2310.10987$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2310.10987$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Kim, Sean</creatorcontrib><creatorcontrib>Yoo, Eliot</creatorcontrib><creatorcontrib>Kim, Samuel</creatorcontrib><title>Why Do Students Drop Out? University Dropout Prediction and Associated Factor Analysis Using Machine Learning Techniques</title><description>Graduation and dropout rates have always been a serious consideration for educational institutions and students. High dropout rates negatively impact both the lives of individual students and institutions. To address this problem, this study examined university dropout prediction using academic, demographic, socioeconomic, and macroeconomic data types. Additionally, we performed associated factor analysis to analyze which type of data would be most influential on the performance of machine learning models in predicting graduation and dropout status. 
These features were used to train four binary classifiers to determine if students would graduate or drop out. The overall performance of the classifiers in predicting dropout status had an average ROC-AUC score of 0.935. The data type most influential to the model performance was found to be academic data, with the average ROC-AUC score dropping from 0.935 to 0.811 when excluding all academic-related features from the data set. Preliminary results indicate that a correlation does exist between data types and dropout status.</description><subject>Computer Science - Computers and Society</subject><subject>Computer Science - Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotkL1OwzAYRb0woMIDMPG9QIpjO3EzoailgBRUJFIxRq5_iKViF9upmrenDUxXOsOVzkHoLsdztigK_CDCyR7nhJ5BjqsFv0anz36ElYePNCjtUoRV8AfYDOkRts4edYg2jRP0Q4L3oJWVyXoHwimoY_TSiqQVrIVMPkDtxH6MNsI2WvcFb0L21mlotAjuAlote2d_Bh1v0JUR-6hv_3eG2vVTu3zJms3z67JuMlFynmnCKVFU5qwUVBvBqDQMU04wYUqVu3JHOOMqr3LFsJEUlyU-65nC5KwyjNEZuv-7ndS7Q7DfIozdJUE3JaC_v0dXdQ</recordid><startdate>20231017</startdate><enddate>20231017</enddate><creator>Kim, Sean</creator><creator>Yoo, Eliot</creator><creator>Kim, Samuel</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20231017</creationdate><title>Why Do Students Drop Out? 
University Dropout Prediction and Associated Factor Analysis Using Machine Learning Techniques</title><author>Kim, Sean ; Yoo, Eliot ; Kim, Samuel</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a677-e2732d3c146a3efa43cf40372024dd6b6b2747d191d40fc30660231f5f149f443</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Computers and Society</topic><topic>Computer Science - Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Kim, Sean</creatorcontrib><creatorcontrib>Yoo, Eliot</creatorcontrib><creatorcontrib>Kim, Samuel</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Kim, Sean</au><au>Yoo, Eliot</au><au>Kim, Samuel</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Why Do Students Drop Out? University Dropout Prediction and Associated Factor Analysis Using Machine Learning Techniques</atitle><date>2023-10-17</date><risdate>2023</risdate><abstract>Graduation and dropout rates have always been a serious consideration for educational institutions and students. High dropout rates negatively impact both the lives of individual students and institutions. To address this problem, this study examined university dropout prediction using academic, demographic, socioeconomic, and macroeconomic data types. Additionally, we performed associated factor analysis to analyze which type of data would be most influential on the performance of machine learning models in predicting graduation and dropout status. These features were used to train four binary classifiers to determine if students would graduate or drop out. The overall performance of the classifiers in predicting dropout status had an average ROC-AUC score of 0.935. 
The data type most influential to the model performance was found to be academic data, with the average ROC-AUC score dropping from 0.935 to 0.811 when excluding all academic-related features from the data set. Preliminary results indicate that a correlation does exist between data types and dropout status.</abstract><doi>10.48550/arxiv.2310.10987</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2310.10987
ispartof
issn
language eng
recordid cdi_arxiv_primary_2310_10987
source arXiv.org
subjects Computer Science - Computers and Society
Computer Science - Learning
title Why Do Students Drop Out? University Dropout Prediction and Associated Factor Analysis Using Machine Learning Techniques
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T18%3A26%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Why%20Do%20Students%20Drop%20Out?%20University%20Dropout%20Prediction%20and%20Associated%20Factor%20Analysis%20Using%20Machine%20Learning%20Techniques&rft.au=Kim,%20Sean&rft.date=2023-10-17&rft_id=info:doi/10.48550/arxiv.2310.10987&rft_dat=%3Carxiv_GOX%3E2310_10987%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
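
The abstract's associated factor analysis (drop all features of one data type, retrain, and compare the ROC-AUC score) can be illustrated with a minimal sketch. This is not the authors' code: the nearest-centroid scorer is a stand-in for their four binary classifiers, the data are synthetic, and the column names and group assignments in `FEATURE_GROUPS` are hypothetical.

```python
import random

# Hypothetical feature groups; column names are illustrative, not from the paper.
FEATURE_GROUPS = {
    "academic": ["gpa", "credits_passed"],
    "demographic": ["age"],
    "socioeconomic": ["scholarship"],
}

def roc_auc(labels, scores):
    """ROC-AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_centroids(rows, labels, cols):
    """Stand-in classifier (an assumption): per-class mean of each selected column."""
    cents = {}
    for cls in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == cls]
        cents[cls] = {c: sum(r[c] for r in members) / len(members) for c in cols}
    return cents

def dropout_score(row, cents, cols):
    """Higher score means the row is closer to the dropout centroid (class 1)."""
    d0 = sum((row[c] - cents[0][c]) ** 2 for c in cols)
    d1 = sum((row[c] - cents[1][c]) ** 2 for c in cols)
    return d0 - d1

def evaluate(train, test, cols):
    """Fit on the training split and return ROC-AUC on the held-out split."""
    (x_train, y_train), (x_test, y_test) = train, test
    cents = fit_centroids(x_train, y_train, cols)
    return roc_auc(y_test, [dropout_score(r, cents, cols) for r in x_test])

def make_synthetic(n, rng):
    """Synthetic students; only the academic columns are made strongly predictive."""
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)  # 1 = dropout, 0 = graduate
        rows.append({
            "gpa": rng.gauss(2.0 if y else 3.2, 0.5),
            "credits_passed": rng.gauss(3.0 if y else 5.0, 1.0),
            "age": rng.gauss(21, 3),
            "scholarship": rng.gauss(0.3 if y else 0.5, 0.5),
        })
        labels.append(y)
    return rows, labels

def ablation(train, test, groups):
    """Score the full feature set, then rescore with each group removed in turn."""
    all_cols = [c for cols in groups.values() for c in cols]
    results = {"all": evaluate(train, test, all_cols)}
    for name, cols in groups.items():
        kept = [c for c in all_cols if c not in cols]
        results[f"without_{name}"] = evaluate(train, test, kept)
    return results

rng = random.Random(0)
train = make_synthetic(400, rng)
test = make_synthetic(200, rng)
results = ablation(train, test, FEATURE_GROUPS)
for name, auc in results.items():
    print(f"{name}: ROC-AUC = {auc:.3f}")
```

Because the synthetic data make only the academic columns informative, removing that group should lower the score the most, which mirrors the direction of the drop from 0.935 to 0.811 reported in the abstract. The numbers printed here come from the toy data and say nothing about the study's actual results.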