A machine learning approximation of the 2015 Portuguese high school student grades: A hybrid approach
This article uses an anonymous 2014–15 school year dataset from the Directorate-General for Statistics of Education and Science (DGEEC) of the Portuguese Ministry of Education as a means to carry out a predictive power comparison between the classic multilinear regression model and a chosen set of m...
Published in: | Education and information technologies 2021-03, Vol.26 (2), p.1527-1547 |
---|---|
Main authors: | Costa-Mendes, Ricardo; Oliveira, Tiago; Castelli, Mauro; Cruz-Jesus, Frederico |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 1547 |
---|---|
container_issue | 2 |
container_start_page | 1527 |
container_title | Education and information technologies |
container_volume | 26 |
creator | Costa-Mendes, Ricardo; Oliveira, Tiago; Castelli, Mauro; Cruz-Jesus, Frederico |
description | This article uses an anonymous 2014–15 school year dataset from the Directorate-General for Statistics of Education and Science (DGEEC) of the Portuguese Ministry of Education as a means to carry out a predictive power comparison between the classic multilinear regression model and a chosen set of machine learning algorithms. A multilinear regression model is used in parallel with random forest, support vector machine, artificial neural network and extreme gradient boosting machine stacking ensemble implementations. Designing a hybrid analysis is intended where classical statistical analysis and artificial intelligence algorithms are blended to augment the ability to retain valuable conclusions and well-supported results. The machine learning algorithms attain a higher level of predictive ability. In addition, the stacking appropriateness increases as the base learner output correlation matrix determinant increases and the random forest feature importance empirical distributions are correlated with the structure of p-values and the statistical significance test ascertains of the multiple linear model. An information system that supports the nationwide education system should be designed and further structured to collect meaningful and precise data about the full range of academic achievement antecedents. The article concludes that no evidence is found in favour of smaller classes. |
doi_str_mv | 10.1007/s10639-020-10316-y |
format | Article |
date | 2021-03-01 |
ericid | EJ1292193 |
oa | free_for_read |
orcidid | https://orcid.org/0000-0001-6523-0809 ; https://orcid.org/0000-0002-8793-1451 ; https://orcid.org/0000-0002-4446-5980 ; https://orcid.org/0000-0002-9259-4576 |
publisher | New York: Springer US |
rights | The Author(s) 2020. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). |
toplevel | peer_reviewed ; online_resources |
tpages | 21 |
fulltext | fulltext |
identifier | ISSN: 1360-2357 |
ispartof | Education and information technologies, 2021-03, Vol.26 (2), p.1527-1547 |
issn | ISSN: 1360-2357; EISSN: 1573-7608 |
language | eng |
recordid | cdi_proquest_journals_2503196935 |
source | SpringerLink Journals - AutoHoldings |
subjects | Academic Achievement; Academic grading; Algorithms; Analysis; Artificial Intelligence; Class Size; Computation; Computer Appl. in Social and Behavioral Sciences; Computer Science; Computers and Education; Correlation; Data Collection; Data mining; Education; Educational Technology; Electronic Learning; Foreign Countries; Grades (Scholastic); High School Students; Information Systems; Information Systems Applications (incl.Internet); Machine learning; Mathematics; Neural networks; Predictive Measurement; Regression (Statistics); Secondary education; Statistical Analysis; Statistical Significance; User Interfaces and Human Computer Interaction |
title | A machine learning approximation of the 2015 Portuguese high school student grades: A hybrid approach |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T19%3A01%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20machine%20learning%20approximation%20of%20the%202015%20Portuguese%20high%20school%20student%20grades:%20A%20hybrid%20approach&rft.jtitle=Education%20and%20information%20technologies&rft.au=Costa-Mendes,%20Ricardo&rft.date=2021-03-01&rft.volume=26&rft.issue=2&rft.spage=1527&rft.epage=1547&rft.pages=1527-1547&rft.issn=1360-2357&rft.eissn=1573-7608&rft_id=info:doi/10.1007/s10639-020-10316-y&rft_dat=%3Cgale_proqu%3EA713712797%3C/gale_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2503196935&rft_id=info:pmid/&rft_galeid=A713712797&rft_ericid=EJ1292193&rfr_iscdi=true |
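
The description above states that stacking becomes more appropriate as the determinant of the base learner output correlation matrix increases. The following is only a minimal sketch of that diagnostic, not the authors' implementation: the function name `base_learner_correlation_determinant` and the synthetic "learner" outputs are assumptions introduced for illustration. A determinant near 1 indicates nearly uncorrelated base learner predictions, while a determinant near 0 indicates nearly redundant ones.

```python
import numpy as np


def base_learner_correlation_determinant(predictions: np.ndarray) -> float:
    """Determinant of the correlation matrix of base learner predictions.

    predictions has shape (n_samples, n_learners); each column holds the
    predictions of one (hypothetical) base learner on the same samples.
    """
    corr = np.corrcoef(predictions, rowvar=False)  # (n_learners, n_learners)
    return float(np.linalg.det(corr))


# Synthetic stand-ins for base learner outputs (assumption: no real data here).
rng = np.random.default_rng(0)
signal = rng.normal(size=500)

# Three learners that agree almost perfectly: highly correlated outputs.
redundant = np.column_stack(
    [signal + 0.01 * rng.normal(size=500) for _ in range(3)]
)
# Three learners with substantial independent error: less correlated outputs.
diverse = np.column_stack(
    [signal + 1.0 * rng.normal(size=500) for _ in range(3)]
)

det_redundant = base_learner_correlation_determinant(redundant)
det_diverse = base_learner_correlation_determinant(diverse)
print(f"redundant learners: det = {det_redundant:.6f}")
print(f"diverse learners:   det = {det_diverse:.6f}")
```

In this sketch the diverse set yields the larger determinant, which under the description's claim would make it the better candidate for a stacking ensemble.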