Educational Data Mining to Support Programming Learning Using Problem-Solving Data

Computer programming has attracted a lot of attention in the development of information and communication technologies in the real world. Meeting the growing demand for highly skilled programmers in the ICT industry is one of the major challenges. In this context, online judge (OJ) systems enhance pro...

Bibliographic Details
Published in: IEEE access 2022, Vol.10, p.26186-26202
Main Authors: Rahman, Md. Mostafizer; Watanobe, Yutaka; Matsumoto, Taku; Kiran, Rage Uday; Nakamura, Keita
Format: Article
Language: eng
Subjects:
Online Access: Full text
container_end_page 26202
container_issue
container_start_page 26186
container_title IEEE access
container_volume 10
creator Rahman, Md. Mostafizer
Watanobe, Yutaka
Matsumoto, Taku
Kiran, Rage Uday
Nakamura, Keita
description Computer programming has attracted a lot of attention in the development of information and communication technologies in the real world. Meeting the growing demand for highly skilled programmers in the ICT industry is one of the major challenges. In this context, online judge (OJ) systems enhance programming learning and practice opportunities in addition to classroom-based learning. Consequently, OJ systems have created a large number of problem-solving data archives (solution codes, logs, and scores) that can be valuable raw materials for programming education research. In this paper, we propose an educational data mining framework to support programming learning using unsupervised algorithms. The framework includes the following sequence of steps: (i) problem-solving data collection (logs and scores are collected from the OJ) and preprocessing; (ii) the MK-means clustering algorithm is used for data clustering in Euclidean space; (iii) statistical features are extracted from each cluster; (iv) the frequent pattern (FP)-growth algorithm is applied to each cluster to mine data patterns and association rules; (v) a set of suggestions is provided on the basis of the extracted features, data patterns, and rules. Different parameters are adjusted to achieve the best results for the clustering and association rule mining algorithms. For the experiment, approximately 70,000 real-world problem-solving data records from 537 students of a programming course (Algorithm and Data Structures) were used. In addition, synthetic data were leveraged in experiments to demonstrate the performance of the MK-means algorithm. The experimental results show that the proposed framework effectively extracts useful features, patterns, and rules from problem-solving data. Moreover, these extracted features, patterns, and rules highlight the weaknesses and the scope of possible improvements in programming learning.
doi_str_mv 10.1109/ACCESS.2022.3157288
format Article
fullrecord <record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_proquest_journals_2639933012</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9729752</ieee_id><doaj_id>oai_doaj_org_article_3bc2e137fb9744528d110ff430833438</doaj_id><sourcerecordid>2639933012</sourcerecordid><originalsourceid>FETCH-LOGICAL-c474t-4bf0781e61d4280d58ca66f041a6008a4a5fd9916a58f25ef454246d6a4150603</originalsourceid><addsrcrecordid>eNpNUV1LwzAUDaLgmPsFeyn43JnvJo9jTh1MFOuew22bjI5umWkn-O9N1yHmIbn33HvODfcgNCV4RgjWD_PFYpnnM4opnTEiMqrUFRpRInXKBJPX_-JbNGnbHY5HRUhkI_SxrE4ldLU_QJM8QgfJa32oD9uk80l-Oh596JL34LcB9vseXlsI5_qm7e9YKhq7T3PffPd5r3CHbhw0rZ1c3jHaPC0_Fy_p-u15tZiv05JnvEt54XCmiJWk4lThSqgSpHSYE5Dxf8BBuEprIkEoR4V1XHDKZSWBE4ElZmO0GnQrDztzDPUewo_xUJsz4MPWQOjqsrGGFSW1hGWu0Bnngqoqbs45zrBijDMVte4HrWPwXyfbdmbnTyHupDVUMq0Zw4TGLjZ0lcG3bbDubyrBpvfCDF6Y3gtz8SKypgOrttb-MXRGdSYo-wUSVIKp</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2639933012</pqid></control><display><type>article</type><title>Educational Data Mining to Support Programming Learning Using Problem-Solving Data</title><source>Directory of Open Access Journals</source><source>IEEE Xplore Open Access Journals</source><source>EZB Electronic Journals Library</source><creator>Rahman, Md. Mostafizer ; Watanobe, Yutaka ; Matsumoto, Taku ; Kiran, Rage Uday ; Nakamura, Keita</creator><creatorcontrib>Rahman, Md. Mostafizer ; Watanobe, Yutaka ; Matsumoto, Taku ; Kiran, Rage Uday ; Nakamura, Keita</creatorcontrib><description><![CDATA[Computer programming has attracted a lot of attention in the development of information and communication technologies in the real world. Meeting the growing demand for highly skilled programmers in the ICT industry is one of the major challenges. In this point, online judge (OJ) systems enhance programming learning and practice opportunities in addition to classroom-based learning. 
Consequently, OJ systems have created a large number of problem-solving data (solution codes, logs, and scores) archives that can be valuable raw materials for programming education research. In this paper, we propose an educational data mining framework to support programming learning using unsupervised algorithms. The framework includes the following sequence of steps: (<inline-formula> <tex-math notation="LaTeX">i </tex-math></inline-formula>) problem-solving data collection (logs and scores are collected from the OJ) and preprocessing; (<inline-formula> <tex-math notation="LaTeX">ii </tex-math></inline-formula>) MK-means clustering algorithm is used for data clustering in Euclidean space; (<inline-formula> <tex-math notation="LaTeX">iii </tex-math></inline-formula>) statistical features are extracted from each cluster; (<inline-formula> <tex-math notation="LaTeX">iv </tex-math></inline-formula>) frequent pattern (FP)-growth algorithm is applied to each cluster to mine data patterns and association rules; (<inline-formula> <tex-math notation="LaTeX">v </tex-math></inline-formula>) a set of suggestions are provided on the basis of the extracted features, data patterns, and rules. Different parameters are adjusted to achieve the best results for clustering and association rule mining algorithms. For the experiment, approximately 70,000 real-world problem-solving data from 537 students of a programming course (Algorithm and Data Structures) were used. In addition, synthetic data have leveraged for experiments to demonstrate the performance of MK-means algorithm. The experimental results show that the proposed framework effectively extracts useful features, patterns, and rules from problem-solving data. 
Moreover, these extracted features, patterns, and rules highlight the weaknesses and the scope of possible improvements in programming learning.]]></description><identifier>ISSN: 2169-3536</identifier><identifier>EISSN: 2169-3536</identifier><identifier>DOI: 10.1109/ACCESS.2022.3157288</identifier><identifier>CODEN: IAECCG</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Algorithms ; Clustering ; Clustering algorithms ; Computer programming ; Data collection ; Data mining ; Data structures ; Education ; Educational data mining ; Electronic learning ; Euclidean geometry ; Feature extraction ; Machine learning ; pattern mining ; Problem solving ; problem-solving data ; Programming ; programming learning ; Programming profession ; Raw materials ; rule mining</subject><ispartof>IEEE access, 2022, Vol.10, p.26186-26202</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c474t-4bf0781e61d4280d58ca66f041a6008a4a5fd9916a58f25ef454246d6a4150603</citedby><cites>FETCH-LOGICAL-c474t-4bf0781e61d4280d58ca66f041a6008a4a5fd9916a58f25ef454246d6a4150603</cites><orcidid>0000-0001-9368-7638 ; 0000-0002-4574-6979</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9729752$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,776,780,860,2096,4010,27610,27900,27901,27902,54908</link.rule.ids></links><search><creatorcontrib>Rahman, Md. 
Mostafizer</creatorcontrib><creatorcontrib>Watanobe, Yutaka</creatorcontrib><creatorcontrib>Matsumoto, Taku</creatorcontrib><creatorcontrib>Kiran, Rage Uday</creatorcontrib><creatorcontrib>Nakamura, Keita</creatorcontrib><title>Educational Data Mining to Support Programming Learning Using Problem-Solving Data</title><title>IEEE access</title><addtitle>Access</addtitle><description><![CDATA[Computer programming has attracted a lot of attention in the development of information and communication technologies in the real world. Meeting the growing demand for highly skilled programmers in the ICT industry is one of the major challenges. In this point, online judge (OJ) systems enhance programming learning and practice opportunities in addition to classroom-based learning. Consequently, OJ systems have created a large number of problem-solving data (solution codes, logs, and scores) archives that can be valuable raw materials for programming education research. In this paper, we propose an educational data mining framework to support programming learning using unsupervised algorithms. The framework includes the following sequence of steps: (<inline-formula> <tex-math notation="LaTeX">i </tex-math></inline-formula>) problem-solving data collection (logs and scores are collected from the OJ) and preprocessing; (<inline-formula> <tex-math notation="LaTeX">ii </tex-math></inline-formula>) MK-means clustering algorithm is used for data clustering in Euclidean space; (<inline-formula> <tex-math notation="LaTeX">iii </tex-math></inline-formula>) statistical features are extracted from each cluster; (<inline-formula> <tex-math notation="LaTeX">iv </tex-math></inline-formula>) frequent pattern (FP)-growth algorithm is applied to each cluster to mine data patterns and association rules; (<inline-formula> <tex-math notation="LaTeX">v </tex-math></inline-formula>) a set of suggestions are provided on the basis of the extracted features, data patterns, and rules. 
Different parameters are adjusted to achieve the best results for clustering and association rule mining algorithms. For the experiment, approximately 70,000 real-world problem-solving data from 537 students of a programming course (Algorithm and Data Structures) were used. In addition, synthetic data have leveraged for experiments to demonstrate the performance of MK-means algorithm. The experimental results show that the proposed framework effectively extracts useful features, patterns, and rules from problem-solving data. Moreover, these extracted features, patterns, and rules highlight the weaknesses and the scope of possible improvements in programming learning.]]></description><subject>Algorithms</subject><subject>Clustering</subject><subject>Clustering algorithms</subject><subject>Computer programming</subject><subject>Data collection</subject><subject>Data mining</subject><subject>Data structures</subject><subject>Education</subject><subject>Educational data mining</subject><subject>Electronic learning</subject><subject>Euclidean geometry</subject><subject>Feature extraction</subject><subject>Machine learning</subject><subject>pattern mining</subject><subject>Problem solving</subject><subject>problem-solving data</subject><subject>Programming</subject><subject>programming learning</subject><subject>Programming profession</subject><subject>Raw materials</subject><subject>rule 
mining</subject><issn>2169-3536</issn><issn>2169-3536</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>RIE</sourceid><sourceid>DOA</sourceid><recordid>eNpNUV1LwzAUDaLgmPsFeyn43JnvJo9jTh1MFOuew22bjI5umWkn-O9N1yHmIbn33HvODfcgNCV4RgjWD_PFYpnnM4opnTEiMqrUFRpRInXKBJPX_-JbNGnbHY5HRUhkI_SxrE4ldLU_QJM8QgfJa32oD9uk80l-Oh596JL34LcB9vseXlsI5_qm7e9YKhq7T3PffPd5r3CHbhw0rZ1c3jHaPC0_Fy_p-u15tZiv05JnvEt54XCmiJWk4lThSqgSpHSYE5Dxf8BBuEprIkEoR4V1XHDKZSWBE4ElZmO0GnQrDztzDPUewo_xUJsz4MPWQOjqsrGGFSW1hGWu0Bnngqoqbs45zrBijDMVte4HrWPwXyfbdmbnTyHupDVUMq0Zw4TGLjZ0lcG3bbDubyrBpvfCDF6Y3gtz8SKypgOrttb-MXRGdSYo-wUSVIKp</recordid><startdate>2022</startdate><enddate>2022</enddate><creator>Rahman, Md. Mostafizer</creator><creator>Watanobe, Yutaka</creator><creator>Matsumoto, Taku</creator><creator>Kiran, Rage Uday</creator><creator>Nakamura, Keita</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7SR</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0001-9368-7638</orcidid><orcidid>https://orcid.org/0000-0002-4574-6979</orcidid></search><sort><creationdate>2022</creationdate><title>Educational Data Mining to Support Programming Learning Using Problem-Solving Data</title><author>Rahman, Md. 
Mostafizer ; Watanobe, Yutaka ; Matsumoto, Taku ; Kiran, Rage Uday ; Nakamura, Keita</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c474t-4bf0781e61d4280d58ca66f041a6008a4a5fd9916a58f25ef454246d6a4150603</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Algorithms</topic><topic>Clustering</topic><topic>Clustering algorithms</topic><topic>Computer programming</topic><topic>Data collection</topic><topic>Data mining</topic><topic>Data structures</topic><topic>Education</topic><topic>Educational data mining</topic><topic>Electronic learning</topic><topic>Euclidean geometry</topic><topic>Feature extraction</topic><topic>Machine learning</topic><topic>pattern mining</topic><topic>Problem solving</topic><topic>problem-solving data</topic><topic>Programming</topic><topic>programming learning</topic><topic>Programming profession</topic><topic>Raw materials</topic><topic>rule mining</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Rahman, Md. 
Mostafizer</creatorcontrib><creatorcontrib>Watanobe, Yutaka</creatorcontrib><creatorcontrib>Matsumoto, Taku</creatorcontrib><creatorcontrib>Kiran, Rage Uday</creatorcontrib><creatorcontrib>Nakamura, Keita</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005–Present</collection><collection>IEEE Xplore Open Access Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Xplore</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Directory of Open Access Journals</collection><jtitle>IEEE access</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Rahman, Md. 
Mostafizer</au><au>Watanobe, Yutaka</au><au>Matsumoto, Taku</au><au>Kiran, Rage Uday</au><au>Nakamura, Keita</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Educational Data Mining to Support Programming Learning Using Problem-Solving Data</atitle><jtitle>IEEE access</jtitle><stitle>Access</stitle><date>2022</date><risdate>2022</risdate><volume>10</volume><spage>26186</spage><epage>26202</epage><pages>26186-26202</pages><issn>2169-3536</issn><eissn>2169-3536</eissn><coden>IAECCG</coden><abstract><![CDATA[Computer programming has attracted a lot of attention in the development of information and communication technologies in the real world. Meeting the growing demand for highly skilled programmers in the ICT industry is one of the major challenges. In this point, online judge (OJ) systems enhance programming learning and practice opportunities in addition to classroom-based learning. Consequently, OJ systems have created a large number of problem-solving data (solution codes, logs, and scores) archives that can be valuable raw materials for programming education research. In this paper, we propose an educational data mining framework to support programming learning using unsupervised algorithms. 
The framework includes the following sequence of steps: (<inline-formula> <tex-math notation="LaTeX">i </tex-math></inline-formula>) problem-solving data collection (logs and scores are collected from the OJ) and preprocessing; (<inline-formula> <tex-math notation="LaTeX">ii </tex-math></inline-formula>) MK-means clustering algorithm is used for data clustering in Euclidean space; (<inline-formula> <tex-math notation="LaTeX">iii </tex-math></inline-formula>) statistical features are extracted from each cluster; (<inline-formula> <tex-math notation="LaTeX">iv </tex-math></inline-formula>) frequent pattern (FP)-growth algorithm is applied to each cluster to mine data patterns and association rules; (<inline-formula> <tex-math notation="LaTeX">v </tex-math></inline-formula>) a set of suggestions are provided on the basis of the extracted features, data patterns, and rules. Different parameters are adjusted to achieve the best results for clustering and association rule mining algorithms. For the experiment, approximately 70,000 real-world problem-solving data from 537 students of a programming course (Algorithm and Data Structures) were used. In addition, synthetic data have leveraged for experiments to demonstrate the performance of MK-means algorithm. The experimental results show that the proposed framework effectively extracts useful features, patterns, and rules from problem-solving data. Moreover, these extracted features, patterns, and rules highlight the weaknesses and the scope of possible improvements in programming learning.]]></abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/ACCESS.2022.3157288</doi><tpages>17</tpages><orcidid>https://orcid.org/0000-0001-9368-7638</orcidid><orcidid>https://orcid.org/0000-0002-4574-6979</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2169-3536
ispartof IEEE access, 2022, Vol.10, p.26186-26202
issn 2169-3536
2169-3536
language eng
recordid cdi_proquest_journals_2639933012
source Directory of Open Access Journals; IEEE Xplore Open Access Journals; EZB Electronic Journals Library
subjects Algorithms
Clustering
Clustering algorithms
Computer programming
Data collection
Data mining
Data structures
Education
Educational data mining
Electronic learning
Euclidean geometry
Feature extraction
Machine learning
pattern mining
Problem solving
problem-solving data
Programming
programming learning
Programming profession
Raw materials
rule mining
title Educational Data Mining to Support Programming Learning Using Problem-Solving Data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T15%3A59%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Educational%20Data%20Mining%20to%20Support%20Programming%20Learning%20Using%20Problem-Solving%20Data&rft.jtitle=IEEE%20access&rft.au=Rahman,%20Md.%20Mostafizer&rft.date=2022&rft.volume=10&rft.spage=26186&rft.epage=26202&rft.pages=26186-26202&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2022.3157288&rft_dat=%3Cproquest_ieee_%3E2639933012%3C/proquest_ieee_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2639933012&rft_id=info:pmid/&rft_ieee_id=9729752&rft_doaj_id=oai_doaj_org_article_3bc2e137fb9744528d110ff430833438&rfr_iscdi=true