A Clustering-based Computational Model to Group Students with Similar Programming Skills from Automatic Source Code Analysis Using Novel Features

Bibliographic details
Published in:	IEEE Transactions on Learning Technologies, 2024-01, Vol. 17, p. 1-19
Main authors:	Silva, Davi Bernardo; Carvalho, Deborah Ribeiro; Silla, Carlos N.
Format:	Article
Language:	English
Subjects:	Classroom feedback systems; Clustering; Codes; Computer Assisted Instruction; computer science education; Education; feature engineering; Feature extraction; Learning; Programming; Programming environments; Programming profession; Research Methodology; Skills; Source code; Source coding; Students; Task analysis; Teachers; teaching programming

Description
Summary:	Throughout a programming course, students complete various source code tasks. Using these tasks to track students' progress can provide clues to the strengths and weaknesses found in each learning topic. This practice allows the teacher to intervene in the first few weeks of class and maximize student gains. The biggest challenge, however, is the amount of work the teacher needs to analyze all tasks manually. In this context, our main research objective is to automatically group students with similar programming skills based on an analysis of their submitted source code. Our research is applied and uses an experimental procedure. First, we prepared a database of more than 700 real-world source code tasks written in C, distributed across five learning topics. Next, we defined a set of features to be extracted from each learning topic, 23 features in total across the five topics. Then, we preprocessed the database and extracted the proposed features. Finally, we grouped the students. The grouping yielded four groups of students, which we analyzed using a cluster midpoint calculation. Our results support the monitoring of students throughout the term, leaving the teacher free to create new exercises and removing any dependence on a specific programming environment. We believe that these results can help the teacher make pedagogical decisions closer to the needs of each group of students.
ISSN:	1939-1382, 2372-0050
DOI:	10.1109/TLT.2023.3273926
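
The pipeline outlined in the summary (extract features from each submission, group the students, then inspect a midpoint per group) can be illustrated with a minimal sketch. It is not the authors' implementation. This record does not list the 23 features or name the clustering algorithm, so the three features in `PATTERNS` and `extract_features`, the use of k-means, and the reading of "cluster midpoint" as the per-cluster mean are all assumptions made for illustration.

```python
import re
from statistics import mean

# Hypothetical features: the paper's 23 features are not listed in this record.
PATTERNS = {
    "conditionals": r"\b(?:if|switch)\b",
    "loops": r"\b(?:for|while)\b",
}


def extract_features(source):
    """Return [conditional count, loop count, non-blank line count] for C source."""
    counts = [float(len(re.findall(p, source))) for p in PATTERNS.values()]
    non_blank = sum(1 for line in source.split("\n") if line.strip())
    return counts + [float(non_blank)]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100):
    """Plain k-means (assumed algorithm); the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append([mean(col) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids
```

To use the sketch, one would build one feature vector per student with `extract_features` and call `kmeans(vectors, 4)` to obtain four groups as in the summary. The returned centroids then play the role of the per-group midpoints that a teacher could compare across learning topics.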